WalkerLau / Accelerating Cnn With Fpga

Licence: other

This project accelerates CNN computation with the help of FPGA, for more than 50x speed-up compared with CPU.

Labels

convolutional-neural-networks face-recognition fpga

<Super Detailed Tutorial> Accelerate CNN computation with FPGA

Click here for the Chinese version of this tutorial, which is more detailed.

This is an original tutorial; please credit the source when reposting it: https://github.com/WalkerLau/Accelerating-CNN-with-FPGA

For a GPU version of this project, please refer to: https://github.com/WalkerLau/GPU-CNN

The purpose of this project is to speed up convolutional neural network processing with an FPGA, which offers a great advantage in parallel computation. It is also my bachelor's graduation project, and I am glad to show you step by step how the work was done.

Final Performance

Let's first look at how much speed-up the FPGA accelerator achieves. The acceleration system only accelerates the convolution layers. The screenshots in the repository show the processing clock cycles in two cases: with and without the FPGA accelerator applied to the convolution layers. VIPLFaceNet, a face recognition algorithm with 7 convolution layers, is used as the evaluation application for this project. Compared with using only a quad-core ARM Cortex-A53 CPU, the CPU+FPGA acceleration system runs 45x-75x faster on VIPLFaceNet.

Description & Features

VIPLFaceNet, as mentioned above, is part of SeetaFaceEngine, an open source face recognition engine developed by the Visual Information Processing and Learning (VIPL) group at the Institute of Computing Technology, Chinese Academy of Sciences.

This project is developed with Xilinx SDSOC, an efficient embedded development tool for individual developers and small teams. With SDSOC, you can program your FPGA hardware even with little knowledge of HDL.

Below are some features of the acceleration system:

Easy Transplantation. SDSOC automatically translates C/C++ to HDL and then creates the FPGA bitstream. You can therefore migrate this system to other CNN algorithms, especially those written in C/C++, by adjusting the accelerator structure (ifmap size, stride, filter size, etc.) in the source code so that it fits different convolution layers.
Good Performance. The acceleration system combines several optimization strategies:
- ifmap volume reuse architecture
- data conversion to lower precision
- 16-channel parallel processing unit & adder tree
- pipelining
- on-chip BRAM partitioning & cross-layer BRAM sharing
- multi-layer acceleration strategy
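
As a rough illustration of the "lower precision" and "16-channel parallel processing unit & adder tree" items above, the plain C++ sketch below converts values to fixed point, multiplies 16 channel activations by 16 weights, and sums the products with a balanced adder tree. It is a minimal sketch, not the code from conv_net.cpp: the names `kChannels`, `kFracBits`, `to_fixed`, and `mac16`, as well as the choice of 16-bit fixed point with 8 fractional bits, are assumptions made for this example.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Assumed for illustration: 16 parallel channels, 8 fractional bits.
constexpr int kChannels = 16;
constexpr int kFracBits = 8;

// Convert a float to 16-bit fixed point (round to nearest).
inline int16_t to_fixed(float x) {
    return static_cast<int16_t>(std::lround(x * (1 << kFracBits)));
}

// Multiply 16 activations by 16 weights, then reduce the products with a
// balanced adder tree: 16 -> 8 -> 4 -> 2 -> 1 partial sums (4 levels).
inline int32_t mac16(const std::array<int16_t, kChannels>& act,
                     const std::array<int16_t, kChannels>& wgt) {
    std::array<int32_t, kChannels> sum{};
    for (int c = 0; c < kChannels; ++c) {
        // The 16 products are independent of each other.
        sum[c] = static_cast<int32_t>(act[c]) * wgt[c];
    }
    for (int width = kChannels / 2; width >= 1; width /= 2) {
        for (int i = 0; i < width; ++i) {
            sum[i] = sum[2 * i] + sum[2 * i + 1];  // one tree level
        }
    }
    return sum[0];  // carries 2 * kFracBits fractional bits
}
```

Because the products within a level do not depend on each other, a synthesis tool can in principle compute each level in parallel, which is why a tree of depth 4 is attractive compared with a chain of 15 sequential additions. Dividing the result by 2^(2 * kFracBits) recovers the real-valued sum.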

What should I prepare before getting started?

Hardware
- Xilinx Ultrascale+ MPSOC ZCU102 (also works on the ZCU104 or other Xilinx devices, depending on your performance needs)
Software
- Ubuntu 16.04 (for installing and running SDSOC. The acceleration system requires an embedded Linux OS, so all development work should be done on a Linux host machine in a Linux environment)
- Xilinx SDSOC 2018.2 (see the SDSOC installation and configuration tutorial (Chinese). SDSOC must be installed and configured properly before you go any further, so reading through Xilinx's official document UG1294 is strongly recommended)
- Xilinx reVISION platform (the main reason for installing the reVISION platform is to use the xfopencv library, as SeetaFace uses OpenCV to load and preprocess images. For more information about the reVISION platform and xfopencv configuration, see reVISION-Getting-Started-Guide and the xfopencv tutorial)
- [ optional, but recommended ] CodeBlocks (for off-board debugging)
- [ optional, but recommended ] OpenCV 2.4.13.6 (for off-board debugging)
Some basic knowledge
- Be sure to read through SDSOC Tutorials before going deeper. It is a good guide that teaches the basic operations of SDSOC efficiently.
- Basic C/C++ programming skills.

Installation

First, download this repository.
Create an empty SDSOC project; be sure to select the reVISION platform if you have installed it.
Adjust the C/C++ build options.
Add all the source files in the src folder to the newly created SDSOC project. Most source files are unchanged from SeetaFaceEngine; the FPGA-accelerated code is in conv_net.cpp.
Find convolute1.cpp in the project explorer and expand it. Right-click the green dot labeled convolute1, then click "Toggle HW/SW".

Likewise, find math_functions.cpp and toggle matrix_procuct. Toggle HW/SW labels the function as a Hardware Function, which runs on the FPGA after synthesis.
Select "Generate SD card image" in the Application Project Setting window, then build the project. The build takes 1 to 3 hours, depending on your computer's performance.
After the build finishes, go to the folder with the same name as your build environment inside the SDSOC project directory, find the sd_card folder, and copy all the files in it to the root directory of the SD card.
Open the model folder of this repository and extract the two compressed packages inside. You will get a file named seeta_fr_v1.0.bin of about 110 MB. Make sure this file is in the root of the model folder.
Copy the two folders, model and data, to the root directory of the SD card.
Configure the UART settings (as described in SDSOC Tutorials) and run the application on the board. The executable Seeta-Accel-Test.elf is located in /media/card, so navigate there before running it.
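
A function toggled to hardware in the steps above is ordinary C/C++ that the toolchain synthesizes. The sketch below shows, under stated assumptions, what such a candidate function can look like. The name `conv_row_hw`, the fixed sizes, and the placement of the pragmas are hypothetical and are not taken from convolute1.cpp. `#pragma HLS PIPELINE` and `#pragma HLS ARRAY_PARTITION` are Vivado HLS directives; a regular host compiler ignores them (possibly with a warning), so the same source can also be built off-board.

```cpp
#include <cstdint>

// Hypothetical sizes chosen for illustration only.
constexpr int kRowLen = 32;                   // input row length
constexpr int kTaps   = 3;                    // filter width
constexpr int kOutLen = kRowLen - kTaps + 1;  // stride 1, no padding

// Candidate hardware function: a 1-D convolution over one input row.
void conv_row_hw(const int16_t in[kRowLen],
                 const int16_t weights[kTaps],
                 int32_t out[kOutLen]) {
    int16_t w[kTaps];
// Request that the local weight copy be split into individual registers.
#pragma HLS ARRAY_PARTITION variable=w complete dim=1
    for (int k = 0; k < kTaps; ++k) {
        w[k] = weights[k];
    }

    for (int x = 0; x < kOutLen; ++x) {
// Request one new output per clock cycle; the tool reports what it achieves.
#pragma HLS PIPELINE II=1
        int32_t acc = 0;
        for (int k = 0; k < kTaps; ++k) {
            acc += static_cast<int32_t>(in[x + k]) * w[k];
        }
        out[x] = acc;
    }
}
```

Fixed array sizes and simple loop bounds are used here because they are generally easier for a high-level synthesis tool to analyze than dynamically sized buffers.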

Off-board Debug

The installation steps above run all the code on the FPGA evaluation board. When you want to change the code or migrate it to other algorithms, however, you may need to do some off-board debugging before moving on-board. Note that off-board debugging should also be done in a Linux environment.

Download OpenCV 2.4.13.6 and install it (see the OpenCV installation tutorial).
Install CodeBlocks, create an empty project, and then configure OpenCV for the build environment.
Enable C++11 standard support in the build options.
Copy the two .cpp files in the off-board debug folder of this repository to the src folder, overwriting the original files with the same names.
Import the files in src into the CodeBlocks project.
Build and run the project.
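
One common off-board check is to run the original CPU routine and the modified routine on the same input and compare the results within a tolerance, since lower-precision arithmetic will generally not match bit for bit. The helper below is a minimal sketch of that idea; `max_abs_diff` is a hypothetical name and is not part of this repository.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Largest element-wise absolute difference between two result buffers.
// Returns infinity when the sizes differ, so a shape mismatch never
// passes a tolerance check silently.
inline float max_abs_diff(const std::vector<float>& ref,
                          const std::vector<float>& test) {
    if (ref.size() != test.size()) {
        return std::numeric_limits<float>::infinity();
    }
    float worst = 0.0f;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        worst = std::max(worst, std::fabs(ref[i] - test[i]));
    }
    return worst;
}
```

A debug build could then compare the two outputs with something like `max_abs_diff(cpu_out, accel_out) < tol`, where `cpu_out`, `accel_out`, and the tolerance `tol` are placeholders that depend on the layer and the precision chosen.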

Acknowledgement

I would like to express my special thanks to my teachers, Shulong WANG and Quanxue GAO of Xidian University, for their support on this project.
