Neural Network for Real-Time Object Detection on FPGA

Conference Paper · May 2021

DOI: 10.1109/ICIEAM51226.2021.9446384

2021 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM)

Neural Network for Real-Time Object Detection on FPGA

Edward Rzaev, Anton Khanaev, Aleksandr Amerikanov
HSE University, Moscow, Russian Federation

Abstract: Object detection is one of the most active research and application areas of neural networks. In this article we combine FPGA and neural network technologies to solve the real-time object recognition problem. The article discusses the integration of the YOLOv3 neural network on the DE10-Nano FPGA board. The slightly worse values of the main metrics (mAP, FPS, inference time) obtained on the DE10-Nano board, compared with more expensive GPU-based solutions, are offset by the lower cost and smaller dimensions of the FPGA board used. Based on a study of various methods for converting neural networks to FPGA, it was concluded that this architecture is applicable to detecting objects in a video stream in real time.

Keywords: FPGA, neural networks, YOLOv3, object detection, object recognition, CNN

I. INTRODUCTION

The aim of this work is to build a smart system that works in real time and is able to analyze the surrounding space. Using this project as an example, we want to demonstrate the possibilities of using FPGAs for processing a video data stream. We offer a lightweight neural network in an HDL implementation, which can be used to solve a wide range of tasks, for example, detecting and recognizing people, animals, vehicles and other objects with a computer vision algorithm. To solve the problem, we use a board with a chip of the Cyclone V family (DE10-Nano). The FPGA device allows all the necessary calculations to be parallelized, thereby fully utilizing its hardware resources [1]. The FPGA also combines low power consumption with high performance. Because of this, FPGAs are an excellent tool for solving these kinds of problems in embedded systems.

Significant advantages of the DE10-Nano are its low price and the presence of an ARM core, which reduces the development time of the project because peripherals can be connected and the board controlled at a higher level. Thus, most of the time can be devoted directly to the development and testing of the neural network. In addition, the DE10-Nano consumes significantly less power than, for example, Nvidia video cards, which are inferior to FPGAs in terms of computing power per unit of electricity.

For processing images in real time, a computing base was chosen that offers low power consumption and high data processing speed, which makes it possible to run a neural network. Also, the relatively small size of the DE10-Nano board makes it possible to integrate it in various embedded systems.

The developed neural network is able to determine the boundaries of identified objects. Because the HPS core can locally implement powerful data processing algorithms and parallelize their execution at the hardware level, it was decided, for stable operation of the robot, to create a server that performs this computer vision task. This also gives an opportunity to combine several neural networks on one platform, for example, to use the detector in conjunction with a neural network that recognizes speech commands [2]. An FPGA implementation most accurately conveys the parallel architecture of neural layers and provides the flexibility to reconfigure the entire neural network and its components, the artificial neurons. In addition, the configuration of FPGA-based neural networks is easy to change.

So, the main goal of this project is multiclass recognition of objects on the FPGA. The possible applications vary depending on the requirements and wishes of the customer. By changing the target data, the system adapts to a new task without changing the hardware basis. Examples of tasks: institution security; counting the number of people in a queue; counting cars in a stream; detecting non-standard behavior of people in public places; detecting animals and birds in dangerous places, etc. In light of recent events, proposals for identifying coronavirus in potentially infected people based on x-ray images of their respiratory tract, or on analysis of data from thermal imagers in public places, will also be relevant. With widespread placement of cameras in a particular country, it is possible to search for wanted people. In addition, this development may be useful in industrial production. Intelligent video surveillance systems are able to recognize in advance the signs of an impending accident in a factory or warehouse, which makes it possible to correct the causes of the accident before it occurs.

Based on all of the above, it can be concluded that success in solving real-time object detection problems is limited only by the collection of data for a specific task. If the necessary data is available, then it is possible to train the neural network. The necessary settings can be adjusted by changing the hyperparameters.

The result of the project is an FPGA board that recognizes the surrounding space from the camera; the results are displayed on a laptop.

II. RELATED WORKS

Thanks to its small, resource-efficient architecture, YOLO made it possible to run detection in real time on a camera stream using either a CPU [3] or a GPU. It uses a reduced number of layers, which can significantly increase the speed of the neural network.

Many earlier detection systems repurpose classifiers or localizers to perform detection. They apply the model to the image at multiple locations and scales. High-scoring regions of the image are considered detections.

The article [4] provides a comparison of different meta-architectures, which reflects the advantage of YOLOv3 in comparison with analogs. Table II in that article shows that YOLOv3, with the same or better recognition quality (metric mAP@50), has a significantly shorter image processing time (about 4-5 times shorter). For the task of detecting objects in a video stream, high image processing speed is one of the key advantages of the YOLOv3 architecture compared to other architectures.

The proposed neural network uses a completely different approach: a single neural network is applied to the complete image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. This model has several advantages over classifier-based systems. The neural network processes the entire image during testing, so its predictions are informed by the global context of the image rather than by a single region. It also makes predictions with a single network evaluation, as opposed to systems like the Region-based Convolutional Network (R-CNN), which require thousands of evaluations for a single image. All of the above combined makes it extremely fast, over 1000 times faster than R-CNN and 100 times faster than Fast R-CNN [5].

Figure 1 depicts the process of building bounding boxes.

Fig. 1. The process of detecting objects in the image [6].

The article [7] introduces the REQ-YOLO architecture, which is based on the YOLO architecture. In fact, REQ-YOLO is a highly compressed version of the YOLO architecture for improving FPGA performance. A special feature of REQ-YOLO is its simplicity at the software and hardware levels when detecting objects. Both [7] and our work use quantization of weights, which makes it possible to significantly reduce the number of calculations and, therefore, the amount of memory used by the neural network. Unlike the work [7], our project provides an accurate assessment of the quality of recognition and the speed of image processing by a neural network.

Table I compares YOLO with other state-of-the-art (SoTA) neural networks. It can be seen that on the Titan X, YOLOv2 processes images at a speed of 40-90 frames per second, with a mean Average Precision (mAP) of 78.6% on Visual Object Classes (VOC) 2007 and 48.1% on COCO test-dev.

TABLE I. COMPARISON OF NEURAL NETWORKS ON VARIOUS DATA ON THE TITAN X GRAPHICS CARD.

Model             Train            mAP    FPS
Old YOLO          VOC 2007+2012    63.4   45
SSD300            VOC 2007+2012    74.3   46
SSD500            VOC 2007+2012    76.8   19
YOLOv2            VOC 2007+2012    76.8   67
YOLOv2 544x544    VOC 2007+2012    78.6   40
Tiny YOLO         VOC 2007+2012    57.1   207
SSD300            COCO trainval    41.2   46
SSD500            COCO trainval    46.5   19
YOLOv2 608x608    COCO trainval    48.1   40
Tiny YOLO         COCO trainval    -      200

III. METHODOLOGY

The diagram of connected devices is shown in Figure 2.

Fig. 2. The block diagram of the project.

YOLOv3 is a rather heavyweight neural network and requires a large amount of video memory and computing resources to recognize objects with high accuracy and quality. Therefore, given the limited resources of the DE10-Nano, it was decided to use a lighter version, Tiny YOLOv3. Reducing the resolution of the input image and reducing the number of layers, both for feature extraction and for object classification and location regression, made the neural network significantly lighter; however, the quality of object detection also deteriorated.
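To make the effect of input resolution on a Tiny YOLOv3-style network concrete, the following minimal sketch traces feature-map shapes and multiply-accumulate (MAC) counts through the first sixteen layers listed in Table II. It is an illustration under stated assumptions, not the authors' implementation: "same" padding for convolutions, stride 2 for layer 9 (as its listed output size implies), and the same layer stack applied unchanged at 128×128 are all assumptions, and the names `LAYERS` and `trace` are hypothetical.

```python
# Sketch: shapes and MAC counts for layers 0-15 of Table II.
# Assumptions (not stated in the paper): "same" padding, stride-1 convolutions,
# and the identical layer stack reused at a smaller input resolution.
import math

# ("conv", filters, kernel) or ("pool", stride)
LAYERS = [
    ("conv", 16, 3), ("pool", 2),
    ("conv", 32, 3), ("pool", 2),
    ("conv", 64, 3), ("pool", 2),
    ("conv", 128, 3), ("pool", 2),
    ("conv", 256, 3), ("pool", 2),   # layer 9: stride 2, since 26 -> 13
    ("conv", 512, 3), ("pool", 1),   # layer 11: stride 1, size unchanged
    ("conv", 1024, 3), ("conv", 256, 1),
    ("conv", 512, 3), ("conv", 255, 1),
]


def trace(height, width, channels=3):
    """Return (list of shapes after each layer, total conv MACs)."""
    shapes = [(height, width, channels)]
    macs = 0
    for layer in LAYERS:
        h, w, c = shapes[-1]
        if layer[0] == "conv":
            _, filters, kernel = layer
            # One MAC per kernel tap, input channel, output channel and pixel.
            macs += h * w * kernel * kernel * c * filters
            shapes.append((h, w, filters))
        else:
            _, stride = layer
            shapes.append((math.ceil(h / stride), math.ceil(w / stride), c))
    return shapes, macs


shapes_416, macs_416 = trace(416, 416)
shapes_128, macs_128 = trace(128, 128)
print(shapes_416[-1])                    # (13, 13, 255)
print(shapes_128[-1])                    # (4, 4, 255)
print(round(macs_416 / macs_128, 2))     # 10.56
```

Under these assumptions the convolution workload scales with the number of input pixels, so going from 416×416 to 128×128 cuts it by roughly a factor of ten, while the output grid shrinks from 13×13 to 4×4 cells, which is one way to see why detection quality suffers.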
TABLE II. TINY YOLOV3 ARCHITECTURE [8].

Layer   Type            Filters   Size/Stride   Input             Output
0       Convolutional   16        3×3/1         416 × 416 × 3     416 × 416 × 16
1       Maxpool                   2×2/2         416 × 416 × 16    208 × 208 × 16
2       Convolutional   32        3×3/1         208 × 208 × 16    208 × 208 × 32
3       Maxpool                   2×2/2         208 × 208 × 32    104 × 104 × 32
4       Convolutional   64        3×3/1         104 × 104 × 32    104 × 104 × 64
5       Maxpool                   2×2/2         104 × 104 × 64    52 × 52 × 64
6       Convolutional   128       3×3/1         52 × 52 × 64      52 × 52 × 128
7       Maxpool                   2×2/2         52 × 52 × 128     26 × 26 × 128
8       Convolutional   256       3×3/1         26 × 26 × 128     26 × 26 × 256
9       Maxpool                   2×2/2         26 × 26 × 256     13 × 13 × 256
10      Convolutional   512       3×3/1         13 × 13 × 256     13 × 13 × 512
11      Maxpool                   1×1/1         13 × 13 × 512     13 × 13 × 512
12      Convolutional   1024      3×3/1         13 × 13 × 512     13 × 13 × 1024
13      Convolutional   256       1×1/1         13 × 13 × 1024    13 × 13 × 256
14      Convolutional   512       3×3/1         13 × 13 × 256     13 × 13 × 512
15      Convolutional   255       1×1/1         13 × 13 × 512     13 × 13 × 255
16      YOLO
17      Route 13
18      Convolutional   128       1×1/1         13 × 13 × 256     13 × 13 × 128
19      Up-sampling               2×2/1         13 × 13 × 128     26 × 26 × 128
20      Route 19 8
21      Convolutional   256       3×3/1         26 × 26 × 384     26 × 26 × 256
22      Convolutional   255       1×1/1         26 × 26 × 256     26 × 26 × 255
23      YOLO

There are many different factors to take into account during the training of a neural network. The number of classes, the specifics of the problem, the size of the bounding rectangles and others are to be considered. Data-independent factors also have a big impact, for example, the choice of the correct training step (learning rate), the algorithm used for backpropagation of the error, and the number of images processed per update of the weights.

Quantization was applied to the layers of the neural network, that is, the number of bits allocated to represent one network parameter was reduced. Instead of using 32 bits for one floating-point number, 8 bits are allocated for one parameter. Since the model weights occupy approximately 2–3 times less RAM and the calculations themselves take approximately 2–2.5 times less execution time, it became possible to significantly accelerate the neural network and reduce the amount of energy consumed, albeit at the cost of a 15–20% reduction in the accuracy of the model [9].

As the training dataset, the OpenImagesV4 [10] image set from Google was selected. This is an open dataset with almost 2 million tagged images and a hierarchical structure of classes (about 600 in total). To train the neural network, a data subset with 18 classes was used, including classes such as people, various types of furniture, transportation, and various office, kitchen and other accessories. In total, the subset contains about 28,600 images. They were downloaded using the OIDv4 Toolkit [11].

To train the neural network, the Blueoil [12] framework is used, which makes it possible to solve various machine learning problems using FPGAs.

The first step is to prepare a server with a GPU for training a neural network. It is worth mentioning that newer generations of Nvidia GPUs are preferred for solving this problem, since the vast majority of libraries for developing and training neural networks are written specifically for CUDA kernels in C or C++. The server can be either a local computer or a remote device running a Linux operating system. The development of the project was carried out on the Ubuntu 18.04 distribution. A GPU driver version higher than 410 is also needed. It is recommended to have about 50 GB of free space for the development. Docker must be installed on the server to get started with the project. The hardware used for training the neural network was a local computer with an Nvidia GeForce 940MX video card with 2 GB of video memory.

It is especially convenient to be able to develop and train neural networks without first building an environment in which the many software components do not conflict with each other because of differing versions of the modules and libraries used, and to keep the work portable in general. That is why, for the successful operation of the entire project, it was decided to create a Docker container. It makes it possible to reproduce the project even on a completely new device or server. The developed Docker container is based on the Linux operating system, the Ubuntu 18.04 distribution, and contains both the hardware and software modules necessary for project development.

Using Blueoil, the architecture of the trained neural network was converted into a binary file for the firmware of the DE10-Nano board. A configuration file with the neural network weights was added to it. After that, the SD card image was modified to provide Blueoil support. Then the resulting files were added to Ubuntu on the board, the board was reprogrammed, and the necessary Python packages were installed on the FPGA board. This made it possible, with the camera and the rest of the peripherals properly connected, to launch a neural network to detect objects on the DE10-Nano.

Based on the experience of working with the DE10-Nano board, it was decided to develop a cooling system for the board's chip. This board has an industrial Cyclone V chip that requires additional cooling or overheating protection, which the developers did not implement when creating the board.
Thus, it was decided to use the development from our previous project [13], which solves the problem.

Fig. 3. Board cover with cooler.

In our experiments, the DE10-Nano achieves on average the following values on the OpenImagesV4 sample from Google: FPS ≈ 28-33; mAP ≈ 29.1%. Based on these data, we can conclude that the DE10-Nano copes well with the task when compared with the top-class Titan X video card, which costs more than 10 times as much as the DE10-Nano board used.

The computing power of the DE10-Nano [14] available for solving the problem is:

• Cyclone V FPGA: 0.16 GFLOPS;
• Dual Core ARM Cortex-A9 MPCore: 2 GFLOPS.

Table III shows the data on the use of the computing resources of the board.

TABLE III. RESOURCES USED BY DE10-NANO.

Estimated Resource Usage Summary
Resource                    Usage
Logic utilization           59%
ALUTs                       39%
Dedicated logic registers   25%
Memory blocks               57%
DSP blocks                  43%

IV. EXPERIMENTAL RESULTS AND ANALYSIS

A. Hyperparameter Tuning

The neural network was trained on a GPU with 2 GB of GPU memory. On average, it took about 8 hours to train the model. The GPU memory was large enough for at most 4 images per update of the model weights. It did not make sense to use fewer, since the training time of the neural network would increase significantly and gradient computation would become less stable. Varying the input resolution of the images gave the following results:

TABLE IV. ERROR FUNCTION VALUE FOR DIFFERENT INPUT DATA FORMAT

Input resolution   Images per weight update
                   1       2       4
96×96              3.28    2.95    2.8
128×128            3.1     2.78    2.6
168×168            2.91    2.72    MemoryError

The columns give the number of images per weight update, and the rows give the input resolution of the image. The rest of the training parameters were the same in each experiment.

As a result, the best value was a resolution of 128×128 pixels, as it offers a good balance between training speed and preserving the maximum amount of information during data preprocessing.

The training step (learning rate) was 0.003 and was decreased by a factor of 10 every 1000 updates of the parameters. There were 20,000 iterations in total.

B. Converting a model into firmware for an FPGA

The trained neural network that is transferred to the DE10-Nano can be conditionally divided into 2 parts. The first is a computational graph, in which the entire object recognition algorithm is written, and the second is the parameters of the neural network involved in the calculations. It is advisable to load the model parameters once from the main memory of the board, while the computational graph can be represented as a binary FPGA firmware file.

The computations of the neural network run directly on the FPGA chip of the DE10-Nano, while the ARM core is used for high-level control of the board. It also handles connecting and configuring peripherals. The board configuration can be updated directly from the terminal, without powering the board off, using Bash and Python scripts.

C. Testing and debugging the project in real time

During the development of the project, debugging and testing were successfully carried out. The mAP metric was 29.4%. The average FPS fluctuates around 30 frames per second, which makes it possible to analyze the environment in real time. With an input resolution of 224×224, the FPS dropped to 10 frames per second. Conversely, there was no point in going below a resolution of 128×128 pixels: frames are processed faster, but the quality of object recognition drops to 19.7%.

V. RESULTS

This project shows that Cyclone V chips are able to handle the processing of a 128×128 video stream by a neural network in real time.

The chip cooling system developed in one of our previous projects [13] proved well suited to improving the chip's performance when processing a video stream. This system makes it possible to obtain stable FPS values over a long period of time while the FPGA is in active use.

This project is notable for the fact that it is much cheaper than similar solutions [15]–[17] that use expensive video cards, while the important indicators (mAP, FPS, inference time) remain acceptable for the object detection problem. In other words, this project has an applied character and is workable in real-life tasks.

As a result of the project, the following metrics were achieved: mAP = 29.4% and FPS in the range [28.3, 33.4].
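For readers unfamiliar with how a detection is scored as correct when mAP is computed, the following minimal sketch shows the intersection-over-union (IoU) test used by mAP@50-style evaluation, the metric variant mentioned in Section II. The box format `(x_min, y_min, x_max, y_max)`, the helper names `iou` and `is_match`, and the 0.5 threshold are assumptions for illustration; the paper does not state which IoU threshold its mAP figures use.

```python
# Sketch of the IoU match test underlying mAP@50-style evaluation.
# Assumptions: boxes are (x_min, y_min, x_max, y_max) in pixels, and a
# prediction counts as correct when IoU with the ground truth is >= 0.5.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough."""
    return iou(predicted, ground_truth) >= threshold


truth = (20, 20, 100, 100)
good = (24, 24, 104, 104)   # shifted by 4 px: IoU is about 0.82
poor = (60, 60, 140, 140)   # shifted by 40 px: IoU is about 0.14
print(is_match(good, truth), is_match(poor, truth))   # True False
```

Average precision is then computed per class from the ranked list of matched and unmatched detections, and mAP is the mean over classes; that aggregation is omitted here to keep the sketch short.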
Below are examples of how the neural network operates on the DE10-Nano board.

Fig. 4. An example of a demonstration of the operation of a neural network.

Fig. 5. Different examples of the operation of a neural network.

The video stream from the camera is used as input to the neural network.

To demonstrate the operation of the neural network, the processed video stream is transmitted via SSH to the working machine in real time.

Depending on the initial set of object classes to be detected, the recognition quality and the speed of the algorithms change. An increase in the complexity of the detection task leads to a deterioration in these characteristics.

VI. CONCLUSIONS

In this paper, the implementation of a lightweight neural network in HDL is presented. It is applicable to a wide range of tasks, for example, detecting and recognizing people, animals, vehicles and other objects, based on computer vision algorithms. This development works on the DE10-Nano FPGA board and achieves good FPS and mAP values, which make it efficient and applicable in various tasks.

REFERENCES
[1] T. V. Huynh, “Deep neural network accelerator based on FPGA,” 2017 4th NAFOSTED Conf. on Information and Computer Science, pp. 254–257, 2017. DOI: 10.1109/NAFOSTED.2017.8108073.
[2] R. A. Solovyev, “Deep Learning Approaches for Understanding Simple Speech Commands,” 2020 IEEE 40th Int. Conf. on Electronics and Nanotechnology, pp. 688–693, 2020. DOI: 10.1109/ELNANO50318.2020.9088863.
[3] M. B. Ullah, “CPU Based YOLO: A Real Time Object Detection Algorithm,” 2020 IEEE Region 10 Symposium, pp. 552–555, 2020. DOI: 10.1109/TENSYMP50017.2020.9230778.
[4] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” Tech Rep., pp. 1–6, 2018.
[5] R. Girshick, Fast R-CNN, 2015.
[6] YOLO: Real-Time Object Detection.
[7] C. Ding, S. Wang, N. Liu, K. Xu, Y. Wang, and Y. Liang, “REQ-YOLO: A resource-aware, efficient quantization framework for object detection on FPGAS,” FPGA 2019 – Proc. 2019 ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays, pp. 33–42, 2019. DOI: 10.1145/3289602.3293904.
[8] W. He, Z. Huang, Z. Wei, C. Li, and B. Guo, “TF-YOLO: An improved incremental network for real-time object detection,” Appl. Sci., vol. 9, no. 16, 2019. DOI: 10.3390/app9163225.
[9] B. Jacob, Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2018.
[10] A. Kuznetsova, “The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale,” Int. J. Comput. Vis., vol. 128, no. 7, pp. 1956–1981, 2020. DOI: 10.1007/s11263-020-01316-z.
[11] GitHub - EscVM/OIDv4_ToolKit: Download and visualize single or multiple classes from the huge Open Images v4 dataset.
[12] GitHub - blue-oil/blueoil: Bring Deep Learning to small devices.
[13] InnovateFPGA|EMEA|EM029 - Anthropomorphic robot on FPGA.
[14] N. Rajovic, L. Vilanova, C. Villavieja, N. Puzovic, and A. Ramirez, “The low power architecture approach towards exascale computing,” J. Comput. Sci., vol. 4, no. 6, pp. 439–443, 2013. DOI: 10.1016/j.jocs.2013.01.002.
[15] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” arXiv, 2020.
[16] W. Liu, “SSD: Single Shot MultiBox Detector,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 9905 LNCS, pp. 21–37, 2015. DOI: 10.1007/978-3-319-46448-0_2.
[17] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, Aggregated Residual Transformations for Deep Neural Networks.

4 pages
Ev Dh2010 PDF
No ratings yet
Ev Dh2010 PDF
2 pages
ML - AI Roadmap
No ratings yet
ML - AI Roadmap
14 pages
Sap Credit Management FSCM Overview PDF
100% (1)
Sap Credit Management FSCM Overview PDF
85 pages
CMAT Pamphlet Questions
No ratings yet
CMAT Pamphlet Questions
6 pages
Samplepaper 1
No ratings yet
Samplepaper 1
2 pages
2010 PAARL Library Standards: A Draft Proposal
No ratings yet
2010 PAARL Library Standards: A Draft Proposal
35 pages
Double-Seat-Valves DSV-Complete Sudmo Brochure
No ratings yet
Double-Seat-Valves DSV-Complete Sudmo Brochure
16 pages
EL3202 - 2-Channel Input Terminal PT100 (RTD) For 2-Or 3-Wire Connection
No ratings yet
EL3202 - 2-Channel Input Terminal PT100 (RTD) For 2-Or 3-Wire Connection
2 pages
Vyas Und Biyala
No ratings yet
Vyas Und Biyala
167 pages
Fair Valuation-US Daily Check
No ratings yet
Fair Valuation-US Daily Check
5 pages
Revision Sheet
No ratings yet
Revision Sheet
20 pages
Nature of Transaction: AIR FARE TICKETS Nature of Transaction: AIR FARE TICKETS
No ratings yet
Nature of Transaction: AIR FARE TICKETS Nature of Transaction: AIR FARE TICKETS
1 page
Unit Iii Database Management System
No ratings yet
Unit Iii Database Management System
20 pages
120 hours Computer Course - ISO 9001 Certificate
No ratings yet
120 hours Computer Course - ISO 9001 Certificate
1 page
Tabel A18 Runs Test PDF
No ratings yet
Tabel A18 Runs Test PDF
2 pages
final print reporttt_removed
No ratings yet
final print reporttt_removed
26 pages
PETRONAS FUTURETECH Demo Day
No ratings yet
PETRONAS FUTURETECH Demo Day
26 pages
Introduction of Li-Fi Technology
No ratings yet
Introduction of Li-Fi Technology
4 pages
Final CTS Materials
No ratings yet
Final CTS Materials
85 pages
CIT853
No ratings yet
CIT853
2 pages
JSS 2150 Service Manual (1) Trang 6
No ratings yet
JSS 2150 Service Manual (1) Trang 6
1 page
Emerging Trends in Digital Marketing
No ratings yet
Emerging Trends in Digital Marketing
18 pages



978-1-7281-4587-7/21/$31.00 ©2021 IEEE

Many neural networks designed for preliminary detection

Fig. 1. The process of detecting objects in the image [6].

B. Converting a model as a firmware to an FPGA
