University of Twente Student Theses
Real-Time YOLOv4 FPGA Design with Catapult High-Level Synthesis
Heinsius, L.R. (2021) Real-Time YOLOv4 FPGA Design with Catapult High-Level Synthesis.
Abstract: | State-of-the-art object detectors play a vital role in identifying and localizing objects in images, especially in recent years with the rise of autonomous systems. This work develops an FPGA-based design for YOLOv4, a real-time object detector based on a deep neural network (DNN). The design targets the ZedBoard, which integrates a Xilinx Zynq-7020 SoC. A single-core bare-metal application integrating the TensorFlow Lite Micro (TFLM) framework provides a base platform to run a quantized version of YOLOv4. Convolutional layers, which take 99.67% of the total execution time, are sped up by a proof-of-concept accelerator. The accelerator is based on the existing Eyeriss accelerator architecture. It is implemented in High-Level Synthesis (HLS) C++ and synthesized to RTL with the Catapult HLS Platform. Integrating the accelerator with the TFLM framework yields speedups of convolutional layers of up to 11.67 times, a reduction in energy consumption by a factor of 2.73, and bit-accurate output compared to the original algorithm. Although a speedup is realized, real-time performance is not achieved. This is due to the complex architecture of the Eyeriss accelerator, combined with the limited time set for this project and the limited resources available on the FPGA. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 50 technical science in general, 53 electrotechnology, 54 computer science |
Programme: | Embedded Systems MSc (60331) |
Link to this item: | https://purl.utwente.nl/essays/86465 |