University of Twente Student Theses


CNN Accelerator Throughput Improvement using High-Level Synthesis for FPGA

Minnen, Matthijs van (2022) CNN Accelerator Throughput Improvement using High-Level Synthesis for FPGA.

Abstract: The number of applications for neural networks is growing, which increases the demand for processing power to run these networks. General-purpose solutions are available, but specialised hardware can provide better performance at a lower energy cost. An accelerator is developed for FPGA to increase the throughput of the convolutional layers of the YOLOv4 Tiny CNN. Catapult HLS is used to speed up development of the accelerator. Using HLS, a design is developed that is inspired by the Eyeriss architecture. As the tool does not natively infer DSP blocks in the design, a custom design flow is derived to instantiate these blocks to perform the multiply-accumulate (MACC) operations. With this implementation, a MACC operation is performed in 1 clock cycle. Using the Timeloop tool, a schedule is found that optimises the hardware usage for the given CNN. This yields 99% utilisation of the hardware. The hardware implementation is simplified to meet the throughput required to supply data for the MACC operations. With the optimised schedule and improved hardware, it is estimated that the accelerator provides a throughput of 4 GOPS, whilst simultaneously reducing the resource utilisation by ~30% compared to other works.
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 53 electrotechnology, 54 computer science
Programme: Electrical Engineering MSc (60353)