University of Twente Student Theses


Software caching for tree-based algorithms on accelerator cards

Knoben, P.A.H. (2021) Software caching for tree-based algorithms on accelerator cards.

[img] PDF
Abstract:Modern hardware accelerator cards create an accessible platform for developers to reduce execution times for computationally expensive algorithms. A powerful computing system can be created by combining a hardware accelerator and a processor. In this system the processor facilitates the overhead control of an algorithm and the hardware accelerator provides computational power for the intensive calculation in the algorithm. Such a hybrid system also has a downside. Most widely used systems have dedicated memory spaces, resulting in the processor having to transfer data to the accelerator card memory space before the computation can be executed. Currently the performance increase from using a accelerator card for data-intensive algorithms is limited by the data movement, this is called the memory bottleneck. This research aims to reduce the effect of this memory bottleneck and improve overall performance by caching data on the accelerator memory. Caching is a proven technique in other fields as a method to reduce data movements and shows potential for hardware accelerator cards in certain use cases. Caching exploits the locality of data to reduce data movements. Tree-based algorithms inherently provide this locality of data due to the structure of the tree. The goal is to design a software based cache running on the host processor to utilize the accelerator card memory space as cache memory to verify that caching can alleviate the effect of the memory bottleneck. The designed cache is tested on the Phylogenetic maximum likelihood function, a data intensive tree-based algorithms to calculate evolutionary relations between different species based on genetic characteristics. This algorithm has shown to be susceptive to the memory bottleneck. The tests showed that in the best scenarios the number of data movements from host to accelerator card memory for the phylogenetic likelihood function is reduced by 90%. This provides a reduction in the total execution time between 31.6% and 39.9% for different tree sizes. Given the results caching on hardware accelerator cards shows potential in reducing the memory bottleneck for tree-based algorithms and is worth further investigating on other algorithms.
Item Type:Essay (Master)
Faculty:EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:50 technical science in general, 54 computer science
Programme:Embedded Systems MSc (60331)
Link to this item:
Export this item as:BibTeX
HTML Citation
Reference Manager


Repository Staff Only: item control page