University of Twente Student Theses


Accelerating selective sweep detection software with the GPU architecture

Corts, R. (2022) Accelerating selective sweep detection software with the GPU architecture.

Abstract: Selective sweep detection software processes sequenced genomic data to localize targets of recent and strong positive selection. These targets are found by analyzing Single-Nucleotide Polymorphisms (SNPs) in the genomic data, which is stored as Multiple Sequence Alignments (MSAs). Due to advances in DNA sequencing, the amount of available DNA data is increasing rapidly. This causes Bioinformatics workloads to become more complex and, above all, more computationally demanding. While multiple sequence alignment algorithms have been a major research topic in coping with the surge of genomic data, Bioinformatics algorithms further along the processing pipeline, such as selective sweep detection software, exhibit high execution times for these large amounts of genomic data. This master's thesis describes the acceleration of a state-of-the-art selective sweep detection tool called OmegaPlus [1], [2] using the GPU architecture. The goal of this project is to boost the performance of the tool by utilizing the massively parallel architecture of the GPU and to provide a stepping stone for further research on this topic. The project focuses on implementing an optimized GPU kernel in which several GPU acceleration techniques are applied. The developed solution extends OmegaPlus with GPU-acceleration capabilities using the OpenCL General-Purpose GPU (GPGPU) framework [3]. OmegaPlus is based on Linkage Disequilibrium (LD), which is the non-random association of SNPs at different positions in the genomic data. The tool implements the ω-statistic, which uses LD to accurately localize selective sweeps. Both the LD computation and the ω-statistic computation are compute-intensive parts of the tool, which together take up >95% of the total execution time. These compute-intensive parts are targeted for GPU acceleration.
The LD computation in OmegaPlus is accelerated using an adaptation of an existing, highly optimized tool that maps Dense Linear Algebra (DLA) operations onto the GPU architecture. The ω-statistic computation is accelerated using a novel approach that adapts dynamically to different workloads: two kernels have been developed, one for high and one for low ω-statistic workloads. A performance evaluation using simulated datasets showed that the GPU-accelerated ω-statistic computation is up to 3.37x faster, and the GPU-accelerated LD computation up to 33.75x faster, than the corresponding sequential implementations. The complete GPU-accelerated OmegaPlus version, including both the LD and the ω-statistic computation, showed speedups of up to 12.02x over the sequential OmegaPlus version. All comparisons used the same system for both versions. These speedups indicate a boost in performance, which could be improved further by applying additional acceleration techniques.
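To illustrate why LD maps naturally onto dense linear algebra, the following is a minimal CPU sketch in plain Python, not the thesis's OpenCL kernel. It assumes SNPs are encoded as rows of 0/1 alleles and uses the standard r² measure of LD; the function names `pairwise_counts` and `r_squared` are hypothetical and chosen only for this example. The key observation is that all pairwise co-occurrence counts are the entries of the matrix product of the SNP matrix with its own transpose, which is the kind of operation a GPU executes efficiently.

```python
def pairwise_counts(snps):
    """Return C where C[i][j] is the dot product of SNP rows i and j.

    For 0/1 encoded alleles this is the number of samples carrying the
    derived allele at both SNPs, i.e. the matrix product A * A^T.
    (Hypothetical helper, written out as explicit dot products.)
    """
    return [[sum(a * b for a, b in zip(si, sj)) for sj in snps] for si in snps]


def r_squared(snps):
    """Compute the pairwise r^2 LD matrix from 0/1 encoded SNP rows.

    Assumes every row has the same number of samples. Pairs involving a
    monomorphic SNP (allele frequency 0 or 1) are reported as 0.0.
    """
    n_samples = len(snps[0])
    counts = pairwise_counts(snps)
    m = len(snps)
    r2 = [[0.0] * m for _ in range(m)]
    for i in range(m):
        p_i = counts[i][i] / n_samples          # allele frequency at SNP i
        for j in range(m):
            p_j = counts[j][j] / n_samples      # allele frequency at SNP j
            p_ij = counts[i][j] / n_samples     # joint (haplotype) frequency
            denom = p_i * (1 - p_i) * p_j * (1 - p_j)
            if denom > 0:
                d = p_ij - p_i * p_j            # LD coefficient D
                r2[i][j] = d * d / denom
    return r2


# Three SNPs over four samples: SNP 0 and 1 are identical, SNP 2 is
# statistically independent of SNP 0.
example = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
ld = r_squared(example)
```

In this sketch the dot products dominate the cost, so replacing `pairwise_counts` with a single optimized matrix multiplication is where a DLA-based GPU implementation would gain its speed; the remaining per-pair arithmetic is independent for each (i, j) and parallelizes trivially.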
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 53 electrotechnology, 54 computer science
Programme: Embedded Systems MSc (60331)