University of Twente Student Theses

Neural Network Backdoor Removal by Reconstructing Triggers and Pruning Channels

Koldenhof, Dylan (2023) Neural Network Backdoor Removal by Reconstructing Triggers and Pruning Channels.

Abstract: Backdoor attacks on neural networks are a threat in safety-critical applications such as autonomous driving. Current backdoor defense methods are limited in effectiveness, speed, or the insight they give into the nature of the attack. In this work, we propose a new backdoor defense method that combines trigger reconstruction with pruning, allowing for relatively fast mitigation while also giving insight into the nature of the attack. The method was evaluated on various model architectures trained on the GTSRB dataset, with a patch trigger and a blended trigger. On large networks, the proposed method performs better than the methods of Dhonthi et al. and CLP, which served as inspiration for the new method. However, the method lacks consistency: its performance varies significantly even across models of the same architecture, trained with the same backdoor and dataset, that differ only in their weight initialization. The same inconsistency was observed for the other evaluated defense methods. Furthermore, the new method is faster than the similar method of Dhonthi et al. and reconstructs the backdoor triggers reasonably well. The modular nature of the proposed method leaves many directions for improvement in future work.
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)
Link to this item: https://purl.utwente.nl/essays/97198