University of Twente Student Theses
Neural Network Backdoor Removal by Reconstructing Triggers and Pruning Channels
Koldenhof, Dylan (2023) Neural Network Backdoor Removal by Reconstructing Triggers and Pruning Channels.
Abstract: | Backdoor attacks on neural networks are a threat in safety-critical applications such as autonomous driving. Current backdoor defense methods are limited in effectiveness, speed, or insight into the nature of the attack. In this work, we propose a new backdoor defense method that combines trigger reconstruction with pruning, allowing for relatively fast mitigation while also giving insight into the nature of the attack. The method was evaluated on various model architectures trained on the GTSRB dataset, with a patch trigger and a blended trigger. On large networks, the proposed method performs better than the method of Dhonthi et al. and CLP, both of which served as inspiration for the new method. However, the method lacks consistency: its performance varies significantly even across models of the same architecture, trained with the same backdoor and dataset, that differ only in their weight initialization. This inconsistency was also observed for the other evaluated defense methods. Furthermore, the new method is faster than the similar method of Dhonthi et al. and reconstructs the backdoor triggers reasonably well. The modular nature of the proposed method allows for many directions for improvement in future work. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science MSc (60300) |
Link to this item: | https://purl.utwente.nl/essays/97198 |