University of Twente Student Theses

SPaS: Sparse Parameterized Shortcut Connections for Dynamic Sparse-to-Sparse Training

Muller, M.W. (2023) SPaS: Sparse Parameterized Shortcut Connections for Dynamic Sparse-to-Sparse Training.

Abstract: Sparse Neural Networks (SNNs) have proven to be an effective way to both reduce computational cost and improve performance in Neural Networks (NNs). Sparse-to-sparse (STS) training methods have superseded pruning methods by allowing an optimal topology to be found during training. Dynamic Sparse Training (DST) methods such as SET have quadratically reduced the number of parameters with no decrease in accuracy, while shortcut connections have enabled deeper and better-performing networks such as ResNets and DenseNets. Recent work has investigated sparsity for shortcut connections and has demonstrated both improved performance and reduced parameter counts over purely sparse sequential models. In this thesis, we introduce Sparse Parameterized Shortcut Connections (SPaS), which combine the principles of sparsity and shortcut connections, and a training schema, SPaSET, that enables SPaS networks to be trained dynamically. SPaS improves information flow within networks and enables feature reuse. We apply SPaS to two typical deep learning architectures, Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs), and evaluate them on computer vision and numerical classification tasks. We find that SPaS with SPaSET enables MLPs to be compressed by over 25x without compromising performance, up from compression rates of 5x for plain MLPs. In CNNs, we find that SPaS with SPaSET improves performance in high-density regions at similar inference FLOPs, providing increases of 2-6% on the validation set compared to plain CNNs and DenseNets.
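The two ideas the abstract combines can be illustrated with a minimal sketch. It is not the thesis's SPaS or SPaSET implementation. It only assumes that a sparse layer can be stored as a dictionary of non-zero weights, that a parameterized shortcut is a second sparse weight matrix fed by an earlier layer, and that the topology update is a SET-style magnitude prune followed by random regrowth. All function names, the `zeta` prune fraction, and the initialisation ranges are hypothetical.

```python
import random

def random_sparse(n_in, n_out, density, rng):
    """Sparse weight matrix as {(row, col): weight} (hypothetical storage format)."""
    positions = [(i, j) for i in range(n_out) for j in range(n_in)]
    k = max(1, round(density * len(positions)))
    return {pos: rng.uniform(-1.0, 1.0) for pos in rng.sample(positions, k)}

def sparse_matvec(weights, x, n_out):
    """Multiply a sparse matrix by a dense vector, touching only stored weights."""
    y = [0.0] * n_out
    for (i, j), w in weights.items():
        y[i] += w * x[j]
    return y

def forward_with_shortcut(w_seq, w_skip, h_prev, h_earlier, n_out):
    """ReLU(W_seq @ h_prev + W_skip @ h_earlier).

    w_seq is the ordinary sequential connection; w_skip is an assumed form of a
    sparse, parameterized shortcut that reuses features from an earlier layer.
    """
    a = sparse_matvec(w_seq, h_prev, n_out)
    b = sparse_matvec(w_skip, h_earlier, n_out)
    return [max(0.0, u + v) for u, v in zip(a, b)]

def prune_and_regrow(weights, n_in, n_out, zeta, rng):
    """SET-style update: drop the zeta fraction of smallest-magnitude weights,
    then regrow the same number at random empty positions, so the parameter
    count stays constant while the topology changes."""
    n_drop = int(zeta * len(weights))
    by_magnitude = sorted(weights, key=lambda pos: abs(weights[pos]))
    for pos in by_magnitude[:n_drop]:
        del weights[pos]
    empty = [(i, j) for i in range(n_out) for j in range(n_in)
             if (i, j) not in weights]
    for pos in rng.sample(empty, n_drop):
        weights[pos] = rng.uniform(-0.1, 0.1)  # small re-initialisation (assumed)
    return weights
```

In such a sketch, the same `prune_and_regrow` step could be applied to both the sequential and the shortcut matrices after each training epoch. How SPaSET actually schedules and distributes these updates is described in the thesis itself.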
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)
Link to this item: https://purl.utwente.nl/essays/97292