Thwarting Semantic Backdoor Attacks in Privacy Preserving Federated Learning

Author(s): Möllering, Helen (2019)

Abstract:
Federated learning is a new approach to privacy-preserving machine learning that lets clients collaboratively train a shared machine-learning model while keeping all training data locally on the clients’ devices and sharing only model updates, which are aggregated at the server. Despite the improved data privacy compared to other distributed solutions, where (private) data is shared with other parties, current federated learning deployments are vulnerable to model-poisoning attacks that manipulate the training process so that the shared model exhibits malicious behavior. Concretely, it is possible to inject so-called “backdoors” into machine learning models such that the model behaves normally on regular data but causes a targeted misclassification on attacker-chosen samples. In this work, we study state-of-the-art backdoor attacks and defenses on federated learning. Furthermore, we introduce the first formal definitions for all types of backdoor attacks that have been proposed so far. We implement one of these attacks, namely “semantic backdooring”, and investigate its effectiveness, stealthiness, and durability in extensive experiments. Building on these insights, we explore the solution space for protecting against semantic backdoor attacks in the context of model poisoning. We define the requirements that a defense system should fulfill and propose three orthogonal techniques to detect malicious contributions. The first technique applies statistical methods to fine-grained misclassification distributions to amplify indications of poisoning. The second technique applies neural-network activation clustering to distinguish clean classes from infected ones. The third defensive layer is a client-driven feedback loop that increases the data available for these analyses. All three techniques can be flexibly operated individually or in concert.
We empirically analyze the impact of our three defense layers in extensive experiments, testing all defenses individually and in combination against state-of-the-art semantic backdoor attacks on federated learning with the CIFAR-10 data set. Finally, we evaluate the effectiveness of our proposals against the aforementioned requirements and elaborate on limitations and future work.
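To illustrate the idea behind the second defensive technique, activation clustering: the activations that a model produces for samples of a clean class tend to form a single blob, whereas an infected class contains a second, well-separated cluster of backdoored samples. The sketch below is not the thesis's implementation; the function names, the NumPy-only 2-means routine, and the separation threshold are illustrative assumptions.

```python
import numpy as np

def two_means(acts, iters=20, seed=0):
    """Plain Lloyd's k-means with k=2 (NumPy only, for illustration)."""
    rng = np.random.default_rng(seed)
    centers = acts[rng.choice(len(acts), size=2, replace=False)]
    for _ in range(iters):
        # Assign each activation vector to its nearest center.
        dists = np.linalg.norm(acts[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned vectors.
        for k in range(2):
            if (labels == k).any():
                centers[k] = acts[labels == k].mean(axis=0)
    return labels, centers

def class_looks_infected(acts, separation_threshold=3.0):
    """Flag a class whose activations split into two well-separated clusters.

    `separation_threshold` is an illustrative constant, not a value from the
    thesis: the class is flagged when the gap between the two cluster centers
    is large relative to the within-cluster spread.
    """
    labels, centers = two_means(acts)
    spread = np.mean([acts[labels == k].std() for k in range(2)])
    gap = np.linalg.norm(centers[0] - centers[1])
    return gap / (spread + 1e-9) > separation_threshold
```

On a clean class, forcing k=2 merely splits one blob into two adjacent halves, so the gap between centers stays comparable to the spread; on an infected class, the backdoored minority forms a distant cluster and the ratio becomes large.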

Document(s):

moellering_MA_eemcs.pdf