University of Twente Student Theses


FedNIP : A Statistical Heterogeneity Aware Dynamic Ranking Algorithm for Federated Learning

Zagema, S.G.C. (2024) FedNIP : A Statistical Heterogeneity Aware Dynamic Ranking Algorithm for Federated Learning.

Abstract: Federated Learning (FL) is a cutting-edge approach to Machine Learning (ML) that trains models in a decentralized way, without centralizing the raw data. This protects the privacy of the client, because the actual data never leaves the device. However, a major challenge in FL is that clients often differ significantly in their local data distributions, which slows convergence and reduces accuracy. To address this issue, a novel FL algorithm called Federated Non-IID Performance (FedNIP) is proposed. FedNIP is a dynamic, ranking-based exploration algorithm that prioritizes clients based on their impact on the global model. Unlike other FL algorithms, FedNIP dynamically updates client performance and ranks clients to prevent bias in the training. Clustering is used to group clients with similar data distributions, after which the clusters are passed to the global model for training. A proxy model is used to rank the clients by performance, and only the weights of the best-performing clients are used to train the global model. Experimental results on the CIFAR-10 dataset show that, in highly heterogeneous environments, FedNIP outperforms FedAvg, the most established FL algorithm, and matches FedProx, the most established Non-IID FL algorithm. Scaling the number of clients from 50 to 250 does not change the results. Furthermore, the FedNIP strategy of using only a subset of the clients (the top 10% performing clients in a cluster plus 10% of random clients in that cluster) performs similarly to FedNIP with all clients in a cluster utilized. This outcome means that examining a subset of clients within a cluster provides a reliable indication of the performance of the entire cluster, which drastically reduces the number of clients needed.
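The subset strategy described above (top-ranked clients plus a random share of a cluster) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name select_clients, the score dictionary, the rounding rule, and the choice to draw the random share from the clients that were not already picked are all assumptions made for illustration only.

```python
import random


def select_clients(scores, top_frac=0.10, random_frac=0.10, seed=None):
    """Split one cluster into exploited (top-ranked) and explored (random) clients.

    scores maps a client id to a performance score, higher is better.
    Hypothetical helper: names, rounding, and sampling pool are assumptions.
    """
    rng = random.Random(seed)
    # Rank from best to worst; ties are broken by client id so the order is stable.
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    n = len(ranked)

    # Take the requested share of best performers (at least one if the cluster is non-empty).
    n_top = min(n, max(1, round(top_frac * n)))
    top = ranked[:n_top]

    # Draw the random share from the remaining clients only (assumption).
    rest = ranked[n_top:]
    n_rand = min(len(rest), max(1, round(random_frac * n)))
    explored = rng.sample(rest, n_rand)
    return top, explored


# Toy cluster of 20 clients with made-up scores 0.00, 0.05, ..., 0.95.
cluster_scores = {f"client_{i:02d}": i / 20 for i in range(20)}
top, explored = select_clients(cluster_scores, seed=0)
print(top)       # the two highest-scoring clients
print(explored)  # two clients sampled from the remaining 18
```

With 20 clients per cluster, this selects 4 clients per round instead of 20, which is the kind of reduction the abstract refers to. How the scores are produced (the proxy model) and how clusters are formed are left out of the sketch.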
FedNIP runs four to eight times faster than FedAvg and FedProx, depending on the number of clients and the level of statistical heterogeneity. Future research should focus on integrating cluster performance as a sampling criterion for each round, instead of the current client proportion-based sampling strategy. Moreover, assessing FedNIP’s performance in real-time environments could provide valuable insights, given its capability to dynamically rank clients based on performance. The code for the implementation of FedNIP can be found here:
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)