Gradient Coreset for Federated Learning
- URL: http://arxiv.org/abs/2401.06989v1
- Date: Sat, 13 Jan 2024 06:17:17 GMT
- Title: Gradient Coreset for Federated Learning
- Authors: Durga Sivasubramanian, Lokesh Nagalapatti, Rishabh Iyer, Ganesh Ramakrishnan
- Abstract summary: Federated Learning (FL) is used to learn machine learning models with data partitioned across multiple clients.
We propose an algorithm that selects a coreset at each client, only every $K$ communication rounds.
We demonstrate that our coreset selection technique is highly effective in accounting for noise in clients' data.
- Score: 27.04322811181904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) is used to learn machine learning models with data
that is partitioned across multiple clients, including resource-constrained
edge devices. It is therefore important to devise solutions that are efficient
in terms of compute, communication, and energy consumption, while ensuring
compliance with the FL framework's privacy requirements. Conventional
approaches to these problems select a weighted subset of the training dataset,
known as a coreset, and learn by fitting models on it. Such coreset selection
approaches are also known to be robust to data noise. However, these approaches
rely on the overall statistics of the training data and are not easily
extendable to the FL setup.
In this paper, we propose an algorithm called Gradient based Coreset for
Robust and Efficient Federated Learning (GCFL) that selects a coreset at each
client, only every $K$ communication rounds and derives updates only from it,
assuming the availability of a small validation dataset at the server. We
demonstrate that our coreset selection technique is highly effective in
accounting for noise in clients' data. We conduct experiments using four
real-world datasets and show that GCFL is (1) more compute and energy efficient
than FL, (2) robust to various kinds of noise in both the feature space and
labels, (3) preserves the privacy of the validation dataset, and (4) introduces
a small communication overhead but achieves significant gains in performance,
particularly in cases when the clients' data is noisy.
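
As a rough illustration of the idea in the abstract (a coreset chosen per client and refreshed only every $K$ rounds), the following is a minimal sketch and not the authors' GCFL algorithm. It assumes each client can compute per-sample gradients as plain lists of floats, and that a target gradient is available as a stand-in for a signal derived from the server's validation data. The names `select_coreset` and `should_reselect` are hypothetical, and the sketch uses equal weights for selected samples rather than the weighted subset the abstract describes.

```python
from typing import List, Sequence

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_coreset(per_sample_grads: List[Vector],
                   target_grad: Vector,
                   budget: int) -> List[int]:
    """Greedily pick up to `budget` sample indices whose mean gradient
    is as close as possible to `target_grad` (hypothetical helper)."""
    chosen: List[int] = []
    running_sum = [0.0] * len(target_grad)
    remaining = set(range(len(per_sample_grads)))
    for _ in range(min(budget, len(per_sample_grads))):
        best_idx, best_err = -1, float("inf")
        n = len(chosen) + 1
        for i in sorted(remaining):
            # Mean gradient of the coreset if sample i were added.
            mean = [(s + g) / n for s, g in zip(running_sum, per_sample_grads[i])]
            err = _sq_dist(mean, target_grad)
            if err < best_err:
                best_idx, best_err = i, err
        chosen.append(best_idx)
        remaining.remove(best_idx)
        running_sum = [s + g for s, g in zip(running_sum, per_sample_grads[best_idx])]
    return chosen


def should_reselect(round_idx: int, K: int) -> bool:
    """Refresh the coreset only every K communication rounds."""
    return round_idx % K == 0


# Toy data: sample 2 has an outlier gradient, as a noisy example might.
grads = [[1.0, 0.0], [0.9, 0.1], [-5.0, 4.0], [1.2, -0.2]]
target = [1.0, 0.0]
coreset = select_coreset(grads, target, budget=2)
print(coreset)  # prints [0, 1]
```

In this toy run the outlier sample is never selected because adding it would pull the mean gradient far from the target, which is the intuition behind using gradient matching to down-weight noisy data. Between refresh rounds a client would simply reuse the previously chosen indices.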
Related papers
- Optimizing Federated Learning by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance.
FedOptEnt is designed to mitigate performance issues caused by label distribution skew.
The proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy.
arXiv Detail & Related papers (2024-11-02T13:31:36Z)
- TPFL: Tsetlin-Personalized Federated Learning with Confidence-Based Clustering [0.0]
We propose a novel approach called Tsetlin-Personalized Federated Learning.
In this way, models are grouped into clusters based on their confidence towards a specific class.
Clients share only what they are confident about, resulting in the elimination of wrongful weight aggregation.
Results demonstrated that TPFL performs better than baseline methods, with 98.94% accuracy on MNIST, 98.52% accuracy on FashionMNIST, and 91.16% accuracy on the FEMNIST dataset.
arXiv Detail & Related papers (2024-09-16T15:27:35Z)
- FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH (Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity.
It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
- Communication Efficient and Provable Federated Unlearning [43.178460522012934]
We study federated unlearning, a novel problem of eliminating the impact of specific clients or data points on the global model learned via federated learning (FL).
This problem is driven by the right to be forgotten and the privacy challenges in FL.
We introduce a new framework for exact federated unlearning that meets two essential criteria: communication efficiency and exact unlearning provability.
arXiv Detail & Related papers (2024-01-19T20:35:02Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- FedDBL: Communication and Data Efficient Federated Deep-Broad Learning for Histopathological Tissue Classification [65.7405397206767]
We propose Federated Deep-Broad Learning (FedDBL) to achieve superior classification performance with limited training samples and only one-round communication.
FedDBL greatly outperforms the competitors with only one-round communication and limited training samples, while it even achieves comparable performance with the ones under multiple-round communications.
Since no data or deep model is shared across clients, the privacy issue is well addressed and model security is guaranteed with no risk of model inversion attacks.
arXiv Detail & Related papers (2023-02-24T14:27:41Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- ON-DEMAND-FL: A Dynamic and Efficient Multi-Criteria Federated Learning Client Deployment Scheme [37.099990745974196]
We introduce On-Demand-FL, a client deployment approach for federated learning.
We make use of containerization technology such as Docker to build efficient environments.
The Genetic algorithm (GA) is used to solve the multi-objective optimization problem.
arXiv Detail & Related papers (2022-11-05T13:41:19Z)
- DReS-FL: Dropout-Resilient Secure Federated Learning for Non-IID Clients via Secret Data Sharing [7.573516684862637]
Federated learning (FL) strives to enable collaborative training of machine learning models without centrally collecting clients' private data.
This paper proposes a Dropout-Resilient Secure Federated Learning framework based on Lagrange computing.
We show that DReS-FL is resilient to client dropouts and provides privacy protection for the local datasets.
arXiv Detail & Related papers (2022-10-06T05:04:38Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
- Federated Noisy Client Learning [105.00756772827066]
Federated learning (FL) collaboratively aggregates a shared global model depending on multiple local clients.
Standard FL methods ignore the noisy client issue, which may harm the overall performance of the aggregated model.
We propose Federated Noisy Client Learning (Fed-NCL), which is a plug-and-play algorithm and contains two main components.
arXiv Detail & Related papers (2021-06-24T11:09:17Z)
- CatFedAvg: Optimising Communication-efficiency and Classification Accuracy in Federated Learning [2.2172881631608456]
We introduce a new family of Federated Learning algorithms called CatFedAvg.
It not only improves communication efficiency but also improves the quality of learning using a category coverage maximization strategy.
Our experiments show an accuracy increase of 10% absolute points on the MNIST dataset with 70% absolute points lower network transfer than FedAvg.
arXiv Detail & Related papers (2020-11-14T06:52:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.