Learnings from Federated Learning in the Real world
- URL: http://arxiv.org/abs/2202.03925v1
- Date: Tue, 8 Feb 2022 15:21:31 GMT
- Title: Learnings from Federated Learning in the Real world
- Authors: Christophe Dupuy, Tanya G. Roosta, Leo Long, Clement Chung, Rahul
Gupta, Salman Avestimehr
- Abstract summary: Federated Learning (FL) applied to real-world data may suffer from several idiosyncrasies.
Data across devices could be distributed such that there are some "heavy devices" with large amounts of data while there are many "light users" with only a handful of data points.
We evaluate the impact of such idiosyncrasies on Natural Language Understanding (NLU) models trained using FL.
- Score: 19.149989896466852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) applied to real-world data may suffer from several
idiosyncrasies. One such idiosyncrasy is the data distribution across devices.
Data across devices could be distributed such that there are some "heavy
devices" with large amounts of data while there are many "light users" with
only a handful of data points. There also exists heterogeneity of data across
devices. In this study, we evaluate the impact of such idiosyncrasies on
Natural Language Understanding (NLU) models trained using FL. We conduct
experiments on data obtained from a large-scale NLU system serving thousands of
devices and show that simple non-uniform device selection based on the number
of interactions at each round of FL training boosts the performance of the
model. This benefit is further amplified in continual FL on consecutive time
periods, where non-uniform sampling manages to swiftly catch up with FL methods
using all data at once.
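A minimal sketch of the non-uniform device selection described in the abstract, assuming a FedAvg-style training loop in which each device reports an interaction count; the helper name and the `interaction_counts` layout below are illustrative placeholders, not the authors' implementation:

```python
# Sample devices with probability proportional to interaction count, so
# "heavy" devices are picked more often than "light" ones at each round.
import numpy as np

def sample_devices(interaction_counts, num_selected, rng=None):
    """Weighted sampling without replacement over the device pool."""
    rng = rng or np.random.default_rng()
    device_ids = list(interaction_counts)
    counts = np.array([interaction_counts[d] for d in device_ids], dtype=float)
    probs = counts / counts.sum()
    chosen = rng.choice(len(device_ids), size=num_selected, replace=False, p=probs)
    return [device_ids[i] for i in chosen]

# Example: a few heavy devices among many light ones.
counts = {f"dev{i}": (500 if i < 3 else 5) for i in range(20)}
print(sample_devices(counts, num_selected=5))
```

In continual FL over consecutive time periods, the same weighted draw would simply be recomputed from the latest interaction counts before each round.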
Related papers
- FS-Real: Towards Real-World Cross-Device Federated Learning [60.91678132132229]
Federated Learning (FL) aims to train high-quality models in collaboration with distributed clients while not uploading their local data.
There is still a considerable gap between the flourishing FL research and real-world scenarios, mainly caused by the characteristics and scale of heterogeneous devices.
We propose an efficient and scalable prototyping system for real-world cross-device FL, FS-Real.
arXiv Detail & Related papers (2023-03-23T15:37:17Z)
- Federated Learning and Meta Learning: Approaches, Applications, and Directions [94.68423258028285]
In this tutorial, we present a comprehensive review of FL, meta learning, and federated meta learning (FedMeta).
Unlike other tutorial papers, our objective is to explore how FL, meta learning, and FedMeta methodologies can be designed, optimized, and evolved, and their applications over wireless networks.
arXiv Detail & Related papers (2022-10-24T10:59:29Z)
- FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings [51.09574369310246]
Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models.
We propose a novel cross-silo dataset suite focused on healthcare, FLamby, to bridge the gap between theory and practice of cross-silo FL.
Our flexible and modular suite allows researchers to easily download datasets, reproduce results and re-use the different components for their research.
arXiv Detail & Related papers (2022-10-10T12:17:30Z)
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on FL performance remains unexplored.
In this work, we take the first step toward online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
- FLAME: Federated Learning Across Multi-device Environments [9.810211000961647]
Federated Learning (FL) enables distributed training of machine learning models while keeping personal data on user devices private.
We propose FLAME, a user-centered FL training approach to counter statistical and system heterogeneity in multi-device environments.
Our experiment results show that FLAME outperforms various baselines, with a 4.8-33.8% higher F-1 score, 1.02-2.86x greater energy efficiency, and up to a 2.02x speedup in convergence.
arXiv Detail & Related papers (2022-02-17T22:23:56Z)
- Multi-Center Federated Learning [62.32725938999433]
Federated learning (FL) can protect data privacy in distributed learning.
It merely collects local gradients from users without access to their data.
We propose a novel multi-center aggregation mechanism.
arXiv Detail & Related papers (2021-08-19T12:20:31Z)
- Federated Learning-based Active Authentication on Mobile Devices [98.23904302910022]
User active authentication on mobile devices aims to learn a model that can correctly recognize the enrolled user based on device sensor information.
We propose a novel user active authentication training scheme, termed Federated Active Authentication (FAA).
We show that existing FL/SL methods are suboptimal for FAA because they assume the data are distributed homogeneously across devices.
arXiv Detail & Related papers (2021-04-14T22:59:08Z)
- FLaPS: Federated Learning and Privately Scaling [3.618133010429131]
Federated learning (FL) is a distributed learning process in which the model is transferred to the devices that possess data.
We present Federated Learning and Privately Scaling (FLaPS) architecture, which improves scalability as well as the security and privacy of the system.
arXiv Detail & Related papers (2020-09-13T14:20:17Z)
- Federated Visual Classification with Real-World Data Distribution [9.564468846277366]
We characterize the effect real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm.
We introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits.
We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
arXiv Detail & Related papers (2020-03-18T07:55:49Z)
- Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data [31.85853956347045]
On-device machine learning (ML) enables the training process to exploit a massive amount of user-generated private data samples.
We propose federated distillation (FD), a distributed model training algorithm whose communication payload size is much smaller than that of a benchmark scheme, federated learning (FL); a sketch of the logit-exchange idea follows this list.
We show FD with FAug yields around 26x less communication overhead while achieving 95-98% test accuracy compared to FL.
arXiv Detail & Related papers (2018-11-28T10:16:18Z)
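Following up the forward reference in the FD entry above: a minimal sketch of the logit-exchange step, in which devices upload per-label average model outputs rather than model weights. Shapes, names, and the distillation loss hinted at in the comments are assumptions, not the paper's actual code:

```python
# Devices exchange small per-label logit tables instead of full model weights,
# which is what shrinks the communication payload relative to FL.
import numpy as np

NUM_LABELS = 10

def local_logit_summary(logits, labels):
    """Average a device's logits per ground-truth label; this small
    (NUM_LABELS x NUM_LABELS) table is all the device uploads."""
    summary = np.zeros((NUM_LABELS, NUM_LABELS))
    for y in range(NUM_LABELS):
        mask = labels == y
        if mask.any():
            summary[y] = logits[mask].mean(axis=0)
    return summary

def aggregate_summaries(summaries):
    """Server side: average the per-label logit tables across devices."""
    return np.mean(np.stack(summaries), axis=0)

# Each device then distills locally, e.g. by adding a regularizer that pulls
# its own logits toward aggregate_summaries(...)[y] for samples with label y.
```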