Probabilistic Inference for Learning from Untrusted Sources
- URL: http://arxiv.org/abs/2101.06171v1
- Date: Fri, 15 Jan 2021 15:30:06 GMT
- Title: Probabilistic Inference for Learning from Untrusted Sources
- Authors: Duc Thien Nguyen, Shiau Hoong Lim, Laura Wynter and Desmond Cai
- Abstract summary: Federated learning brings potential benefits of faster learning, better solutions, and a greater propensity to transfer when heterogeneous data from different parties increases diversity.
It is important for the aggregation algorithm to be robust to non-IID data and corrupted parties.
Recent work assumes that a \textit{reference dataset} is available through which to perform the identification.
We consider settings where no such reference dataset is available; rather, the quality and suitability of the parties need to be \textit{inferred}.
We propose novel federated learning aggregation algorithms based on Bayesian inference that adapt to the quality of the parties.
- Score: 6.811310452498163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning brings potential benefits of faster learning, better
solutions, and a greater propensity to transfer when heterogeneous data from
different parties increases diversity. However, because federated learning
tasks tend to be large and complex, and training times non-negligible, it is
important for the aggregation algorithm to be robust to non-IID data and
corrupted parties. This robustness relies on the ability to identify, and
appropriately weight, incompatible parties. Recent work assumes that a
\textit{reference dataset} is available through which to perform the
identification. We consider settings where no such reference dataset is
available; rather, the quality and suitability of the parties need to be
\textit{inferred}. We do so by bringing ideas from crowdsourced predictions and
collaborative filtering, where one must infer an unknown ground truth given
proposals from participants with unknown quality. We propose novel federated
learning aggregation algorithms based on Bayesian inference that adapt to the
quality of the parties. Empirically, we show that the algorithms outperform
standard and robust aggregation in federated learning on both synthetic and
real data.
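The abstract describes Bayesian aggregation that infers party quality without a reference dataset, in the spirit of crowdsourced-prediction methods. The sketch below is a minimal illustration of that idea, not the paper's actual algorithm: it alternates, EM-style, between a consensus estimate and per-party reliabilities (inverse residual variances), so that corrupted or incompatible parties are down-weighted automatically. The function name and the Gaussian-noise assumption are illustrative choices.

```python
import numpy as np

def bayesian_weighted_aggregate(updates, n_iters=10, eps=1e-8):
    """EM-style aggregation sketch: alternately estimate a consensus model
    and per-party reliabilities (inverse noise variances), so low-quality
    or corrupted parties are down-weighted without a reference dataset.

    updates: array of shape (n_parties, n_params), flattened model updates.
    Returns (consensus, weights).
    """
    updates = np.asarray(updates, dtype=float)
    n_parties = updates.shape[0]
    weights = np.full(n_parties, 1.0 / n_parties)  # uniform prior over parties
    for _ in range(n_iters):
        consensus = weights @ updates              # current weighted consensus
        # residual variance of each party around the consensus
        var = np.mean((updates - consensus) ** 2, axis=1) + eps
        prec = 1.0 / var                           # reliability = precision
        weights = prec / prec.sum()                # renormalize to a posterior weight
    return weights @ updates, weights
```

With four honest parties near the true model and one far-off corrupted party, the corrupted party's weight collapses toward zero and the consensus stays near the honest cluster, whereas a plain average would be pulled far away.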
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem; in fact, it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - Non-IID data and Continual Learning processes in Federated Learning: A
long road ahead [58.720142291102135]
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most notable learning strategies able to address it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could be easily adapted to the Federated Learning settings.
arXiv Detail & Related papers (2021-11-26T09:57:11Z) - On Covariate Shift of Latent Confounders in Imitation and Reinforcement
Learning [69.48387059607387]
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning.
We analyze the limitations of learning from confounded expert data with and without external reward.
We validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
arXiv Detail & Related papers (2021-10-13T07:31:31Z) - Weight Divergence Driven Divide-and-Conquer Approach for Optimal
Federated Learning from non-IID Data [0.0]
Federated Learning allows training of data stored in distributed devices without the need for centralizing training data.
We propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm.
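The summary above references FedAvg, the standard federated aggregation algorithm (McMahan et al.). As a point of reference for the aggregation methods discussed throughout this list, here is a minimal sketch of plain FedAvg: client parameters are averaged, weighted by local dataset size. The function name and array-based representation are illustrative.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Plain FedAvg sketch: average client model parameters, weighted by
    the number of local training samples each client holds."""
    client_params = np.asarray(client_params, dtype=float)
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()   # each client's share of the total data
    return coeffs @ client_params  # size-weighted parameter average
```

For example, two clients with parameters [0, 0] and [2, 2] and dataset sizes 1 and 3 yield the aggregate [1.5, 1.5], reflecting the larger client's 3/4 share of the data.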
arXiv Detail & Related papers (2021-06-28T09:34:20Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) methods are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - Robustness and Personalization in Federated Learning: A Unified Approach
via Regularization [4.7234844467506605]
We present a class of methods for robust, personalized federated learning, called Fed+.
The principal advantage of Fed+ is that it better accommodates the real-world characteristics found in federated training.
We demonstrate the benefits of Fed+ through extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-09-14T10:04:30Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - On the Sample Complexity of Adversarial Multi-Source PAC Learning [46.24794665486056]
In a single-source setting, an adversary with the power to corrupt a fixed fraction of the training data can prevent PAC-learnability.
We show that, surprisingly, the same is not true in the multi-source setting, where the adversary can arbitrarily corrupt a fixed fraction of the data sources.
Our results also show that in a cooperative learning setting sharing data with other parties has provable benefits, even if some participants are malicious.
arXiv Detail & Related papers (2020-02-24T17:19:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.