Probabilistic Inference for Learning from Untrusted Sources
- URL: http://arxiv.org/abs/2101.06171v1
- Date: Fri, 15 Jan 2021 15:30:06 GMT
- Title: Probabilistic Inference for Learning from Untrusted Sources
- Authors: Duc Thien Nguyen, Shiau Hoong Lim, Laura Wynter and Desmond Cai
- Abstract summary: Federated learning brings potential benefits of faster learning, better solutions, and a greater propensity to transfer when heterogeneous data from different parties increases diversity.
It is important for the aggregation algorithm to be robust to non-IID data and corrupted parties.
Recent work assumes that a \textit{reference dataset} is available through which to perform the identification.
We consider settings where no such reference dataset is available; rather, the quality and suitability of the parties need to be \textit{inferred}.
We propose novel federated learning aggregation algorithms based on Bayesian inference that adapt to the quality of the parties.
- Score: 6.811310452498163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning brings potential benefits of faster learning, better
solutions, and a greater propensity to transfer when heterogeneous data from
different parties increases diversity. However, because federated learning
tasks tend to be large and complex, and training times non-negligible, it is
important for the aggregation algorithm to be robust to non-IID data and
corrupted parties. This robustness relies on the ability to identify, and
appropriately weight, incompatible parties. Recent work assumes that a
\textit{reference dataset} is available through which to perform the
identification. We consider settings where no such reference dataset is
available; rather, the quality and suitability of the parties need to be
\textit{inferred}. We do so by bringing ideas from crowdsourced predictions and
collaborative filtering, where one must infer an unknown ground truth given
proposals from participants with unknown quality. We propose novel federated
learning aggregation algorithms based on Bayesian inference that adapt to the
quality of the parties. Empirically, we show that the algorithms outperform
standard and robust aggregation in federated learning on both synthetic and
real data.
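The abstract describes Bayesian aggregation that infers party quality without a reference dataset, in the spirit of crowdsourced-prediction methods. The sketch below is a minimal illustration of that idea, not the paper's actual algorithm: it alternates, EM-style, between a consensus estimate and per-party reliabilities (inverse residual variances), so that corrupted or incompatible parties are down-weighted automatically. The function name and the Gaussian-noise assumption are illustrative choices.

```python
import numpy as np

def bayesian_weighted_aggregate(updates, n_iters=10, eps=1e-8):
    """EM-style aggregation sketch: alternately estimate a consensus model
    and per-party reliabilities (inverse noise variances), so low-quality
    or corrupted parties are down-weighted without a reference dataset.

    updates: array of shape (n_parties, n_params), flattened model updates.
    Returns (consensus, weights).
    """
    updates = np.asarray(updates, dtype=float)
    n_parties = updates.shape[0]
    weights = np.full(n_parties, 1.0 / n_parties)  # uniform prior over parties
    for _ in range(n_iters):
        consensus = weights @ updates              # current weighted consensus
        # residual variance of each party around the consensus
        var = np.mean((updates - consensus) ** 2, axis=1) + eps
        prec = 1.0 / var                           # reliability = precision
        weights = prec / prec.sum()                # renormalize to a posterior weight
    return weights @ updates, weights
```

With four honest parties near the true model and one far-off corrupted party, the corrupted party's weight collapses toward zero and the consensus stays near the honest cluster, whereas a plain average would be pulled far away.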
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem; in fact, it can be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - Non-IID data and Continual Learning processes in Federated Learning: A
long road ahead [58.720142291102135]
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private.
In this work, we formally classify data statistical heterogeneity and review the most notable learning strategies able to address it.
At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could be easily adapted to the Federated Learning settings.
arXiv Detail & Related papers (2021-11-26T09:57:11Z) - On Covariate Shift of Latent Confounders in Imitation and Reinforcement
Learning [69.48387059607387]
We consider the problem of using expert data with unobserved confounders for imitation and reinforcement learning.
We analyze the limitations of learning from confounded expert data with and without external reward.
We validate our claims empirically on challenging assistive healthcare and recommender system simulation tasks.
arXiv Detail & Related papers (2021-10-13T07:31:31Z) - Weight Divergence Driven Divide-and-Conquer Approach for Optimal
Federated Learning from non-IID Data [0.0]
Federated Learning allows training of data stored in distributed devices without the need for centralizing training data.
We propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm.
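The summary above references FedAvg, the standard federated aggregation algorithm (McMahan et al.). As a point of reference for the aggregation methods discussed throughout this list, here is a minimal sketch of plain FedAvg: client parameters are averaged, weighted by local dataset size. The function name and array-based representation are illustrative.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Plain FedAvg sketch: average client model parameters, weighted by
    the number of local training samples each client holds."""
    client_params = np.asarray(client_params, dtype=float)
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()   # each client's share of the total data
    return coeffs @ client_params  # size-weighted parameter average
```

For example, two clients with parameters [0, 0] and [2, 2] and dataset sizes 1 and 3 yield the aggregate [1.5, 1.5], reflecting the larger client's 3/4 share of the data.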
arXiv Detail & Related papers (2021-06-28T09:34:20Z) - Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) methods are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z) - Robustness and Personalization in Federated Learning: A Unified Approach
via Regularization [4.7234844467506605]
We present a class of methods for robust, personalized federated learning, called Fed+.
The principal advantage of Fed+ is that it better accommodates the real-world characteristics found in federated training.
We demonstrate the benefits of Fed+ through extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-09-14T10:04:30Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - On the Sample Complexity of Adversarial Multi-Source PAC Learning [46.24794665486056]
In a single-source setting, an adversary with the power to corrupt a fixed fraction of the training data can prevent PAC-learnability.
We show that, surprisingly, the same is not true in the multi-source setting, where the adversary can arbitrarily corrupt a fixed fraction of the data sources.
Our results also show that in a cooperative learning setting sharing data with other parties has provable benefits, even if some participants are malicious.
arXiv Detail & Related papers (2020-02-24T17:19:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.