Robust Online and Distributed Mean Estimation Under Adversarial Data Corruption
- URL: http://arxiv.org/abs/2209.09624v1
- Date: Sat, 17 Sep 2022 16:36:21 GMT
- Title: Robust Online and Distributed Mean Estimation Under Adversarial Data Corruption
- Authors: Tong Yao and Shreyas Sundaram
- Abstract summary: We study robust mean estimation in an online and distributed scenario in the presence of adversarial data attacks.
We provide error bounds and convergence properties of the estimates to the true mean under our algorithms.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study robust mean estimation in an online and distributed scenario in the
presence of adversarial data attacks. At each time step, each agent in a
network receives a potentially corrupted data point, where the data points were
originally independent and identically distributed samples of a random
variable. We propose online and distributed algorithms for all agents to
asymptotically estimate the mean. We provide error bounds and convergence
properties of the estimates to the true mean under our algorithms.
Based on the network topology, we further evaluate each agent's trade-off in
convergence rate between incorporating data from neighbors and learning with
only local observations.
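The abstract describes agents that robustly estimate a mean from a stream in which some data points may be adversarially corrupted. A standard building block for this kind of robustness is the trimmed mean, which discards the most extreme observations before averaging. The sketch below is a minimal illustration of that idea in an online setting (using a sliding buffer of recent samples); it is not the paper's exact algorithm, and the class and parameter names are hypothetical.

```python
import numpy as np

def trimmed_mean(samples, trim_fraction=0.1):
    """Discard the smallest and largest trim_fraction of samples,
    then average the rest (a standard robust estimator)."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = int(len(x) * trim_fraction)
    if k == 0:
        return float(x.mean())
    return float(x[k:-k].mean())

class OnlineTrimmedMean:
    """Illustrative online scheme: keep a bounded buffer of recent
    observations and re-estimate the mean with trimming at each step."""
    def __init__(self, trim_fraction=0.1, buffer_size=1000):
        self.trim_fraction = trim_fraction
        self.buffer_size = buffer_size
        self.buffer = []

    def update(self, x):
        self.buffer.append(x)
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)  # drop the oldest observation
        return trimmed_mean(self.buffer, self.trim_fraction)
```

With `trim_fraction=0.2`, a single large corrupted value among ten clean samples is discarded before averaging, so the estimate stays near the clean mean.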
Related papers
- Differentially-Private Collaborative Online Personalized Mean Estimation [22.399703712241546]
We consider the problem of collaborative personalized mean estimation under a privacy constraint.
Two privacy mechanisms and two data variance estimation schemes are proposed.
We show that collaboration provides faster convergence than a fully local approach.
arXiv Detail & Related papers (2024-11-11T16:14:56Z)
- ROSS: RObust decentralized Stochastic learning based on Shapley values [21.376454436691795]
A group of agents collaborate to learn a global model using a distributed dataset without a central server.
The data may be distributed non-independently and identically, and even be noised or poisoned.
We propose ROSS, a robust decentralized learning algorithm based on Shapley values.
arXiv Detail & Related papers (2024-11-01T05:05:15Z)
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI)
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities [15.038649101409804]
We derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables.
We leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents.
This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest.
arXiv Detail & Related papers (2023-12-02T21:10:06Z)
- Robust Online Covariance and Sparse Precision Estimation Under Arbitrary Data Corruption [1.5850859526672516]
We introduce a modified trimmed-inner-product algorithm to robustly estimate the covariance in an online scenario.
We provide error bounds and convergence properties of the estimates to the true precision matrix under our algorithms.
arXiv Detail & Related papers (2023-09-16T05:37:28Z)
- Probabilistic Matching of Real and Generated Data Statistics in Generative Adversarial Networks [0.6906005491572401]
We propose a method to ensure that the distributions of certain generated data statistics coincide with the respective distributions of the real data.
We evaluate the method on a synthetic dataset and a real-world dataset and demonstrate improved performance of our approach.
arXiv Detail & Related papers (2023-06-19T14:03:27Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
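The ATC idea is simple enough to sketch: pick a threshold so that the fraction of labeled source examples falling below it matches the source error rate, then predict target accuracy as the fraction of unlabeled target examples above it. The following is a simplified, hedged illustration of that recipe (the function name and signature are hypothetical, not the authors' code).

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Sketch of Average Thresholded Confidence (ATC):
    choose threshold t so that the fraction of source confidences
    below t equals the source error rate, then predict target
    accuracy as the fraction of target confidences above t."""
    source_conf = np.asarray(source_conf, dtype=float)
    target_conf = np.asarray(target_conf, dtype=float)
    error_rate = 1.0 - float(np.mean(source_correct))
    # The error_rate-quantile of source confidences serves as the threshold.
    t = np.quantile(source_conf, error_rate)
    return float(np.mean(target_conf > t))
```

For example, if a quarter of the labeled source examples are wrong, the threshold sits at the 25th percentile of source confidences, and the predicted target accuracy is just the share of target examples more confident than that.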
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.