Variance Reduced Median-of-Means Estimator for Byzantine-Robust
Distributed Inference
- URL: http://arxiv.org/abs/2103.02860v1
- Date: Thu, 4 Mar 2021 06:50:52 GMT
- Authors: Jiyuan Tu, Weidong Liu, Xiaojun Mao, and Xi Chen
- Abstract summary: This paper develops an efficient distributed inference algorithm, which is robust against a moderate fraction of Byzantine nodes.
To the best of our knowledge, this is the first normality result in the setting of Byzantine-robust distributed learning.
- Score: 9.983824799552774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper develops an efficient distributed inference algorithm, which is
robust against a moderate fraction of Byzantine nodes, namely arbitrary and
possibly adversarial machines in a distributed learning system. In robust
statistics, the median-of-means (MOM) has been a popular approach to hedge
against Byzantine failures due to its ease of implementation and computational
efficiency. However, the MOM estimator suffers from a loss of statistical
efficiency. The first main contribution of this paper is to propose
a variance reduced median-of-means (VRMOM) estimator, which improves the
statistical efficiency over the vanilla MOM estimator and is computationally as
efficient as the MOM. Based on the proposed VRMOM estimator, we develop a
general distributed inference algorithm that is robust against Byzantine
failures. Theoretically, our distributed algorithm achieves a fast convergence
rate with only a constant number of communication rounds. We also provide
the asymptotic normality result for the purpose of statistical inference. To
the best of our knowledge, this is the first normality result in the setting of
Byzantine-robust distributed learning. The simulation results are also
presented to illustrate the effectiveness of our method.
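The abstract describes the median-of-means (MOM) estimator as the basic robust building block: split the sample into blocks, average each block, and take the median of the block means. A minimal sketch for a scalar mean (the function name and corruption setup below are illustrative, not from the paper):

```python
import numpy as np

def median_of_means(x, k):
    """Median-of-means estimate of the mean of a 1-D sample.

    Splits the n observations into k roughly equal disjoint blocks,
    averages each block, and returns the median of the k block means.
    The median step makes the estimate robust to a minority of
    corrupted blocks (e.g. Byzantine machines each holding one block).
    """
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, k)
    block_means = np.array([b.mean() for b in blocks])
    return np.median(block_means)

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.0, size=1000)
sample[:50] = 1e6                      # corrupt 5% of the data (one block)
print(median_of_means(sample, k=20))   # close to the true mean of 2
print(sample.mean())                   # ruined by the corrupted points
```

In the distributed setting of the paper, each machine plays the role of one block, so a minority of Byzantine machines can corrupt only their own block means; the proposed VRMOM estimator then refines this robust aggregate to recover statistical efficiency.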
Related papers
- Scalable Bayesian inference for the generalized linear mixed model [2.45365913654612]
We introduce a statistical inference algorithm at the intersection of AI and Bayesian inference.
Our algorithm is an extension of gradient MCMC with novel contributions that address the treatment of correlated data.
We apply our algorithm to a large electronic health records database.
arXiv Detail & Related papers (2024-03-05T14:35:34Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- MFABA: A More Faithful and Accelerated Boundary-based Attribution Method for Deep Neural Networks [69.28125286491502]
We introduce MFABA, an attribution algorithm that adheres to axioms.
Results demonstrate its superiority, achieving more than 101.5142 times the speed of state-of-the-art attribution algorithms.
arXiv Detail & Related papers (2023-06-17T17:30:43Z)
- Distributed Semi-Supervised Sparse Statistical Inference [6.685997976921953]
A debiased estimator is a crucial tool in statistical inference for high-dimensional model parameters.
Traditional methods require computing a debiased estimator on every machine.
An efficient multi-round distributed debiased estimator, which integrates both labeled and unlabeled data, is developed.
arXiv Detail & Related papers (2022-02-11T08:16:01Z)
- Fast and Robust Sparsity Learning over Networks: A Decentralized Surrogate Median Regression Approach [10.850336820582678]
We propose a decentralized surrogate median regression (deSMR) method for efficiently solving the decentralized sparsity learning problem.
Our proposed algorithm enjoys a linear convergence rate with a simple implementation.
We also establish the theoretical results for sparse support recovery.
arXiv Detail & Related papers (2022-02-08T15:21:56Z)
- Distribution Regression with Sliced Wasserstein Kernels [45.916342378789174]
We propose the first optimal transport (OT) based estimator for distribution regression.
We study the theoretical properties of a kernel ridge regression estimator based on such representation.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-07-02T15:20:41Z)
- Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization [69.07420650261649]
We introduce a novel, simple, and powerful contrastive MI estimator named FLO.
Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently.
The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
arXiv Detail & Related papers (2021-02-28T01:38:37Z)
- Communication-efficient Byzantine-robust distributed learning with statistical guarantee [2.8407238889624877]
Communication efficiency and robustness are two major issues in modern distributed learning frameworks.
This paper develops two communication-efficient and robust distributed learning algorithms for convex problems.
arXiv Detail & Related papers (2021-02-23T20:22:56Z)
- Estimating Average Treatment Effects with Support Vector Machines [77.34726150561087]
The support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature.
We adapt SVM as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups.
We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods.
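The entry above describes minimizing the maximum mean discrepancy (MMD) between the treatment and control groups. As background on that quantity, a minimal biased (V-statistic) estimate of the squared MMD under an RBF kernel (the function name and data setup are illustrative, not from the paper):

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased (V-statistic) estimate of the squared maximum mean
    discrepancy between samples X and Y under the RBF kernel
    k(a, b) = exp(-gamma * ||a - b||^2)."""
    def gram(A, B):
        # Pairwise squared Euclidean distances via broadcasting.
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dists)
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

rng = np.random.default_rng(1)
treated = rng.normal(0.0, 1.0, size=(200, 2))
control = rng.normal(0.0, 1.0, size=(200, 2))
shifted = rng.normal(3.0, 1.0, size=(200, 2))
print(rbf_mmd2(treated, control))  # near zero: same distribution
print(rbf_mmd2(treated, shifted))  # clearly larger: distributions differ
```

A small MMD indicates the two groups are well balanced in the kernel's feature space, which is the balance criterion the kernel-weighting procedure drives toward zero.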
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently with the same solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.