Variance Reduced Median-of-Means Estimator for Byzantine-Robust
Distributed Inference
- URL: http://arxiv.org/abs/2103.02860v1
- Date: Thu, 4 Mar 2021 06:50:52 GMT
- Title: Variance Reduced Median-of-Means Estimator for Byzantine-Robust
Distributed Inference
- Authors: Jiyuan Tu, Weidong Liu, Xiaojun Mao, and Xi Chen
- Abstract summary: This paper develops an efficient distributed inference algorithm, which is robust against a moderate fraction of Byzantine nodes.
To the best of our knowledge, this is the first normality result in the setting of Byzantine-robust distributed learning.
- Score: 9.983824799552774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper develops an efficient distributed inference algorithm, which is
robust against a moderate fraction of Byzantine nodes, namely arbitrary and
possibly adversarial machines in a distributed learning system. In robust
statistics, the median-of-means (MOM) has been a popular approach to hedge
against Byzantine failures due to its ease of implementation and computational
efficiency. However, the MOM estimator falls short in terms of statistical
efficiency. The first main contribution of the paper is to propose
a variance reduced median-of-means (VRMOM) estimator, which improves the
statistical efficiency over the vanilla MOM estimator and is computationally as
efficient as the MOM. Based on the proposed VRMOM estimator, we develop a
general distributed inference algorithm that is robust against Byzantine
failures. Theoretically, our distributed algorithm achieves a fast convergence
rate with only a constant number of rounds of communications. We also provide
the asymptotic normality result for the purpose of statistical inference. To
the best of our knowledge, this is the first normality result in the setting of
Byzantine-robust distributed learning. The simulation results are also
presented to illustrate the effectiveness of our method.
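For intuition, the vanilla MOM aggregation that the abstract contrasts with VRMOM can be sketched as below. This is a minimal single-process simulation with illustrative names (`median_of_means`, the block count of 20); it is not the paper's VRMOM algorithm, whose variance-reduction step is not shown. Each block plays the role of one machine, so a Byzantine machine corrupts an entire block.

```python
import numpy as np

def median_of_means(samples, num_blocks):
    """Split the samples into num_blocks blocks (one per machine in the
    distributed setting), average each block, and return the median of
    the block means. The median discards blocks ruined by outliers."""
    blocks = np.array_split(np.asarray(samples), num_blocks)
    block_means = np.array([block.mean() for block in blocks])
    return np.median(block_means)

rng = np.random.default_rng(0)
clean = rng.normal(loc=1.0, scale=1.0, size=900)   # honest machines' data
corrupted = np.full(100, 1e6)                      # two Byzantine machines
data = np.concatenate([clean, corrupted])          # corruption is block-aligned

print(data.mean())                  # plain mean: dominated by the outliers
print(median_of_means(data, 20))    # MOM: stays close to the true mean 1.0
```

Because the 100 corrupted values fill only 2 of the 20 blocks, the median of the block means ignores them, while the plain sample mean is pulled far from the true value of 1.0.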
Related papers
- Beyond Sin-Squared Error: Linear-Time Entrywise Uncertainty Quantification for Streaming PCA [5.749787074942513]
We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm.
We derive a sharp Bernstein-type concentration bound for elements of the estimated vector matching the optimal error rate up to logarithmic factors.
To efficiently estimate the coordinate-wise variance, we introduce a provably consistent subsampling algorithm.
arXiv Detail & Related papers (2025-06-14T22:50:54Z)
- Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings [24.07815507403025]
Estimating the distribution of outcomes under counterfactual policies is critical for decision-making in domains such as recommendation, advertising, and healthcare.
We analyze a novel framework, Counterfactual Policy Mean Embedding (CPME), that represents the entire counterfactual outcome distribution in a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2025-06-03T12:16:46Z)
- Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances [2.317910166616341]
In statistical inference, we commonly assume that samples are independent and identically distributed from a probability distribution.
In this paper, we explore the application of GAN-like estimators to a general class of statistical models.
arXiv Detail & Related papers (2024-10-28T05:50:47Z)
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z)
- Scalable Bayesian inference for the generalized linear mixed model [2.45365913654612]
We introduce a statistical inference algorithm at the intersection of AI and Bayesian inference.
Our algorithm is an extension of gradient MCMC with novel contributions that address the treatment of correlated data.
We apply our algorithm to a large electronic health records database.
arXiv Detail & Related papers (2024-03-05T14:35:34Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- MFABA: A More Faithful and Accelerated Boundary-based Attribution Method for Deep Neural Networks [69.28125286491502]
We introduce MFABA, an attribution algorithm that adheres to axioms.
Results demonstrate its superiority, achieving more than 101.5142 times the speed of state-of-the-art attribution algorithms.
arXiv Detail & Related papers (2023-12-21T07:48:15Z)
- Distributed Semi-Supervised Sparse Statistical Inference [6.685997976921953]
A debiased estimator is a crucial tool in statistical inference for high-dimensional model parameters.
Traditional methods require computing a debiased estimator on every machine.
An efficient multi-round distributed debiased estimator, which integrates both labeled and unlabelled data, is developed.
arXiv Detail & Related papers (2023-06-17T17:30:43Z)
- Fast and Robust Sparsity Learning over Networks: A Decentralized Surrogate Median Regression Approach [10.850336820582678]
We propose a decentralized surrogate median regression (deSMR) method for efficiently solving the decentralized sparsity learning problem.
Our proposed algorithm enjoys a linear convergence rate with a simple implementation.
We also establish the theoretical results for sparse support recovery.
arXiv Detail & Related papers (2022-02-11T08:16:01Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization [69.07420650261649]
We introduce a novel, simple, and powerful contrastive MI estimator named FLO.
Empirically, our FLO estimator overcomes the limitations of its predecessors and learns more efficiently.
The utility of FLO is verified using an extensive set of benchmarks, which also reveals the trade-offs in practical MI estimation.
arXiv Detail & Related papers (2021-07-02T15:20:41Z)
- Communication-efficient Byzantine-robust distributed learning with statistical guarantee [2.8407238889624877]
Communication efficiency and robustness are two major issues in modern distributed learning frameworks.
This paper develops two communication-efficient and robust distributed learning algorithms for convex problems.
arXiv Detail & Related papers (2021-02-28T01:38:37Z)
- Estimating Average Treatment Effects with Support Vector Machines [77.34726150561087]
The support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature.
We adapt SVM as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups.
We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods.
arXiv Detail & Related papers (2021-02-23T20:22:56Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently.
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.