Byzantine-tolerant distributed learning of finite mixture models
- URL: http://arxiv.org/abs/2407.13980v2
- Date: Mon, 10 Mar 2025 17:31:36 GMT
- Title: Byzantine-tolerant distributed learning of finite mixture models
- Authors: Qiong Zhang, Yan Shuo Tan, Jiahua Chen
- Abstract summary: This paper introduces Distance Filtered Mixture Reduction (DFMR), a Byzantine-tolerant adaptation of Mixture Reduction (MR) that is both computationally efficient and statistically sound. We provide theoretical justification for DFMR, proving its optimal convergence rate and asymptotic equivalence to the global maximum likelihood estimate.
- Score: 16.60734923697257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional statistical methods need to be updated to work with modern distributed data storage paradigms. A common approach is the split-and-conquer framework, which involves learning models on local machines and averaging their parameter estimates. However, this does not work for the important problem of learning finite mixture models, because subpopulation indices on each local machine may be arbitrarily permuted (the "label switching problem"). Zhang and Chen (2022) proposed Mixture Reduction (MR) to address this issue, but MR remains vulnerable to Byzantine failure, whereby a fraction of local machines may transmit arbitrarily erroneous information. This paper introduces Distance Filtered Mixture Reduction (DFMR), a Byzantine-tolerant adaptation of MR that is both computationally efficient and statistically sound. DFMR leverages the densities of local estimates to construct a robust filtering mechanism. By analysing the pairwise L2 distances between local estimates, DFMR identifies and removes severely corrupted local estimates while retaining the majority of uncorrupted ones. We provide theoretical justification for DFMR, proving its optimal convergence rate and asymptotic equivalence to the global maximum likelihood estimate under standard assumptions. Numerical experiments on simulated and real-world data validate the effectiveness of DFMR in achieving robust and accurate aggregation in the presence of Byzantine failure.
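To make the filtering step concrete, here is a minimal 1-D sketch under stated assumptions, not the authors' implementation: each local machine is assumed to report its fitted Gaussian mixture as a (weights, means, sds) triple, the pairwise distances use the closed-form L2 inner product between Gaussian mixture densities, and the median score with a `keep_frac` cutoff is an illustrative filtering rule rather than the paper's exact threshold.

```python
import numpy as np
from scipy.stats import norm

def mix_inner(w1, mu1, s1, w2, mu2, s2):
    # Closed-form inner product <f, g> of two 1-D Gaussian mixtures:
    # <f, g> = sum_ij w1_i * w2_j * N(mu1_i - mu2_j; 0, s1_i^2 + s2_j^2).
    total = 0.0
    for wi, mi, si in zip(w1, mu1, s1):
        for wj, mj, sj in zip(w2, mu2, s2):
            total += wi * wj * norm.pdf(mi - mj, loc=0.0,
                                        scale=np.sqrt(si**2 + sj**2))
    return total

def l2_dist_sq(f, g):
    # Squared L2 distance ||f - g||^2 between mixtures f and g, each a
    # (weights, means, sds) triple of equal-length arrays.
    return mix_inner(*f, *f) + mix_inner(*g, *g) - 2.0 * mix_inner(*f, *g)

def distance_filter(local_fits, keep_frac=0.6):
    # Keep the local estimates whose median pairwise L2 distance to the
    # other machines is smallest; large scores flag likely Byzantine output.
    m = len(local_fits)
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = l2_dist_sq(local_fits[i], local_fits[j])
    # Median over the off-diagonal entries of each row.
    scores = np.array([np.median(np.delete(D[i], i)) for i in range(m)])
    keep = np.argsort(scores)[: max(1, int(keep_frac * m))]
    return [local_fits[i] for i in keep]
```

The surviving estimates would then be merged by the MR aggregation step (component alignment and averaging), which this sketch leaves to the original method.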
Related papers
- A Deep Bayesian Nonparametric Framework for Robust Mutual Information Estimation [9.68824512279232]
Mutual Information (MI) is a crucial measure for capturing dependencies between variables.
We present a solution for training an MI estimator by constructing the MI loss with a finite representation of the Dirichlet process posterior to incorporate regularization.
We explore the application of our estimator in maximizing MI between the data space and the latent space of a variational autoencoder.
arXiv Detail & Related papers (2025-03-11T21:27:48Z)
- Optimal Robust Estimation under Local and Global Corruptions: Stronger Adversary and Smaller Error [10.266928164137635]
Algorithmic robust statistics has traditionally focused on the contamination model where a small fraction of the samples are arbitrarily corrupted.
We consider a recent contamination model that combines two kinds of corruptions: (i) a small fraction of arbitrary outliers, as in classical robust statistics, and (ii) local perturbations, where samples may undergo bounded shifts on average.
We show that information-theoretically optimal error can indeed be achieved in polynomial time, under an even stronger local perturbation model.
arXiv Detail & Related papers (2024-10-22T17:51:23Z)
- Kolmogorov-Smirnov GAN [52.36633001046723]
We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN).
Unlike existing approaches, KSGAN formulates the learning process as a minimization of the Kolmogorov-Smirnov (KS) distance.
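As background, the two-sample KS statistic the method's name refers to can be computed as below; this is a generic sketch of the statistic itself, not the KSGAN training objective, which requires a differentiable surrogate.

```python
import numpy as np

def ks_distance(x, y):
    # Two-sample Kolmogorov-Smirnov statistic: sup_t |F_x(t) - F_y(t)|,
    # where F_x and F_y are the empirical CDFs of samples x and y.
    x, y = np.sort(x), np.sort(y)
    grid = np.concatenate([x, y])          # candidate locations of the sup
    Fx = np.searchsorted(x, grid, side="right") / len(x)
    Fy = np.searchsorted(y, grid, side="right") / len(y)
    return np.abs(Fx - Fy).max()
```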
arXiv Detail & Related papers (2024-06-28T14:30:14Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Interval Abstractions for Robust Counterfactual Explanations [15.954944873701503]
Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research.
Existing methods often become invalid when slight changes occur in the parameters of the model they were generated for.
We propose a novel interval abstraction technique for machine learning models, which allows us to obtain provable robustness guarantees.
arXiv Detail & Related papers (2024-04-21T18:24:34Z)
- Robust Estimation of the Tail Index of a Single Parameter Pareto Distribution from Grouped Data [0.0]
This paper introduces a novel robust estimation technique, the Method of Truncated Moments (MTuM).
Inferential justification of MTuM is established by employing the central limit theorem and validating it through a comprehensive simulation study.
arXiv Detail & Related papers (2024-01-26T01:42:06Z)
- Federated Learning Robust to Byzantine Attacks: Achieving Zero Optimality Gap [21.50616436951285]
We propose a robust aggregation method for federated learning (FL) that can effectively tackle malicious Byzantine attacks.
At each user, the model parameters are updated over multiple local steps, with the number of steps adjustable across iterations, and then pushed directly to the aggregation center.
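For intuition, one standard Byzantine-robust aggregator in this setting is the coordinate-wise trimmed mean; the sketch below is a generic illustration and is not claimed to be the aggregation rule proposed in this paper.

```python
import numpy as np

def trimmed_mean(updates, trim_frac=0.1):
    # Coordinate-wise trimmed mean: in each coordinate, drop the trim_frac
    # smallest and largest values across users, then average the remainder.
    U = np.stack(updates)                          # shape (num_users, dim)
    k = min(int(trim_frac * U.shape[0]), (U.shape[0] - 1) // 2)
    S = np.sort(U, axis=0)                         # sort each coordinate
    return S[k:U.shape[0] - k].mean(axis=0)
```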
arXiv Detail & Related papers (2023-08-21T02:43:38Z)
- DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning [52.83811558753284]
Decentralized learning (DFL) discards the central server and establishes a decentralized communication network.
Existing DFL methods still suffer from two major challenges: local inconsistency and local overfitting.
arXiv Detail & Related papers (2023-08-16T11:22:36Z)
- Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery [4.446017969073817]
We show empirical success in terms of accuracy and robustness to noise with a bootstrapping-based sequential-thresholding least-squares estimator.
We show that this bootstrapping-based ensembling technique can perform provably correct variable selection, with an error rate that converges exponentially.
arXiv Detail & Related papers (2023-01-30T04:07:59Z)
- Scalable Dynamic Mixture Model with Full Covariance for Probabilistic Traffic Forecasting [14.951166842027819]
We propose a dynamic mixture of zero-mean Gaussian distributions for the time-varying error process.
The proposed method can be seamlessly integrated into existing deep-learning frameworks with only a few additional parameters to be learned.
We evaluate the proposed method on a traffic speed forecasting task and find that it not only improves predictive performance but also provides interpretable temporal correlation structures.
arXiv Detail & Related papers (2022-12-10T22:50:00Z)
- Security-Preserving Federated Learning via Byzantine-Sensitive Triplet Distance [10.658882342481542]
Federated learning (FL) is generally vulnerable to Byzantine attacks from adversarial edge devices.
We propose an effective Byzantine-robust FL framework, namely dummy contrastive aggregation.
We show improved performance as compared to the state-of-the-art Byzantine-resilient aggregation methods.
arXiv Detail & Related papers (2022-10-29T07:20:02Z)
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- Bayesian Evidential Learning for Few-Shot Classification [22.46281648187903]
Few-Shot Classification aims to generalize from base classes to novel classes given very limited labeled samples.
State-of-the-art solutions involve learning to find a good metric and representation space to compute the distance between samples.
Despite the promising accuracy performance, how to model uncertainty for metric-based FSC methods effectively is still a challenge.
arXiv Detail & Related papers (2022-07-19T03:58:00Z)
- End-to-End Multi-Object Detection with a Regularized Mixture Model [26.19278003378703]
Recent end-to-end multi-object detectors simplify the inference pipeline by removing hand-crafted processes.
We propose a novel framework to train an end-to-end multi-object detector consisting of only two terms: negative log-likelihood (NLL) and a regularization term.
arXiv Detail & Related papers (2022-05-18T04:20:23Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Smoothed Embeddings for Certified Few-Shot Learning [63.68667303948808]
We extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings.
Our results are confirmed by experiments on different datasets.
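A minimal Monte-Carlo sketch of the idea follows, with `encode` a hypothetical encoder and the noise scale and sample count illustrative choices:

```python
import numpy as np

def smoothed_embedding(encode, x, sigma=0.25, n_samples=100, seed=0):
    # Randomized smoothing for an embedding map: average the normalized
    # embeddings of Gaussian-perturbed copies of the input x.
    rng = np.random.default_rng(seed)
    zs = []
    for _ in range(n_samples):
        z = encode(x + sigma * rng.standard_normal(x.shape))
        zs.append(z / np.linalg.norm(z))   # project onto the unit sphere
    return np.mean(zs, axis=0)
```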
arXiv Detail & Related papers (2022-02-02T18:19:04Z)
- Entropy Minimizing Matrix Factorization [102.26446204624885]
Nonnegative Matrix Factorization (NMF) is a widely-used data analysis technique, and has yielded impressive results in many real-world tasks.
In this study, an Entropy Minimizing Matrix Factorization framework (EMMF) is developed to tackle the above problem.
Considering that outliers are usually far fewer than normal samples, a new entropy loss function is established for matrix factorization.
arXiv Detail & Related papers (2021-03-24T21:08:43Z)
- Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- Modal Regression based Structured Low-rank Matrix Recovery for Multi-view Learning [70.57193072829288]
Low-rank Multi-view Subspace Learning has shown great potential in cross-view classification in recent years.
Existing LMvSL-based methods are incapable of handling view discrepancy and discriminancy well simultaneously.
We propose Structured Low-rank Matrix Recovery (SLMR), a unique method of effectively removing view discrepancy and improving discriminancy.
arXiv Detail & Related papers (2020-03-22T03:57:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.