BOBA: Byzantine-Robust Federated Learning with Label Skewness
- URL: http://arxiv.org/abs/2208.12932v2
- Date: Wed, 20 Mar 2024 02:11:56 GMT
- Title: BOBA: Byzantine-Robust Federated Learning with Label Skewness
- Authors: Wenxuan Bao, Jun Wu, Jingrui He
- Abstract summary: In federated learning, most existing robust aggregation rules (AGRs) combat Byzantine attacks in the IID setting.
We address label skewness, a more realistic and challenging non-IID setting, where each client only has access to a few classes of data.
In this setting, state-of-the-art AGRs suffer from selection bias, leading to a significant performance drop for particular classes.
We propose an efficient two-stage method named BOBA to address these limitations.
- Score: 39.75185862573534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In federated learning, most existing robust aggregation rules (AGRs) combat Byzantine attacks in the IID setting, where client data is assumed to be independent and identically distributed. In this paper, we address label skewness, a more realistic and challenging non-IID setting, where each client only has access to a few classes of data. In this setting, state-of-the-art AGRs suffer from selection bias, leading to a significant performance drop for particular classes; they are also more vulnerable to Byzantine attacks due to the increased variation among gradients of honest clients. To address these limitations, we propose an efficient two-stage method named BOBA. Theoretically, we prove the convergence of BOBA with an error of the optimal order. Our empirical evaluations demonstrate BOBA's superior unbiasedness and robustness across diverse models and datasets when compared to various baselines. Our code is available at https://github.com/baowenxuan/BOBA.
Related papers
- No Query, No Access [50.18709429731724]
We introduce the Victim Data-based Adversarial Attack (VDBA), which operates using only victim texts.
To prevent access to the victim model, we create a shadow dataset with publicly available pre-trained models and clustering methods.
Experiments on the Emotion and SST5 datasets show that VDBA outperforms state-of-the-art methods, achieving an ASR improvement of 52.08%.
arXiv Detail & Related papers (2025-05-12T06:19:59Z) - Bayesian Robust Aggregation for Federated Learning [42.29248343585333]
Federated Learning enables collaborative training of machine learning models on decentralized data.
However, it is vulnerable to adversarial attacks, in which some of the clients submit corrupted model updates.
We propose an adaptive approach for robust aggregation of model updates based on Bayesian inference.
arXiv Detail & Related papers (2025-05-05T09:16:43Z) - Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness [53.96714099151378]
We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models.
Our method improves its two key components: minority samples identification and the robust training algorithm.
Our theoretical analysis shows that our PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
arXiv Detail & Related papers (2025-03-12T15:46:12Z) - LOB-Bench: Benchmarking Generative AI for Finance - an Application to Limit Order Book Data [7.317765812144531]
We present a benchmark designed to evaluate the quality and realism of generative message-by-order data for limit order books (LOB).
Our framework measures distributional differences in conditional and unconditional statistics between generated and real LOB data.
The benchmark also includes commonly used LOB statistics such as spread, order book volumes, order imbalance, and message inter-arrival times.
arXiv Detail & Related papers (2025-02-13T10:56:58Z) - Class Balance Matters to Active Class-Incremental Learning [61.11786214164405]
We aim to start from a pool of large-scale unlabeled data and then annotate the most informative samples for incremental learning.
We propose a Class-Balanced Selection (CBS) strategy to achieve both class balance and informativeness in the chosen samples.
Our CBS can be plugged into CIL methods that are based on pretrained models with prompt tuning techniques.
arXiv Detail & Related papers (2024-12-09T16:37:27Z) - Weakly Contrastive Learning via Batch Instance Discrimination and Feature Clustering for Small Sample SAR ATR [7.2932563202952725]
We propose a novel framework named Batch Instance Discrimination and Feature Clustering (BIDFC).
In this framework, embedding distance between samples should be moderate because of the high similarity between samples in the SAR images.
Experimental results on the moving and stationary target acquisition and recognition (MSTAR) database indicate a 91.25% classification accuracy of our method fine-tuned on only 3.13% training data.
arXiv Detail & Related papers (2024-08-07T08:39:33Z) - ADBA:Approximation Decision Boundary Approach for Black-Box Adversarial Attacks [6.253823500300899]
Black-box attacks are stealthy, generating adversarial examples using hard labels from machine learning models.
This paper introduces a novel approach using the Approximation Decision Boundary (ADB) to efficiently and accurately compare perturbation directions.
The effectiveness of our ADB approach (ADBA) hinges on promptly identifying suitable ADB, ensuring reliable differentiation of all perturbation directions.
arXiv Detail & Related papers (2024-06-07T15:09:25Z) - Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC).
arXiv Detail & Related papers (2024-04-24T02:22:50Z) - Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z) - JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z) - Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and biasness of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate superiority of FedVRA over the existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z) - Suppressing Poisoning Attacks on Federated Learning for Medical Imaging [4.433842217026879]
We propose a robust aggregation rule called Distance-based Outlier Suppression (DOS) that is resilient to Byzantine failures.
The proposed method computes the distance between local parameter updates of different clients and obtains an outlier score for each client.
The resulting outlier scores are converted into normalized weights using a softmax function, and a weighted average of the local parameters is used for updating the global model.
arXiv Detail & Related papers (2022-07-15T00:43:34Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Training image classifiers using Semi-Weak Label Data [26.04162590798731]
In Multiple Instance Learning (MIL), weak labels are provided at the bag level with only presence/absence information known.
This paper introduces a novel semi-weak label learning paradigm as a middle ground to mitigate the problem.
We propose a two-stage framework to address the problem of learning from semi-weak labels.
arXiv Detail & Related papers (2021-03-19T03:06:07Z)
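The Distance-based Outlier Suppression (DOS) entry in the list above describes a pipeline of pairwise distances, per-client outlier scores, softmax weights, and a weighted average. The following is a minimal sketch of that general idea, not the DOS authors' implementation: the scoring rule (mean Euclidean distance to the other clients) and the function name `distance_weighted_average` are assumptions made for illustration, since the entry does not specify how the outlier score is computed.

```python
import math
from typing import List


def distance_weighted_average(updates: List[List[float]]) -> List[float]:
    """Aggregate client updates, down-weighting clients far from the rest.

    Hypothetical helper for illustration only; the outlier score used here
    is an assumption, not the scoring rule of any published method.
    """
    n = len(updates)
    if n < 2:
        raise ValueError("need at least two client updates")

    # Assumed outlier score: mean Euclidean distance to every other client.
    scores = []
    for i in range(n):
        total = sum(
            math.dist(updates[i], updates[j]) for j in range(n) if j != i
        )
        scores.append(total / (n - 1))

    # Softmax over negated scores, so a larger distance gives a smaller weight.
    # Subtracting the maximum keeps the exponentials numerically stable.
    shift = max(-s for s in scores)
    exps = [math.exp(-s - shift) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    # Weighted average of the updates, coordinate by coordinate.
    dim = len(updates[0])
    return [sum(w * u[k] for w, u in zip(weights, updates)) for k in range(dim)]


# Three similar updates and one extreme update.
client_updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [100.0, -100.0]]
aggregate = distance_weighted_average(client_updates)
```

In this toy input the extreme update receives a weight close to zero, so the aggregate stays near the three similar updates, whereas a plain mean would be pulled far away. Under label skewness, honest updates can themselves differ substantially, which is the setting where the BOBA abstract reports that existing aggregation rules suffer from selection bias.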
This list is automatically generated from the titles and abstracts of the papers in this site.