Robust Federated Learning against Noisy Clients via Masked Optimization
- URL: http://arxiv.org/abs/2506.02079v1
- Date: Mon, 02 Jun 2025 09:35:42 GMT
- Title: Robust Federated Learning against Noisy Clients via Masked Optimization
- Authors: Xuefeng Jiang, Tian Wen, Zhiqin Yang, Lvhua Wu, Yufeng Chen, Sheng Sun, Yuwei Wang, Min Liu,
- Abstract summary: In this study, we present a two-stage optimization framework, MaskedOptim, to address this intricate label noise problem.<n>The first stage is designed to facilitate the detection of noisy clients with higher label noise rates.<n>The second stage focuses on rectifying the labels of the noisy clients' data through an end-to-end label correction mechanism.
- Score: 13.213042997655169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, federated learning (FL) has made significant advance in privacy-sensitive applications. However, it can be hard to ensure that FL participants provide well-annotated data for training. The corresponding annotations from different clients often contain complex label noise at varying levels. This label noise issue has a substantial impact on the performance of the trained models, and clients with greater noise levels can be largely attributed for this degradation. To this end, it is necessary to develop an effective optimization strategy to alleviate the adverse effects of these noisy clients.In this study, we present a two-stage optimization framework, MaskedOptim, to address this intricate label noise problem. The first stage is designed to facilitate the detection of noisy clients with higher label noise rates. The second stage focuses on rectifying the labels of the noisy clients' data through an end-to-end label correction mechanism, aiming to mitigate the negative impacts caused by misinformation within datasets. This is achieved by learning the potential ground-truth labels of the noisy clients' datasets via backpropagation. To further enhance the training robustness, we apply the geometric median based model aggregation instead of the commonly-used vanilla averaged model aggregation. We implement sixteen related methods and conduct evaluations on three image datasets and one text dataset with diverse label noise patterns for a comprehensive comparison. Extensive experimental results indicate that our proposed framework shows its robustness in different scenarios. Additionally, our label correction framework effectively enhances the data quality of the detected noisy clients' local datasets. % Our codes will be open-sourced to facilitate related research communities. Our codes are available via https://github.com/Sprinter1999/MaskedOptim .
Related papers
- Mitigating Instance-Dependent Label Noise: Integrating Self-Supervised Pretraining with Pseudo-Label Refinement [3.272177633069322]
Real-world datasets often contain noisy labels due to human error, ambiguity, or resource constraints during the annotation process.<n>We propose a novel framework that combines self-supervised learning using SimCLR with iterative pseudo-label refinement.<n>Our approach significantly outperforms several state-of-the-art methods, particularly under high noise conditions.
arXiv Detail & Related papers (2024-12-06T09:56:49Z) - Tackling Noisy Clients in Federated Learning with End-to-end Label Correction [20.64304520865249]
We propose a two-stage framework FedELC to tackle this complicated label noise issue.
The first stage aims to guide the detection of noisy clients with higher label noise.
The second stage aims to correct the labels of noisy clients' data via an end-to-end label correction framework.
arXiv Detail & Related papers (2024-08-08T08:35:32Z) - Federated Learning with Extremely Noisy Clients via Negative
Distillation [70.13920804879312]
Federated learning (FL) has shown remarkable success in cooperatively training deep models, while struggling with noisy labels.
We propose a novel approach, called negative distillation (FedNed) to leverage models trained on noisy clients.
FedNed first identifies noisy clients and employs rather than discards the noisy clients in a knowledge distillation manner.
arXiv Detail & Related papers (2023-12-20T01:59:48Z) - Fine tuning Pre trained Models for Robustness Under Noisy Labels [34.68018860186995]
The presence of noisy labels in a training dataset can significantly impact the performance of machine learning models.
We introduce a novel algorithm called TURN, which robustly and efficiently transfers the prior knowledge of pre-trained models.
arXiv Detail & Related papers (2023-10-24T20:28:59Z) - Neighborhood Collective Estimation for Noisy Label Identification and
Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z) - FedCorr: Multi-Stage Federated Learning for Label Noise Correction [80.9366438220228]
Federated learning (FL) is a privacy-preserving distributed learning paradigm that enables clients to jointly train a global model.
We propose $textttFedCorr$, a general multi-stage framework to tackle heterogeneous label noise in FL.
Experiments conducted on CIFAR-10/100 with federated synthetic label noise, and on a real-world noisy dataset, Clothing1M, demonstrate that $textttFedCorr$ is robust to label noise.
arXiv Detail & Related papers (2022-04-10T12:51:18Z) - S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z) - Training Classifiers that are Universally Robust to All Label Noise
Levels [91.13870793906968]
Deep neural networks are prone to overfitting in the presence of label noise.
We propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning.
Our framework generally outperforms at medium to high noise levels.
arXiv Detail & Related papers (2021-05-27T13:49:31Z) - Tackling Instance-Dependent Label Noise via a Universal Probabilistic
Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.