FedGA: Federated Learning with Gradient Alignment for Error Asymmetry Mitigation
- URL: http://arxiv.org/abs/2412.16582v1
- Date: Sat, 21 Dec 2024 11:15:20 GMT
- Title: FedGA: Federated Learning with Gradient Alignment for Error Asymmetry Mitigation
- Authors: Chenguang Xiao, Zheming Zuo, Shuo Wang
- Abstract summary: Federated learning (FL) triggers intra-client and inter-client class imbalance.
We propose a gradient alignment (GA)-informed FL method, dubbed FedGA.
- Score: 5.3663750040721085
- License:
- Abstract: Federated learning (FL) triggers intra-client and inter-client class imbalance, with the latter, unlike the former, leading to biased client updates and thus deteriorating the distributed models. Such a bias is exacerbated during the server aggregation phase and has yet to be effectively addressed by conventional re-balancing methods. To this end, different from off-the-shelf label- or loss-based approaches, we propose a gradient alignment (GA)-informed FL method, dubbed FedGA, where the importance of error asymmetry (EA) in bias is observed and its linkage to the gradient of the loss with respect to raw logits is explored. Concretely, GA, implemented by label calibration during the model backpropagation process, prevents catastrophic forgetting of rare and missing classes, hence boosting model convergence and accuracy. Experimental results on five benchmark datasets demonstrate that GA outperforms the pioneering counterpart FedAvg and its four variants in minimizing EA and update bias, accordingly yielding higher F1 score and accuracy margins when the Dirichlet distribution sampling factor $\alpha$ increases. The code and more details are available at \url{https://anonymous.4open.science/r/FedGA-B052/README.md}.
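For softmax cross-entropy, the gradient of the loss with respect to the raw logits is softmax(z) minus the target vector, so classes absent from a client receive only negative gradient pressure; that is one source of the error asymmetry described above. The snippet below is a minimal, hypothetical sketch of label calibration in that spirit. It is not FedGA's actual rule (see the linked repository), and it only treats the missing-class case:

```python
import torch
import torch.nn.functional as F

def calibrated_targets(labels, class_counts, eps=0.1):
    """Hypothetical label calibration (NOT FedGA's actual rule): keep a
    small floor of target mass on classes absent from this client, so the
    cross-entropy gradient softmax(z) - target stops driving their logits
    uniformly downward, damping forgetting of missing classes."""
    num_classes = class_counts.numel()
    onehot = F.one_hot(labels, num_classes).float()
    missing = (class_counts == 0).float()      # 1 for locally absent classes
    if missing.sum() == 0:
        return onehot
    floor = eps * missing / missing.sum()      # total mass eps over absent classes
    return (1.0 - eps) * onehot + floor

def soft_ce(logits, targets):
    # Cross-entropy against soft targets; its logit gradient is softmax(logits) - targets.
    return -(targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
```

Training a client with soft_ce(model(x), calibrated_targets(y, local_class_counts)) keeps a small positive target on absent classes; handling rare (rather than fully missing) classes would need a count-dependent floor instead of this binary one.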
Related papers
- Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) has emerged as a decentralized paradigm for training models while preserving privacy.
We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution; a toy sketch of the clustering idea follows this entry.
Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z)
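The FedGWC entry above gives only the high-level idea. As a toy illustration of clustering clients by data distribution with Gaussian (RBF) weights, one might run a soft k-means over per-client label histograms; everything here (names, parameters) is an assumption, and FedGWC's interaction-aware weighting is substantially more involved:

```python
import numpy as np

def cluster_clients(label_hists, n_clusters=3, sigma=0.5, n_iter=20, seed=0):
    """Toy soft k-means over per-client label histograms with Gaussian (RBF)
    weights. Conveys only the general 'cluster clients by data distribution'
    idea; FedGWC's interaction-aware weighting is more involved."""
    rng = np.random.default_rng(seed)
    centers = label_hists[rng.choice(len(label_hists), n_clusters, replace=False)]
    for _ in range(n_iter):
        d2 = ((label_hists[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (N, K)
        w = np.exp(-d2 / (2 * sigma ** 2))       # Gaussian affinity to each center
        w /= w.sum(1, keepdims=True) + 1e-12     # soft assignments
        centers = (w.T @ label_hists) / (w.sum(0)[:, None] + 1e-12)
    return w.argmax(1)                           # hard cluster index per client
```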
- FedImpro: Measuring and Improving Client Update in Federated Learning [77.68805026788836]
Federated Learning (FL) models often experience client drift caused by heterogeneous data.
We present an alternative perspective on client drift and aim to mitigate it by generating improved local models.
arXiv Detail & Related papers (2024-02-10T18:14:57Z)
- Generalized Logit Adjustment: Calibrating Fine-tuned Models by Removing Label Bias in Foundation Models [75.9543301303586]
Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data.
Fine-tuning and ensembling are also commonly adopted to better fit the downstream tasks.
However, we argue that prior work has overlooked the inherent biases in foundation models.
arXiv Detail & Related papers (2023-10-12T08:01:11Z)
- End-to-End Supervised Multilabel Contrastive Learning [38.26579519598804]
Multilabel representation learning is recognized as a challenging problem that can be associated with either label dependencies between object categories or data-related issues.
Recent advances address these challenges from model- and data-centric viewpoints.
We propose a new end-to-end training framework -- dubbed KMCL -- to address the shortcomings of both model- and data-centric designs.
arXiv Detail & Related papers (2023-07-08T12:46:57Z)
- Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration [18.376601653387315]
Long-tailed semi-supervised learning (LTSSL) represents a practical scenario for semi-supervised applications.
This problem is often aggravated by discrepancies between labeled and unlabeled class distributions.
We introduce Flexible Distribution Alignment (FlexDA), a novel adaptive logit-adjusted loss framework.
arXiv Detail & Related papers (2023-06-07T17:50:59Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose ReScore, a model-agnostic framework that boosts causal discovery performance by dynamically learning adaptive weights for a reweighted score function.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Federated Learning with Label Distribution Skew via Logits Calibration [26.98248192651355]
In this paper, we investigate the label distribution skew in FL, where the distribution of labels varies across clients.
We propose FedLC, which calibrates the logits before softmax cross-entropy according to the probability of occurrence of each class; a sketch of this style of calibration follows this entry.
Experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model.
arXiv Detail & Related papers (2022-09-01T02:56:39Z)
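FedLC's published calibration is count-based; the sketch below instead uses the common logit-adjustment form, adding log class priors to the logits before cross-entropy, which captures the same intent of de-biasing local updates toward rare classes. Names and the tau parameter are assumptions, not the paper's API:

```python
import torch
import torch.nn.functional as F

def calibrated_ce(logits, labels, class_counts, tau=1.0):
    """Prior-aware logit calibration before softmax cross-entropy: adding
    tau * log(prior) to the logits enlarges the effective margin for rare
    classes during training. FedLC's published calibration is related but
    not identical to this form; names and tau are assumptions."""
    prior = class_counts.float().clamp(min=1)   # avoid log(0) for absent classes
    prior = prior / prior.sum()
    adjusted = logits + tau * torch.log(prior)  # broadcasts over the batch
    return F.cross_entropy(adjusted, labels)
```

At inference the raw, unadjusted logits are used, so the adjustment only reshapes the training gradients.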
- Posterior Re-calibration for Imbalanced Datasets [33.379680556475314]
Neural networks can perform poorly when the training label distribution is heavily imbalanced.
We derive a post-training prior rebalancing technique that can be solved through a KL-divergence-based optimization; a minimal closed-form sketch follows this entry.
Our results on six different datasets and five different architectures show state-of-the-art accuracy.
arXiv Detail & Related papers (2020-10-22T15:57:14Z)
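The paper above solves for the re-calibration via a KL-divergence objective; the snippet below shows only the simplest closed-form special case of post-hoc prior rebalancing (divide out the training prior, multiply in the target prior, renormalize). The tau temperature and all names are illustrative assumptions:

```python
import torch

def rebalance_posterior(probs, train_prior, target_prior, tau=1.0):
    """Closed-form post-hoc prior correction: divide out the skewed training
    prior, multiply in the desired target prior, renormalize. The paper
    instead solves for the correction via a KL-divergence objective; tau
    and all names here are illustrative assumptions."""
    w = (target_prior / train_prior.clamp(min=1e-12)) ** tau
    adjusted = probs * w                              # broadcasts over the batch
    return adjusted / adjusted.sum(dim=-1, keepdim=True)
```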
- Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks [86.88061841975482]
We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth-order oracle.
We use this setting to find fast one-step adversarial attacks, akin to a black-box version of the Fast Gradient Sign Method (FGSM); a zeroth-order sketch follows this entry.
We show that the method uses fewer queries and achieves higher attack success rates than the current state of the art.
arXiv Detail & Related papers (2020-10-08T18:36:51Z)
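As a sketch of the one-step black-box idea: estimate the loss gradient by finite differences along random directions (the zeroth-order oracle), then take a single FGSM-style sign step. The paper's contribution is sampling those directions from a Gaussian-MRF covariance model rather than the i.i.d. noise used here; all names are assumptions:

```python
import torch

def zo_fgsm(loss_fn, x, eps=8 / 255, sigma=1e-3, n_samples=64):
    """One-step black-box attack sketch: estimate the loss gradient by
    finite differences along random Gaussian directions (the zeroth-order
    oracle), then take a single FGSM-style sign step. The paper samples
    directions from a learned Gaussian-MRF covariance instead of the
    i.i.d. noise used here; all names are assumptions."""
    with torch.no_grad():
        grad = torch.zeros_like(x)
        base = loss_fn(x)                            # one oracle query
        for _ in range(n_samples):
            u = torch.randn_like(x)
            grad += (loss_fn(x + sigma * u) - base) / sigma * u
        grad /= n_samples
        return (x + eps * grad.sign()).clamp(0, 1)   # stay in valid pixel range
```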
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance; a sketch of the basic complementary-label surrogate follows this entry.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)
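For context, in learning with complementary labels the annotation says which class an example does not belong to, and the most basic surrogate simply minimizes the probability assigned to that class. The SCL framework in the paper refines this family (trading the zero bias of unbiased risk estimators for lower gradient variance); the snippet below is only the basic surrogate, with assumed names:

```python
import torch
import torch.nn.functional as F

def complementary_loss(logits, comp_labels):
    """Basic surrogate for complementary labels: the annotation says which
    class the example is NOT, so minimize -log(1 - p_cbar). The paper's SCL
    framework refines this family for lower gradient variance; names here
    are assumptions."""
    p = F.softmax(logits, dim=-1)
    p_comp = p.gather(1, comp_labels.unsqueeze(1)).squeeze(1)
    return -torch.log((1.0 - p_comp).clamp(min=1e-12)).mean()
```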
- Semi-supervised Learning Meets Factorization: Learning to Recommend with Chain Graph Model [16.007141894770054]
The latent factor model (LFM) has been drawing much attention in recommender systems due to its good performance and scalability.
Semi-supervised learning (SSL) provides an effective way to alleviate the label (i.e., rating) sparsity problem.
We propose a novel probabilistic chain graph model (CGM) to marry SSL with LFM.
arXiv Detail & Related papers (2020-03-05T06:34:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.