FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios
- URL: http://arxiv.org/abs/2507.14980v1
- Date: Sun, 20 Jul 2025 14:24:57 GMT
- Title: FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios
- Authors: Tianle Li, Yongzhi Huang, Linshan Jiang, Qipeng Xie, Chang Liu, Wenfeng Du, Lu Wang, Kaishun Wu
- Abstract summary: Federated Learning (FL) enables decentralized model training while preserving data privacy. Despite its benefits, FL faces challenges with non-identically distributed (non-IID) data. We propose FedWCM, a method that dynamically adjusts momentum using global and per-round data.
- Score: 14.18492489954482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) enables decentralized model training while preserving data privacy. Despite its benefits, FL faces challenges with non-identically distributed (non-IID) data, especially in long-tailed scenarios with imbalanced class samples. Momentum-based FL methods, often used to accelerate FL convergence, struggle under such distributions, yielding biased models and hindering convergence. To understand this challenge, we conduct extensive investigations into the phenomenon, accompanied by a layer-wise analysis of neural network behavior. Based on these insights, we propose FedWCM, a method that dynamically adjusts momentum using global and per-round data to correct the directional bias introduced by long-tailed distributions. Extensive experiments show that FedWCM resolves non-convergence issues and outperforms existing methods, enhancing FL's efficiency and effectiveness in handling client heterogeneity and data imbalance.
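For intuition, below is a minimal Python sketch of the FedAvgM-style server-side momentum aggregation that methods like FedWCM build on. The function names, the `beta_t` argument, and the imbalance-based schedule in the usage section are illustrative assumptions; FedWCM's actual adaptive rule for the momentum coefficient is specified in the paper.

```python
import numpy as np

def aggregate_with_momentum(global_w, client_deltas, client_sizes,
                            momentum, beta_t):
    """One server round of momentum-based FL (FedAvgM-style sketch).

    client_deltas: per-client updates (w_local - global_w), as arrays.
    beta_t: this round's momentum coefficient. FedWCM's contribution is
            to set it adaptively from global and per-round data; passing
            it as an argument here is only a placeholder for that rule.
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights = weights / weights.sum()            # FedAvg size weighting
    avg_delta = sum(w * d for w, d in zip(weights, client_deltas))

    # A stale momentum buffer keeps re-applying head-class gradient
    # directions under long-tailed data; damping beta_t per round is
    # one way to correct that directional bias.
    momentum = beta_t * momentum + avg_delta
    return global_w + momentum, momentum

# Toy usage with a hypothetical imbalance-aware schedule (an assumption,
# not the paper's rule): shrink beta_t when the round's sampled class
# distribution is far from uniform.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    global_w = np.zeros(4)
    momentum = np.zeros(4)
    for rnd in range(3):
        deltas = [rng.normal(size=4) * 0.1 for _ in range(5)]
        sizes = [100, 80, 60, 40, 20]
        class_hist = rng.dirichlet(np.ones(10))           # round's class mix
        imbalance = np.abs(class_hist - 0.1).sum() / 1.8  # in [0, 1]
        beta_t = 0.9 * (1.0 - imbalance)
        global_w, momentum = aggregate_with_momentum(
            global_w, deltas, sizes, momentum, beta_t)
```

The lever is the buffer update: with a fixed large beta, momentum keeps amplifying directions accumulated from head-class-dominated rounds, which is the bias the abstract describes; making beta_t data-dependent per round is what a method in this family adjusts.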
Related papers
- FlexFed: Mitigating Catastrophic Forgetting in Heterogeneous Federated Learning in Pervasive Computing Environments [4.358456799125694]
Pervasive computing environments (e.g., for Human Activity Recognition, HAR) are characterized by resource-constrained end devices, streaming sensor data and intermittent client participation. We propose FlexFed, a novel FL approach that prioritizes data retention for efficient memory use and dynamically adjusts offline training frequency. We also develop a realistic HAR-based evaluation framework that simulates streaming data, dynamic distributions, imbalances and varying availability.
arXiv Detail & Related papers (2025-05-19T14:23:37Z) - Distributionally Robust Federated Learning: An ADMM Algorithm [5.65425489838679]
Federated learning (FL) aims to train machine learning (ML) models collaboratively using decentralized data. Standard FL models often assume that all data come from the same unknown distribution. We propose a novel FL model, Distributionally Robust Federated Learning (DRFL), that applies distributionally robust optimization to overcome the challenges posed by data heterogeneity and distributional ambiguity.
arXiv Detail & Related papers (2025-03-24T08:35:38Z) - Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models. Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z) - Client Contribution Normalization for Enhanced Federated Learning [4.726250115737579]
Mobile devices, including smartphones and laptops, generate decentralized and heterogeneous data.
Federated Learning (FL) offers a promising alternative by enabling collaborative training of a global model across decentralized devices without data sharing.
This paper focuses on data-dependent heterogeneity in FL and proposes a novel approach leveraging mean latent representations extracted from locally trained models.
arXiv Detail & Related papers (2024-11-10T04:03:09Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on effectively utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - Balanced Multi-modal Federated Learning via Cross-Modal Infiltration [19.513099949266156]
Federated learning (FL) underpins advancements in privacy-preserving distributed computing.
We propose a novel Cross-Modal Infiltration Federated Learning (FedCMI) framework.
arXiv Detail & Related papers (2023-12-31T05:50:15Z) - Knowledge Rumination for Client Utility Evaluation in Heterogeneous Federated Learning [12.50871784200551]
Federated Learning (FL) allows several clients to cooperatively train machine learning models without disclosing the raw data. Non-IID data and stale models pose significant challenges to asynchronous FL (AFL), as they can diminish the practicality of the global model and even lead to training failures. We propose a novel AFL framework called Federated Historical Learning (FedHist), which effectively addresses the challenges posed by both non-IID data and gradient staleness.
arXiv Detail & Related papers (2023-12-16T11:40:49Z) - FedWon: Triumphing Multi-domain Federated Learning Without Normalization [50.49210227068574]
Federated learning (FL) enhances data privacy with collaborative in-situ training on decentralized clients.
However, FL encounters challenges due to non-independent and identically distributed (non-IID) data.
We propose a novel method called Federated learning Without normalizations (FedWon) to address the multi-domain problem in FL.
arXiv Detail & Related papers (2023-06-09T13:18:50Z) - Integrating Local Real Data with Global Gradient Prototypes for
Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively.
The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model.
In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z) - Depersonalized Federated Learning: Tackling Statistical Heterogeneity by
Alternating Stochastic Gradient Descent [6.394263208820851]
Federated learning (FL) enables devices to train a common machine learning (ML) model for intelligent inference without data sharing.
Raw data held by the various cooperating participants are typically non-identically distributed.
We propose a new depersonalized FL scheme that tackles this statistical heterogeneity via alternating stochastic gradient descent and significantly speeds up the training process.
arXiv Detail & Related papers (2022-10-07T10:30:39Z) - Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data [10.994226932599403]
We propose FedDif, a novel diffusion strategy for machine learning (ML) models that maximizes the performance of the global model under non-IID data.
We show that FedDif improves the top-1 test accuracy by up to 34.89% and reduces communication costs by between 14.6% and 63.49%.
arXiv Detail & Related papers (2022-07-15T14:28:41Z) - Fine-tuning Global Model via Data-Free Knowledge Distillation for
Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints.
We propose a data-free knowledge distillation method to fine-tune the global model on the server (FedFTG).
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.