Towards Layer-Wise Personalized Federated Learning: Adaptive Layer Disentanglement via Conflicting Gradients
- URL: http://arxiv.org/abs/2410.02845v1
- Date: Thu, 3 Oct 2024 14:46:19 GMT
- Title: Towards Layer-Wise Personalized Federated Learning: Adaptive Layer Disentanglement via Conflicting Gradients
- Authors: Minh Duong Nguyen, Khanh Le, Khoi Do, Nguyen H. Tran, Duc Nguyen, Chien Trinh, Zhaohui Yang
- Abstract summary: In personalized Federated Learning (pFL), high data heterogeneity can cause significant gradient divergence across devices.
We introduce a new approach to pFL design, namely Federated Learning with Layer-wise Aggregation via Gradient Analysis (FedLAG).
FedLAG assigns layers for personalization based on the extent of layer-wise gradient conflicts.
- Score: 11.269920973751244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In personalized Federated Learning (pFL), high data heterogeneity can cause significant gradient divergence across devices, adversely affecting the learning process. This divergence, especially when gradients from different users form an obtuse angle during aggregation, can negate progress, leading to severe degradation of weight and gradient updates. To address this issue, we introduce a new approach to pFL design, namely Federated Learning with Layer-wise Aggregation via Gradient Analysis (FedLAG), which utilizes the concept of gradient conflict at the layer level. Specifically, when the layer-wise gradients of different clients form acute angles, those gradients align in the same direction, enabling updates across clients toward identifying client-invariant features. Conversely, when layer-wise gradient pairs form obtuse angles, the layers tend to focus on client-specific tasks. Accordingly, FedLAG assigns layers for personalization based on the extent of layer-wise gradient conflicts: layers with gradient conflicts are excluded from the global aggregation process. The theoretical evaluation demonstrates that, when integrated into other pFL baselines, FedLAG enhances their performance by a certain margin, and the proposed method achieves superior convergence behavior compared with other baselines. Extensive experiments show that FedLAG outperforms several state-of-the-art methods and can be easily incorporated with many existing methods to further enhance performance.
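Since the abstract describes the layer-wise conflict test only at a high level, the following is a minimal sketch of the idea, assuming PyTorch and FedAvg-style aggregation: pairwise cosine similarity between clients' per-layer gradients flags conflicting layers, which are then left out of the global average. All names (`layerwise_conflict_mask`, `aggregate_non_conflicting`, the zero-cosine threshold) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F


def layerwise_conflict_mask(client_grads, threshold=0.0):
    """Flag, for each layer, whether any pair of client gradients forms an
    obtuse angle (cosine similarity below `threshold`), i.e. a gradient conflict.

    client_grads: list of dicts mapping layer name -> gradient tensor (one dict per client).
    Returns a dict mapping layer name -> True if the layer is conflict-free
    (client-invariant) and should stay in the global aggregation.
    """
    keep_global = {}
    for name in client_grads[0]:
        grads = [g[name].flatten() for g in client_grads]
        conflict = any(
            F.cosine_similarity(grads[i], grads[j], dim=0) < threshold
            for i in range(len(grads))
            for j in range(i + 1, len(grads))
        )
        keep_global[name] = not conflict
    return keep_global


def aggregate_non_conflicting(global_model, client_models, keep_global):
    """Average only the conflict-free layers across clients; conflicting layers
    are excluded from global aggregation and left for per-client personalization."""
    new_state = {}
    for name, param in global_model.state_dict().items():
        if keep_global.get(name, True):
            stacked = torch.stack([m.state_dict()[name].float() for m in client_models])
            new_state[name] = stacked.mean(dim=0).to(param.dtype)
        else:
            new_state[name] = param  # personalized layer: keep the current global copy untouched
    global_model.load_state_dict(new_state)
```

In an actual pFL round, each client would retain its own copy of the conflicting (personalized) layers and download only the averaged, conflict-free layers from the server.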
Related papers
- Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture.
Non-smooth regularization is often incorporated into machine learning tasks.
We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Fast and Slow Gradient Approximation for Binary Neural Network Optimization [11.064044986709733]
Hypernetwork-based methods utilize neural networks to learn the gradients of non-differentiable quantization functions.
We propose a Historical Gradient Storage (HGS) module, which models the historical gradient sequence to generate the first-order momentum required for optimization.
We also introduce Layer Recognition Embeddings (LRE) into the hypernetwork, facilitating the generation of layer-specific fine gradients.
arXiv Detail & Related papers (2024-12-16T13:48:40Z) - Hierarchical Federated Learning with Multi-Timescale Gradient Correction [24.713834338757195]
In this paper, we propose a multi-timescale gradient correction (MTGC) methodology to resolve this issue.
Our key idea is to introduce distinct control variables to (i) correct the client gradient toward the group gradient, i.e., to reduce client model drift caused by local updates based on individual datasets (a simple illustrative sketch of this kind of client-toward-group correction appears after this list).
arXiv Detail & Related papers (2024-09-27T05:10:05Z) - CG-FedLLM: How to Compress Gradients in Federated Fune-tuning for Large Language Models [21.919883617413358]
This study introduces an innovative approach that compresses gradients to improve communication efficiency during federated fine-tuning of Large Language Models (LLMs).
We also present a series of experimental analyses focusing on the signal-to-noise ratio, compression rate, and robustness within this privacy-centric framework.
arXiv Detail & Related papers (2024-05-22T15:32:38Z) - Class Gradient Projection For Continual Learning [99.105266615448]
Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks.
arXiv Detail & Related papers (2023-11-25T02:45:56Z) - Internal Cross-layer Gradients for Extending Homogeneity to
Heterogeneity in Federated Learning [11.29694276480432]
Federated learning (FL) inevitably confronts the challenge of system heterogeneity.
We propose a training scheme that can extend the capabilities of most model-homogeneous FL methods.
arXiv Detail & Related papers (2023-08-22T14:23:21Z) - GIFD: A Generative Gradient Inversion Method with Feature Domain
Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z) - Towards Better Gradient Consistency for Neural Signed Distance Functions
via Level Set Alignment [50.892158511845466]
We show that gradient consistency in the field, indicated by the parallelism of level sets, is the key factor affecting the inference accuracy.
We propose a level set alignment loss to evaluate the parallelism of level sets, which can be minimized to achieve better gradient consistency.
arXiv Detail & Related papers (2023-05-19T11:28:05Z) - Layerwise Optimization by Gradient Decomposition for Continual Learning [78.58714373218118]
Deep neural networks achieve state-of-the-art and sometimes super-human performance across various domains.
When learning tasks sequentially, the networks easily forget the knowledge of previous tasks, known as "catastrophic forgetting".
arXiv Detail & Related papers (2021-05-17T01:15:57Z) - Boosting Gradient for White-Box Adversarial Attacks [60.422511092730026]
We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients.
arXiv Detail & Related papers (2020-10-21T02:13:26Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)