Towards Federated Low-Rank Adaptation of Language Models with Rank Heterogeneity
- URL: http://arxiv.org/abs/2406.17477v3
- Date: Sat, 15 Feb 2025 08:24:09 GMT
- Title: Towards Federated Low-Rank Adaptation of Language Models with Rank Heterogeneity
- Authors: Yuji Byun, Jaeho Lee
- Abstract summary: We observe that heterogeneous ranks among clients lead to unstable performance.
Our analysis attributes this instability to the conventional zero-padding aggregation strategy.
We propose a replication-based padding strategy that better retains valuable information from clients with high-quality data.
- Score: 12.515874333424929
- Abstract: Low-rank adaptation (LoRA) offers an efficient alternative to full-weight adaptation in federated fine-tuning of language models, significantly reducing computational costs. By adjusting ranks for each client, federated LoRA enables flexible resource allocation. However, we observe that heterogeneous ranks among clients lead to unstable performance. Our analysis attributes this instability to the conventional zero-padding aggregation strategy, which dilutes information from high-rank clients during model aggregation. To address this issue, we propose a replication-based padding strategy that better retains valuable information from clients with high-quality data. Empirically, this approach accelerates convergence and enhances the global model's predictive performance.
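The dilution effect described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy client matrices, and the choice to fill a client's missing rows with the mean of the clients that do hold those rows are assumptions made for illustration, and only one LoRA factor (stacked by rank along the rows) is shown.

```python
import numpy as np

def zero_pad_aggregate(mats, r_max):
    """Pad each (r_i, d) matrix with zero rows up to r_max, then average."""
    padded = [
        np.vstack([m, np.zeros((r_max - m.shape[0], m.shape[1]))]) for m in mats
    ]
    return np.mean(padded, axis=0)

def replication_pad_aggregate(mats, r_max):
    """Assumed variant: fill each client's missing rows with the mean of
    those rows over the clients that actually hold them, then average."""
    d = mats[0].shape[1]
    total = np.zeros((r_max, d))
    count = np.zeros((r_max, 1))
    for m in mats:
        total[: m.shape[0]] += m
        count[: m.shape[0]] += 1
    donor_mean = total / np.maximum(count, 1)  # avoid division by zero
    padded = [np.vstack([m, donor_mean[m.shape[0]:]]) for m in mats]
    return np.mean(padded, axis=0)

# Toy setup (hypothetical): two rank-2 clients and one rank-4 client, d = 3.
clients = [np.ones((2, 3)), np.ones((2, 3)), 4.0 * np.ones((4, 3))]
zero = zero_pad_aggregate(clients, r_max=4)
rep = replication_pad_aggregate(clients, r_max=4)
```

In this toy setup, rows 2 and 3 exist only on the rank-4 client. Zero-padding shrinks them from 4 to 4/3 because the two padded clients contribute zeros to the average, whereas the replication variant keeps them at 4. Rows shared by all clients are identical under both schemes.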
Related papers
- Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private.
We propose Client-Centric Federated Adaptive Optimization, a class of novel federated optimization approaches.
(2025-01-17T04:00:50Z)
- Robust Federated Learning in the Face of Covariate Shift: A Magnitude Pruning with Hybrid Regularization Framework for Enhanced Model Aggregation [1.519321208145928]
Federated Learning (FL) offers a promising framework for individuals aiming to collaboratively develop a shared model.
Variations in data distribution among clients can profoundly affect FL methodologies, primarily due to instabilities in the aggregation process.
We propose a novel FL framework that combines individual parameter pruning and regularization techniques to make individual clients' models more robust for aggregation.
(2024-12-19T16:22:37Z)
- FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning [12.307490659840845]
Federated Learning (FL) combines locally optimized models from various clients into a unified global model.
FL encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model.
We introduce an innovative dual-strategy approach designed to effectively resolve these issues.
(2024-12-05T18:42:29Z)
- Federated LLMs Fine-tuned with Adaptive Importance-Aware LoRA [24.871424801066006]
Federated fine-tuning of Large Language Models (LLMs) enables task-specific adaptation across diverse datasets while preserving data privacy.
We propose a novel Heterogeneous Adaptive Federated Low-Rank Adaptation (LoRA) fine-tuned LLM framework (HAFL).
Our method converges quickly with low communication size, and avoids performance degradation when distributing models to clients.
(2024-11-10T19:59:54Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
(2024-04-29T05:55:23Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
(2023-08-11T09:58:47Z)
- Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data.
Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round.
Clients with smaller datasets enjoy larger performance gains.
(2023-05-26T19:25:49Z)
- Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape [59.841889495864386]
In federated learning (FL), a cluster of local clients operates under the coordination of a global server.
Clients are prone to overfitting to their own optima, which deviate sharply from the global objective.
FedSMOO adopts a dynamic regularizer to guide the local optima towards the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
(2023-05-19T10:47:44Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
(2023-05-01T20:04:46Z)
- Communication-Efficient Federated Learning with Accelerated Client Gradient [46.81082897703729]
Federated learning often suffers from slow and unstable convergence due to the heterogeneous characteristics of participating client datasets.
We propose a simple but effective federated learning framework, which improves the consistency across clients and facilitates the convergence of the server model.
We provide the theoretical convergence rate of our algorithm and demonstrate remarkable performance gains in terms of accuracy and communication efficiency.
(2022-01-10T05:31:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.