FedTLU: Federated Learning with Targeted Layer Updates
- URL: http://arxiv.org/abs/2412.17692v2
- Date: Sun, 26 Jan 2025 05:21:54 GMT
- Title: FedTLU: Federated Learning with Targeted Layer Updates
- Authors: Jong-Ik Park, Carlee Joe-Wong,
- Abstract summary: Federated learning (FL) addresses privacy concerns in training language models by enabling multiple clients to contribute to the training, without sending their data to others.<n>Non-IID (identically and independently distributed) data across clients often limits FL's performance.<n>This paper proposes a targeted layer update strategy for fine-tuning in FL.
- Score: 12.800116749927266
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Federated learning (FL) addresses privacy concerns in training language models by enabling multiple clients to contribute to the training, without sending their data to others. However, non-IID (identically and independently distributed) data across clients often limits FL's performance. This issue is especially challenging during model fine-tuning, as noise due to variations in clients' data distributions can harm model convergence near stationary points. This paper proposes a targeted layer update strategy for fine-tuning in FL. Instead of randomly updating layers of the language model, as often done in practice, we use a scoring mechanism to identify and update the most critical layers, avoiding excessively noisy or even poisoned updates by freezing the parameters in other layers. We show in extensive experiments that our method improves convergence and performance in non-IID settings, offering a more efficient approach to fine-tuning federated language models.
Related papers
- Achieving Byzantine-Resilient Federated Learning via Layer-Adaptive Sparsified Model Aggregation [7.200910949076064]
Federated Learning (FL) enables multiple clients to collaboratively train a model without sharing their local data.
Yet the FL system is vulnerable to well-designed Byzantine attacks, which aim to disrupt the model training process by uploading malicious model updates.
We propose the Layer-Adaptive Sparsified Model Aggregation (LASA) approach, which combines pre-aggregation sparsification with layer-wise adaptive aggregation to improve robustness.
arXiv Detail & Related papers (2024-09-02T19:28:35Z) - FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - MultiConfederated Learning: Inclusive Non-IID Data handling with Decentralized Federated Learning [1.2726316791083532]
Federated Learning (FL) has emerged as a prominent privacy-preserving technique for enabling use cases like confidential clinical machine learning.
FL operates by aggregating models trained by remote devices which owns the data.
We propose MultiConfederated Learning: a decentralized FL framework which is designed to handle non-IID data.
arXiv Detail & Related papers (2024-04-20T16:38:26Z) - Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
arXiv Detail & Related papers (2024-04-18T15:47:00Z) - Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees [18.24213566328972]
Decentralized decentralized learning (DFL) captures FL settings where both (i) model updates and (ii) model aggregations are carried out by the clients without a central server.
DSpodFL consistently achieves speeds compared with baselines under various system settings.
arXiv Detail & Related papers (2024-02-05T19:02:19Z) - FLea: Addressing Data Scarcity and Label Skew in Federated Learning via Privacy-preserving Feature Augmentation [15.298650496155508]
Federated Learning (FL) enables model development by leveraging data distributed across numerous edge devices without transferring local data to a central server.
Existing FL methods face challenges when dealing with scarce and label-skewed data across devices, resulting in local model overfitting and drift.
We propose a pioneering framework called textitFLea, incorporating the following key components.
arXiv Detail & Related papers (2023-12-04T20:24:09Z) - On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based
Multilingual Model [49.81429697921861]
We study the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models.
We show that prompt tuning is more effective in enhancing the performance of low-resource languages than fine-tuning.
arXiv Detail & Related papers (2023-11-14T00:43:33Z) - Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources.
The lack of model privacy protection in FL becomes an unneglectable challenge.
We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z) - Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, named FedCSD, a Class prototype Similarity Distillation in a federated framework to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Confidence-aware Personalized Federated Learning via Variational
Expectation Maximization [34.354154518009956]
We present a novel framework for personalized Federated Learning (PFL)
PFL is a distributed learning scheme to train a shared model across clients.
We present a novel framework for PFL based on hierarchical modeling and variational inference.
arXiv Detail & Related papers (2023-05-21T20:12:27Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg not only significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Byzantine-robust Federated Learning through Spatial-temporal Analysis of
Local Model Updates [6.758334200305236]
Federated Learning (FL) enables multiple distributed clients (e.g., mobile devices) to collaboratively train a centralized model while keeping the training data locally on the client.
In this paper, we propose to mitigate these failures and attacks from a spatial-temporal perspective.
Specifically, we use a clustering-based method to detect and exclude incorrect updates by leveraging their geometric properties in the parameter space.
arXiv Detail & Related papers (2021-07-03T18:48:11Z) - Analysis and Optimal Edge Assignment For Hierarchical Federated Learning
on Non-IID Data [43.32085029569374]
Federated learning algorithms aim to leverage distributed and diverse data stored at users' devices to learn a global phenomena.
In the cases where the participants' data are strongly skewed (i.e., non-IID), the local models can overfit local data, leading to low performing global model.
We propose a hierarchical learning system that performs Federated Gradient Descent on the user-edge layer and Federated Averaging on the edge-cloud layer.
arXiv Detail & Related papers (2020-12-10T12:18:13Z) - Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z) - WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices.
We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.