Related papers: FedTLU: Federated Learning with Targeted Layer Updates

FedTLU: Federated Learning with Targeted Layer Updates

URL: http://arxiv.org/abs/2412.17692v2
Date: Sun, 26 Jan 2025 05:21:54 GMT
Title: FedTLU: Federated Learning with Targeted Layer Updates
Authors: Jong-Ik Park, Carlee Joe-Wong,
Abstract summary: Federated learning (FL) addresses privacy concerns in training language models by enabling multiple clients to contribute to the training, without sending their data to others. Non-IID (identically and independently distributed) data across clients often limits FL's performance. This paper proposes a targeted layer update strategy for fine-tuning in FL.
Score: 12.800116749927266
License:
Abstract: Federated learning (FL) addresses privacy concerns in training language models by enabling multiple clients to contribute to the training, without sending their data to others. However, non-IID (identically and independently distributed) data across clients often limits FL's performance. This issue is especially challenging during model fine-tuning, as noise due to variations in clients' data distributions can harm model convergence near stationary points. This paper proposes a targeted layer update strategy for fine-tuning in FL. Instead of randomly updating layers of the language model, as often done in practice, we use a scoring mechanism to identify and update the most critical layers, avoiding excessively noisy or even poisoned updates by freezing the parameters in other layers. We show in extensive experiments that our method improves convergence and performance in non-IID settings, offering a more efficient approach to fine-tuning federated language models.

Related papers

FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data. Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena. We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
MultiConfederated Learning: Inclusive Non-IID Data handling with Decentralized Federated Learning [1.2726316791083532]
Federated Learning (FL) has emerged as a prominent privacy-preserving technique for enabling use cases like confidential clinical machine learning. FL operates by aggregating models trained by remote devices which owns the data. We propose MultiConfederated Learning: a decentralized FL framework which is designed to handle non-IID data.
arXiv Detail & Related papers (2024-04-20T16:38:26Z)
Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources. The lack of model privacy protection in FL becomes an unneglectable challenge. We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z)
Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection. We find that the difference in logits between the local and global models increases as the model is continuously updated. We propose a new algorithm, named FedCSD, a Class prototype Similarity Distillation in a federated framework to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy. We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage. Our experiments demonstrate that FedReg not only significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
Byzantine-robust Federated Learning through Spatial-temporal Analysis of Local Model Updates [6.758334200305236]
Federated Learning (FL) enables multiple distributed clients (e.g., mobile devices) to collaboratively train a centralized model while keeping the training data locally on the client. In this paper, we propose to mitigate these failures and attacks from a spatial-temporal perspective. Specifically, we use a clustering-based method to detect and exclude incorrect updates by leveraging their geometric properties in the parameter space.
arXiv Detail & Related papers (2021-07-03T18:48:11Z)
Analysis and Optimal Edge Assignment For Hierarchical Federated Learning on Non-IID Data [43.32085029569374]
Federated learning algorithms aim to leverage distributed and diverse data stored at users' devices to learn a global phenomena. In the cases where the participants' data are strongly skewed (i.e., non-IID), the local models can overfit local data, leading to low performing global model. We propose a hierarchical learning system that performs Federated Gradient Descent on the user-edge layer and Federated Averaging on the edge-cloud layer.
arXiv Detail & Related papers (2020-12-10T12:18:13Z)
Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models. We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local gradient descent (SGD) FL algorithm. We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
arXiv Detail & Related papers (2020-09-27T08:28:25Z)
WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices. We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.