Personalized Collaborative Fine-Tuning for On-Device Large Language Models
- URL: http://arxiv.org/abs/2404.09753v2
- Date: Tue, 6 Aug 2024 21:54:20 GMT
- Title: Personalized Collaborative Fine-Tuning for On-Device Large Language Models
- Authors: Nicolas Wagner, Dongyang Fan, Martin Jaggi,
- Abstract summary: We explore on-device self-supervised collaborative fine-tuning of large language models with limited local data availability.
We introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based and validation performance-based.
Our protocols, driven by prediction and performance metrics, surpass both FedAvg and local fine-tuning methods.
- Score: 33.68104398807581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore on-device self-supervised collaborative fine-tuning of large language models with limited local data availability. Taking inspiration from the collaborative learning community, we introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based and validation performance-based. To minimize communication overhead, we integrate Low-Rank Adaptation (LoRA) and only exchange LoRA weight updates. Our protocols, driven by prediction and performance metrics, surpass both FedAvg and local fine-tuning methods, which is particularly evident in realistic scenarios with more diverse local data distributions. The results underscore the effectiveness of our approach in addressing heterogeneity and scarcity within local datasets.
Related papers
- Modality Alignment Meets Federated Broadcasting [9.752555511824593]
Federated learning (FL) has emerged as a powerful approach to safeguard data privacy by training models across distributed edge devices without centralizing local data.
This paper introduces a novel FL framework leveraging modality alignment, where a text encoder resides on the server, and image encoders operate on local devices.
arXiv Detail & Related papers (2024-11-24T13:30:03Z) - Personalized Federated Learning for Cross-view Geo-localization [49.40531019551957]
We propose a methodology combining Federated Learning (FL) with Cross-view Image Geo-localization (CVGL) techniques.
Our method implements a coarse-to-fine approach, where clients share only the coarse feature extractors while keeping fine-grained features specific to local environments.
Results demonstrate that our federated CVGL method achieves performance close to centralized training while maintaining data privacy.
arXiv Detail & Related papers (2024-11-07T13:25:52Z) - Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media.
Existing methods still rely on strong statistical correlations between samples and labels to address this issue.
We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z) - Dual-Personalizing Adapter for Federated Foundation Models [35.863585349109385]
We propose a new setting, termed test-time personalization, which concentrates on the targeted local task and extends to other tasks that exhibit test-time distribution shifts.
Specifically, a dual-personalizing adapter architecture (FedDPA) is proposed, comprising a global adapter and a local adapter for addressing test-time distribution shifts and personalization.
The effectiveness of the proposed method has been evaluated on benchmark datasets across different NLP tasks.
arXiv Detail & Related papers (2024-03-28T08:19:33Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - FedDisco: Federated Learning with Discrepancy-Aware Collaboration [41.828780724903744]
We propose a novel aggregation method, Federated Learning with Discrepancy-aware Collaboration (FedDisco)
Our FedDisco outperforms several state-of-the-art methods and can be easily incorporated with many existing methods to further enhance the performance.
arXiv Detail & Related papers (2023-05-30T17:20:51Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices)
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - Clustered Federated Learning via Generalized Total Variation
Minimization [83.26141667853057]
We study optimization methods to train local (or personalized) models for local datasets with a decentralized network structure.
Our main conceptual contribution is to formulate federated learning as total variation minimization (GTV)
Our main algorithmic contribution is a fully decentralized federated learning algorithm.
arXiv Detail & Related papers (2021-05-26T18:07:19Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object
Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.