OpenFedLLM: Training Large Language Models on Decentralized Private Data
via Federated Learning
- URL: http://arxiv.org/abs/2402.06954v1
- Date: Sat, 10 Feb 2024 13:50:11 GMT
- Title: OpenFedLLM: Training Large Language Models on Decentralized Private Data
via Federated Learning
- Authors: Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin
Du, Yanfeng Wang, Siheng Chen
- Abstract summary: Large language models (LLMs) have demonstrated tremendous success across various fields.
In this paper, we offer a potential next step for contemporary LLM training on underutilized, distributed private data via federated learning (FL).
We build a concise, integrated, and research-friendly framework/codebase, named OpenFedLLM.
It covers federated instruction tuning for enhancing instruction-following capability, federated value alignment for aligning with human values, and 7 representative FL algorithms.
- Score: 44.200613313936024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trained on massive publicly available data, large language models (LLMs) have
demonstrated tremendous success across various fields. While more data
contributes to better performance, a disconcerting reality is that high-quality
public data will be exhausted in a few years. In this paper, we offer a
potential next step for contemporary LLMs: collaborative and privacy-preserving
LLM training on the underutilized distributed private data via federated
learning (FL), where multiple data owners collaboratively train a shared model
without transmitting raw data. To achieve this, we build a concise, integrated,
and research-friendly framework/codebase, named OpenFedLLM. It covers federated
instruction tuning for enhancing instruction-following capability, federated
value alignment for aligning with human values, and 7 representative FL
algorithms. In addition, OpenFedLLM supports training on diverse domains,
covering 8 training datasets, and provides comprehensive evaluations covering
30+ evaluation metrics. Through extensive experiments, we observe that all FL
algorithms outperform local training when training LLMs, demonstrating a clear
performance improvement across a variety of settings. Notably, in a
financial benchmark, Llama2-7B fine-tuned by applying any FL algorithm can
outperform GPT-4 by a significant margin while the model obtained through
individual training cannot, demonstrating strong motivation for clients to
participate in FL. The code is available at
https://github.com/rui-ye/OpenFedLLM.
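As a concrete illustration of the workflow the abstract describes, the sketch below shows a single FedAvg-style communication round: each client fine-tunes locally on its private instruction data and only parameter updates (e.g., LoRA adapter weights) are sent to the server for aggregation, never the raw data. This is a minimal sketch, not OpenFedLLM's actual API; the helper names (local_finetune, fedavg), the plain-dict stand-in for model weights, and the fake local "gradient" are assumptions made for brevity, and FedAvg is used here only as the canonical example of an FL aggregation rule.

```python
# Minimal sketch of one FedAvg-style round of federated instruction tuning.
# Illustrative only: plain dicts stand in for real model/LoRA weights, and the
# local update is faked so the example runs end to end without an actual LLM.

from copy import deepcopy

def local_finetune(global_params, local_dataset, lr=0.01):
    """Stand-in for local instruction tuning on a client's private data.

    A real client would run SFT (e.g., LoRA fine-tuning) on its own
    instruction-response pairs and return the updated adapter weights;
    the raw data never leaves the client.
    """
    params = deepcopy(global_params)
    for name in params:
        fake_grad = sum(len(x) for x in local_dataset) * 1e-4  # placeholder signal
        params[name] -= lr * fake_grad
    return params

def fedavg(client_params, client_sizes):
    """Aggregate client updates weighted by local dataset size (FedAvg)."""
    total = sum(client_sizes)
    agg = {name: 0.0 for name in client_params[0]}
    for params, n in zip(client_params, client_sizes):
        for name, value in params.items():
            agg[name] += (n / total) * value
    return agg

# One communication round over three hypothetical clients.
global_params = {"lora_A": 0.5, "lora_B": -0.2}
clients = [["instr-response pair"] * n for n in (120, 80, 200)]

updates = [local_finetune(global_params, data) for data in clients]
global_params = fedavg(updates, [len(data) for data in clients])
print(global_params)  # server-side model after aggregation
```

In practice the same round structure is reused for federated value alignment, with the local SFT step replaced by a preference-optimization step; only the aggregation rule changes across the different FL algorithms.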
Related papers
- Data Quality Control in Federated Instruction-tuning of Large Language Models [43.29678396558287]
We propose a new framework for federated instruction tuning of large language models (LLMs) with data quality control (FedDQC).
Our approach introduces an efficient metric to assess each client's instruction-response alignment (IRA), identifying potentially noisy data through single-shot inference.
We conduct extensive experiments on 4 synthetic datasets and a real-world dataset, and compare our method with baselines adapted from the centralized setting.
arXiv Detail & Related papers (2024-10-15T12:14:57Z)
- Embracing Federated Learning: Enabling Weak Client Participation via Partial Model Training [21.89214794178211]
In Federated Learning (FL), clients may have weak devices that cannot train the full model or even hold it in their memory space.
We propose EmbracingFL, a general FL framework that allows all available clients to join the distributed training.
Our empirical study shows that EmbracingFL consistently achieves high accuracy, as if all clients were strong, outperforming the state-of-the-art width reduction methods.
arXiv Detail & Related papers (2024-06-21T13:19:29Z)
- FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models [48.484485609995986]
Federated learning has enabled multiple parties to collaboratively train large language models without directly sharing their data (FedLLM).
However, there are currently no realistic datasets or benchmarks for FedLLM.
We propose FedLLM-Bench, which involves 8 training methods, 4 training datasets, and 6 evaluation metrics.
arXiv Detail & Related papers (2024-06-07T11:19:30Z)
- Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data [10.64629029156029]
We introduce an innovative personalized Federated Learning framework, Multi-level Personalized Federated Learning (MuPFL).
MuPFL integrates three pivotal modules: Biased Activation Value Dropout (BAVD), Adaptive Cluster-based Model Update (ACMU), and Prior Knowledge-assisted Fine-tuning (PKCF).
Experiments on diverse real-world datasets show that MuPFL consistently outperforms state-of-the-art baselines, even under extreme non-i.i.d. and long-tail conditions.
arXiv Detail & Related papers (2024-05-10T11:52:53Z)
- A Survey on Efficient Federated Learning Methods for Foundation Model Training [62.473245910234304]
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
In the wake of Foundation Models (FMs), however, the reality is different for many deep learning applications.
We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications.
arXiv Detail & Related papers (2024-01-09T10:22:23Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
- Federated Multilingual Models for Medical Transcript Analysis [11.877236847857336]
We present a federated learning system for training a large-scale multi-lingual model.
None of the training data is ever transmitted to any central location.
We show that the global model performance can be further improved by a training step performed locally.
arXiv Detail & Related papers (2022-11-04T01:07:54Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
Specifically, we construct synthetic sets of data on each client to locally match the loss landscape of the original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
- Multi-Center Federated Learning [62.32725938999433]
Federated learning (FL) can protect data privacy in distributed learning.
It merely collects local gradients from users without access to their data.
We propose a novel multi-center aggregation mechanism.
arXiv Detail & Related papers (2021-08-19T12:20:31Z)