Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data
- URL: http://arxiv.org/abs/2405.14212v1
- Date: Thu, 23 May 2024 06:14:35 GMT
- Title: Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data
- Authors: Haoran Li, Xinyuan Zhao, Dadi Guo, Hanlin Gu, Ziqian Zeng, Yuxing Han, Yangqiu Song, Lixin Fan, Qiang Yang,
- Abstract summary: We introduce a Federated Domain-specific Knowledge Transfer framework.
It enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy.
The proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5% with a privacy budget of less than 10.
- Score: 53.70870879858533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As large language models (LLMs) demonstrate unparalleled performance and generalization ability, LLMs are widely used and integrated into various applications. When it comes to sensitive domains, as commonly described in federated learning scenarios, directly using external LLMs on private data is strictly prohibited by stringent data security and privacy regulations. For local clients, the utilization of LLMs to improve the domain-specific small language models (SLMs), characterized by limited computational resources and domain-specific data, has attracted considerable research attention. By observing that LLMs can empower domain-specific SLMs, existing methods predominantly concentrate on leveraging the public data or LLMs to generate more data to transfer knowledge from LLMs to SLMs. However, due to the discrepancies between LLMs' generated data and clients' domain-specific data, these methods cannot yield substantial improvements in the domain-specific tasks. In this paper, we introduce a Federated Domain-specific Knowledge Transfer (FDKT) framework, which enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy. The core insight is to leverage LLMs to augment data based on domain-specific few-shot demonstrations, which are synthesized from private domain data using differential privacy. Such synthetic samples share similar data distribution with clients' private data and allow the server LLM to generate particular knowledge to improve clients' SLMs. The extensive experimental results demonstrate that the proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5\% with a privacy budget of less than 10, compared to local training on private data.
Related papers
- FedCoLLM: A Parameter-Efficient Federated Co-tuning Framework for Large and Small Language Models [24.579015114518157]
FedCoLLM is a novel framework designed for co-tuning Large Language Models (LLMs) and Small Language Models (SLMs)
FedCoLLM adaptively transfers server-side LLMs knowledge to clients' SLMs while simultaneously enriching the LLMs with domain insights from the clients.
Our evaluation of FedCoLLM, utilizing various public LLMs and SLMs across a range of NLP text generation tasks, reveals notable improvements with the assistance of the LLMs.
arXiv Detail & Related papers (2024-11-18T16:34:58Z) - Model-Based Differentially Private Knowledge Transfer for Large Language Models [34.949731264918846]
We propose textitLlamdex, a framework that integrates privacy-preserving, domain-specific models into large language models.
Our approach significantly enhances the accuracy of domain-specific tasks, achieving up to a 26% improvement compared to existing methods.
arXiv Detail & Related papers (2024-10-14T13:18:20Z) - The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective [53.48484062444108]
We find that the development of models and data is not two separate paths but rather interconnected.
On the one hand, vaster and higher-quality data contribute to better performance of MLLMs; on the other hand, MLLMs can facilitate the development of data.
To promote the data-model co-development for MLLM community, we systematically review existing works related to MLLMs from the data-model co-development perspective.
arXiv Detail & Related papers (2024-07-11T15:08:11Z) - FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models [28.284346666217207]
FedMKT is a parameter-efficient mutual knowledge transfer framework for large and small language models.
We show that FedMKT simultaneously boosts the performance of both LLMs and SLMs.
arXiv Detail & Related papers (2024-06-04T11:36:09Z) - BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z) - PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models.
One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets.
We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA)
Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z) - EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models
with Semi-structured Data [67.8302955948861]
Large Language Models (LLMs) pre-trained on massive corpora have exhibited remarkable performance on various NLP tasks.
Applying these models to specific domains still poses significant challenges, such as lack of domain knowledge.
We focus on domain-specific continual pre-training of LLMs using E-commerce domain as an exemplar.
arXiv Detail & Related papers (2023-12-25T11:31:47Z) - Mutual Enhancement of Large and Small Language Models with Cross-Silo
Knowledge Transfer [27.63746419563747]
Large language models (LLMs) are empowered with broad knowledge, but their task-specific performance is often suboptimal.
It necessitates fine-tuning LLMs with task-specific data, but such data may be inaccessible due to privacy concerns.
We propose a novel approach to enhance LLMs with smaller language models (SLMs) that are trained on clients using their private task-specific data.
arXiv Detail & Related papers (2023-12-10T09:52:32Z) - Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
arXiv Detail & Related papers (2023-05-08T15:05:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.