PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
- URL: http://arxiv.org/abs/2502.15857v1
- Date: Fri, 21 Feb 2025 07:32:49 GMT
- Title: PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
- Authors: Tao Fan, Guoqiang Ma, Yuanfeng Song, Lixin Fan, Kai Chen, Qiang Yang
- Abstract summary: PPC-GPT is a privacy-preserving framework for compressing Large Language Models into task-specific Small Language Models. We show that PPC-GPT achieves competitive performance and prioritizes data privacy protection.
- Score: 26.127863923240408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compressing Large Language Models (LLMs) into task-specific Small Language Models (SLMs) encounters two significant challenges: safeguarding domain-specific knowledge privacy and managing limited resources. To tackle these challenges, we propose PPC-GPT, an innovative privacy-preserving federated framework specifically designed for compressing LLMs into task-specific SLMs via pruning and Chain-of-Thought (CoT) distillation. PPC-GPT works on a server-client federated architecture, where the client sends differentially private (DP) perturbed task-specific data to the server's LLM. The LLM then generates synthetic data along with corresponding rationales. This synthetic data is subsequently used for both LLM pruning and retraining. Additionally, we harness CoT knowledge distillation, leveraging the synthetic data to further improve the retraining of structurally pruned SLMs. Our experimental results demonstrate the effectiveness of PPC-GPT across various text generation tasks. By compressing LLMs into task-specific SLMs, PPC-GPT not only achieves competitive performance but also prioritizes data privacy protection.
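The abstract describes four moving parts: DP perturbation on the client, synthetic-data and rationale generation on the server, structural pruning, and CoT-style retraining. The sketch below lays out that order of operations in plain Python; every helper (dp_perturb, server_generate, prune_and_retrain) is a hypothetical stand-in rather than the authors' code, and the perturbation shown is a toy placeholder, not a calibrated DP mechanism.

```python
# Minimal sketch of the PPC-GPT client/server flow described in the abstract.
# All helpers are hypothetical stand-ins; the "DP" step is a toy placeholder,
# not a calibrated differentially private mechanism.
import random
from dataclasses import dataclass

@dataclass
class Sample:
    text: str
    label: str

def dp_perturb(samples, keep_prob=0.9):
    """Client side: perturb task data before it leaves the client
    (stand-in for the paper's DP perturbation)."""
    def flip(tok):
        return tok if random.random() < keep_prob else "<mask>"
    return [Sample(" ".join(flip(t) for t in s.text.split()), s.label) for s in samples]

def server_generate(perturbed, n_synthetic=4):
    """Server side: the LLM expands each perturbed example into synthetic
    inputs plus a chain-of-thought rationale (LLM call stubbed out)."""
    return [{"input": f"synthetic variant {i} of: {s.text}",
             "rationale": "step-by-step reasoning ...",
             "label": s.label}
            for s in perturbed for i in range(n_synthetic)]

def prune_and_retrain(synthetic):
    """Structurally prune the LLM into an SLM, then retrain it on the
    synthetic data with a CoT distillation objective (both stubbed)."""
    slm = {"params": "pruned subset of LLM weights"}
    # for ex in synthetic: loss = task_loss(ex) + lam * rationale_loss(ex)
    return slm

client_data = [Sample("the contract term is 24 months", "duration")]
slm = prune_and_retrain(server_generate(dp_perturb(client_data)))
```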
Related papers
- Federated In-Context LLM Agent Learning [3.4757641432843487]
Large Language Models (LLMs) have revolutionized intelligent services by enabling logical reasoning, tool use, and interaction with external systems as agents.
In this paper, we propose a novel privacy-preserving Federated In-context LLM Agent Learning (FICAL) algorithm.
The results show that FICAL has competitive performance compared to other SOTA baselines with a significant communication cost decrease of $\mathbf{3.33\times10^{5}}$ times.
arXiv Detail & Related papers (2024-12-11T03:00:24Z) - HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection [44.225151701532454]
In this paper, we introduce a new framework HARMONIC for tabular data generation and evaluation.
Our framework achieves performance equivalent to existing methods while providing better privacy, and our evaluation framework assesses both the effectiveness of the synthetic data and its privacy risks.
arXiv Detail & Related papers (2024-08-06T03:21:13Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
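The SELF-GUIDE entry describes a loop in which the student model synthesizes its own task-specific input-output pairs and is then finetuned on them. A minimal sketch of that loop follows; `student_generate` and `finetune` are hypothetical stand-ins for a real generation and training API, so only the control flow is illustrated.

```python
# Sketch of the SELF-GUIDE-style self-synthetic finetuning loop.
# `student_generate` and `finetune` are hypothetical stand-ins.
def student_generate(prompt: str) -> str:
    return "placeholder completion"        # stand-in for model.generate(...)

def finetune(model_id: str, pairs) -> None:
    print(f"finetuning {model_id} on {len(pairs)} self-generated pairs")

def self_guide(task_instruction: str, n_pairs: int = 32, model_id: str = "student-slm"):
    pairs = []
    for _ in range(n_pairs):
        # 1) the student model invents a new task input
        x = student_generate(f"{task_instruction}\nWrite one new input example:")
        # 2) the same model answers its own input
        y = student_generate(f"{task_instruction}\nInput: {x}\nOutput:")
        # 3) keep only pairs that pass a (trivial) quality filter
        if x.strip() and y.strip():
            pairs.append({"input": x, "output": y})
    # 4) finetune the student on its own filtered synthetic pairs
    finetune(model_id, pairs)

self_guide("Classify a movie review as positive or negative.")
```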
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - PDSS: A Privacy-Preserving Framework for Step-by-Step Distillation of Large Language Models [29.58928014528991]
PDSS works on a server-client architecture, wherein the client transmits prompts to the server's LLM for rationale generation.
The generated rationales are then decoded by the client and used to enrich the training of a task-specific small language model.
Experiments demonstrate the effectiveness of PDSS in various text generation tasks, enabling the training of task-specific SLMs with enhanced performance.
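The PDSS summary above amounts to a three-step exchange: the client encodes (perturbs) its prompt, the server's LLM returns a rationale, and the client decodes that rationale into training material for its SLM. The toy sketch below shows that round trip; the encode/decode helpers and the entity-masking scheme are assumptions for illustration, not the paper's mechanism.

```python
# Toy round trip for a PDSS-style exchange; all helpers are illustrative
# stand-ins, and the entity masking is an assumed perturbation scheme.
def encode_prompt(prompt: str) -> str:
    return prompt.replace("Acme Corp", "ENTITY_1")      # client-side perturbation

def server_rationale(encoded_prompt: str) -> str:
    return f"Step-by-step reasoning about: {encoded_prompt}"  # stand-in for the server LLM

def decode_rationale(rationale: str) -> str:
    return rationale.replace("ENTITY_1", "Acme Corp")   # restore local context on the client

prompt = "Summarize the dispute between Acme Corp and its supplier."
example = {
    "prompt": prompt,
    "rationale": decode_rationale(server_rationale(encode_prompt(prompt))),
}
slm_training_set = [example]  # rationale-enriched data for the task-specific SLM
print(slm_training_set)
```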
arXiv Detail & Related papers (2024-06-18T08:48:14Z) - Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
We propose an LLM-based Generative IoT (GIoT) system deployed in the local network setting in this study.
arXiv Detail & Related papers (2024-06-14T19:24:00Z) - Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data [53.70870879858533]
We introduce a Federated Domain-specific Knowledge Transfer (FDKT) framework.
It enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy.
The proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5% with a privacy budget of less than 10.
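The FDKT entry quantifies its privacy cost as a budget of less than 10. As a generic illustration of how such a budget can be spent (not FDKT's actual mechanism), the snippet below releases two client-side domain statistics under the standard epsilon-DP Laplace mechanism, splitting a total budget of 8 across them by basic composition; the statistics and their sensitivities are assumed for the example.

```python
# Generic epsilon-DP Laplace mechanism, used here only to illustrate spending
# a total privacy budget < 10 across two released statistics.
import math
import random

def laplace_mechanism(value: float, sensitivity: float, epsilon: float) -> float:
    """epsilon-DP Laplace mechanism: noise scale b = sensitivity / epsilon."""
    u = random.random() - 0.5
    b = sensitivity / epsilon
    return value - b * math.copysign(1.0, u) * math.log(max(1e-12, 1.0 - 2.0 * abs(u)))

epsilon_total = 8.0  # overall budget, kept below 10 as in the entry above
# Basic composition: each release consumes half of the total budget.
# Sensitivities below are assumed purely for illustration.
avg_doc_length = laplace_mechanism(412.0, sensitivity=1.0, epsilon=epsilon_total / 2)
positive_ratio = laplace_mechanism(0.63, sensitivity=0.01, epsilon=epsilon_total / 2)
print(avg_doc_length, positive_ratio)
```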
arXiv Detail & Related papers (2024-05-23T06:14:35Z) - PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion [96.47420221442397]
We construct adversarial user instructions by attacking user instructions at sentence, semantic, and multi-language levels.
We test 3 closed-source and 4 open-source LLMs using a benchmark that incorporates robustness settings.
We find that GPT-4 exhibits the highest performance and strong robustness in our benchmark.
arXiv Detail & Related papers (2024-03-06T15:33:32Z) - Differentially Private Knowledge Distillation via Synthetic Text Generation [5.201318326501886]
We propose DistilDP: a novel differentially private knowledge distillation algorithm.
DistilDP exploits synthetic data generated by a differentially private teacher LLM.
Our experimental results demonstrate that DistilDP can substantially improve the utility over existing baselines.
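The DistilDP entry combines DP-synthetic text with distillation from the teacher. The snippet below shows only the standard soft-label distillation arithmetic (temperature-softened softmax and KL divergence) that such a student update would apply to a synthetic token; it is textbook knowledge distillation, not the paper's implementation.

```python
# Standard knowledge-distillation arithmetic on one vocabulary position of one
# synthetic token; toy logits, not DistilDP's implementation.
import math

def softmax(logits, temperature=2.0):
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(kd_loss(teacher_logits=[2.0, 0.5, -1.0], student_logits=[1.0, 1.0, 0.0]))
```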
arXiv Detail & Related papers (2024-03-01T19:22:24Z) - PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models [6.622419351156256]
Pre-trained language models (PLMs) show impressive performance in various downstream NLP tasks.
Due to the substantial resources required, many PLM weights are confidential.
We introduce Plug-in External Memory Adaptation (PEMA), a Parameter-Efficient Fine-Tuning (PEFT) method, enabling fine-tuning without requiring access to all the weights.
arXiv Detail & Related papers (2023-11-14T23:20:51Z) - FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs and introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z) - Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models [11.845239346943067]
Parameter-efficient fine-tuning (PEFT) is a promising approach to efficiently specialize large language models (LLMs) to task-specific data. Our study highlights the potential for tuning larger LLMs and significant reductions in memory usage by combining PEFT with quantization.
arXiv Detail & Related papers (2023-08-21T04:31:06Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
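To make the "coupled structures" idea in the LLM-Pruner entry concrete, the toy snippet below scores groups (e.g. an attention head together with its paired projection columns) and removes the lowest-ranked groups as units; the scores and group names are invented for illustration, and this is not LLM-Pruner's code.

```python
# Toy illustration of structural pruning over coupled groups; importance
# scores and group names are invented, not LLM-Pruner's implementation.
def prune_groups(group_importance: dict, keep_ratio: float = 0.5):
    """Keep the top `keep_ratio` fraction of coupled structures."""
    ranked = sorted(group_importance, key=group_importance.get, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:n_keep], ranked[n_keep:]

# Hypothetical per-group importance, e.g. |weight * gradient| summed per head.
heads = {"layer0.head0": 0.91, "layer0.head1": 0.12,
         "layer1.head0": 0.47, "layer1.head1": 0.33}
kept, removed = prune_groups(heads)
print("kept:", kept, "removed:", removed)
```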