Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs
- URL: http://arxiv.org/abs/2506.16196v1
- Date: Thu, 19 Jun 2025 10:25:16 GMT
- Title: Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs
- Authors: Xun Wang, Jing Xu, Franziska Boenisch, Michael Backes, Christopher A. Choquette-Choo, Adam Dziedzic
- Abstract summary: POST (Privacy Of Soft prompt Transfer) is a framework that enables private tuning of soft prompts on a small model. Our experiments show that POST reduces computational costs, preserves privacy, and effectively transfers high-utility soft prompts.
- Score: 35.86692074743018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompting has become a dominant paradigm for adapting large language models (LLMs). While discrete (textual) prompts are widely used for their interpretability, soft (parameter) prompts have recently gained traction in APIs. This is because they can encode information from more training samples while minimizing the user's token usage, leaving more space in the context window for task-specific input. However, soft prompts are tightly coupled to the LLM they are tuned on, limiting their generalization to other LLMs. This constraint is particularly problematic for efficiency and privacy: (1) tuning prompts on each LLM incurs high computational costs, especially as LLMs continue to grow in size. Additionally, (2) when the LLM is hosted externally, soft prompt tuning often requires sharing private data with the LLM provider. For instance, this is the case with the NVIDIA NeMo API. To address these issues, we propose POST (Privacy Of Soft prompt Transfer), a framework that enables private tuning of soft prompts on a small model and subsequently transfers these prompts to a larger LLM. POST uses knowledge distillation to derive a small model directly from the large LLM to improve prompt transferability, tunes the soft prompt locally, optionally with differential privacy guarantees, and transfers it back to the larger LLM using a small public dataset. Our experiments show that POST reduces computational costs, preserves privacy, and effectively transfers high-utility soft prompts.
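The local tuning step described in the abstract (a soft prompt trained against a frozen model, optionally with differential privacy) can be illustrated with a minimal sketch. This is not the POST implementation: the frozen "model" is a hypothetical mean-pooling layer with a fixed linear head, the names `tune_soft_prompt`, `W`, `B`, and `DATA` are invented for illustration, and the private variant uses generic DP-SGD-style per-example clipping plus Gaussian noise on full batches without any privacy accounting. The distillation and transfer steps are not covered.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def frozen_logit(prompt, tokens, w, b):
    """Toy frozen model: mean-pool [soft prompt; input tokens], then a fixed linear head."""
    rows = prompt + tokens
    pooled = [sum(r[j] for r in rows) / len(rows) for j in range(len(w))]
    return sum(wj * pj for wj, pj in zip(w, pooled)) + b

def mean_loss(prompt, data, w, b):
    """Average binary cross-entropy of the frozen model with the given soft prompt."""
    total = 0.0
    for tokens, y in data:
        p = min(max(sigmoid(frozen_logit(prompt, tokens, w, b)), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def per_example_grad(prompt, tokens, y, w, b):
    """Gradient of one example's loss w.r.t. the flattened soft prompt; w and b stay frozen."""
    err = sigmoid(frozen_logit(prompt, tokens, w, b)) - y
    scale = err / (len(prompt) + len(tokens))
    return [scale * wj for _ in prompt for wj in w]

def clip(vec, max_norm):
    """Rescale vec so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in vec]

def tune_soft_prompt(data, w, b, prompt_len=2, steps=200, lr=0.5,
                     clip_norm=1.0, noise_multiplier=0.0, seed=0):
    """Full-batch gradient descent on the soft prompt only.

    noise_multiplier > 0 adds Gaussian noise to the sum of clipped
    per-example gradients (DP-SGD style); 0.0 gives plain tuning.
    """
    rng = random.Random(seed)
    dim = len(w)
    prompt = [[0.0] * dim for _ in range(prompt_len)]
    for _ in range(steps):
        summed = [0.0] * (prompt_len * dim)
        for tokens, y in data:
            g = clip(per_example_grad(prompt, tokens, y, w, b), clip_norm)
            summed = [s + gi for s, gi in zip(summed, g)]
        noisy = [s + rng.gauss(0.0, noise_multiplier * clip_norm) for s in summed]
        for i in range(prompt_len):
            for j in range(dim):
                prompt[i][j] -= lr * noisy[i * dim + j] / len(data)
    return prompt

# Hypothetical frozen weights and a tiny made-up dataset of (token embeddings, label).
W, B = [1.0, -1.0, 0.5], -2.0
DATA = [
    ([[1.0, 0.0, 0.0], [0.5, 0.0, 1.0]], 1),
    ([[0.0, 1.0, 0.0], [0.0, 0.5, 0.0]], 0),
    ([[1.0, 0.2, 0.3], [0.8, 0.0, 0.0]], 1),
    ([[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]], 0),
]
ZERO_PROMPT = [[0.0] * len(W) for _ in range(2)]
tuned = tune_soft_prompt(DATA, W, B)
noisy_tuned = tune_soft_prompt(DATA, W, B, noise_multiplier=1.0)
print("loss before:", mean_loss(ZERO_PROMPT, DATA, W, B))
print("loss after :", mean_loss(tuned, DATA, W, B))
```

Only the prompt vectors are updated; the model weights `W` and `B` never change, which is the property that makes soft prompts cheap to tune and also ties them to the model they were tuned on. In a real setting the noise scale would be chosen together with a privacy accountant to obtain a concrete privacy budget.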
Related papers
- Revisiting Prompt Engineering: A Comprehensive Evaluation for LLM-based Personalized Recommendation [2.3650193864974978]
Large language models (LLMs) can perform recommendation tasks by taking prompts written in natural language as input. This paper focuses on a single-user setting, where no information from other users is used.
arXiv Detail & Related papers (2025-07-17T20:26:00Z)
- Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows [1.6163129903911508]
The case for fine-tuning Small Language Models (SLMs) for real-world applications may no longer be clear. We compare fine-tuning an SLM against prompting LLMs on the task of generating low-code workflows. We observe that while a good prompt can yield reasonable results, fine-tuning improves quality by 10% on average.
arXiv Detail & Related papers (2025-05-30T03:59:35Z)
- G-Boost: Boosting Private SLMs with General LLMs [27.656951776655045]
Most Large Language Model (LLM) developers can only fine-tune Small Language Models (SLMs) on their own data. This paper proposes to ask general LLMs for help to boost the performance of private SLMs.
arXiv Detail & Related papers (2025-03-13T13:47:03Z)
- From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning [91.79567270986901]
Large Language Models (LLMs) tend to prioritize adherence to user prompts over providing veracious responses. Recent works propose to employ supervised fine-tuning (SFT) to mitigate the sycophancy issue. We propose a novel supervised pinpoint tuning (SPT), where the region-of-interest modules are tuned for a given objective.
arXiv Detail & Related papers (2024-09-03T07:01:37Z)
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [79.31084387589968]
Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks.
We propose LLM2LLM, a data augmentation strategy that uses a teacher LLM to enhance a small seed dataset.
We achieve improvements up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC and 39.8% on SST-2 over regular fine-tuning in the low-data regime.
arXiv Detail & Related papers (2024-03-22T08:57:07Z)
- Learning to Compress Prompt in Natural Language Formats [54.06967020905763]
Large language models (LLMs) are great at processing multiple natural language processing tasks.
LLMs are constrained by inferior performance with long context, slow inference speed, and the high cost of computing the results.
This work aims to compress lengthy prompts in the form of natural language with LLM transferability.
arXiv Detail & Related papers (2024-02-28T20:41:21Z)
- ConfusionPrompt: Practical Private Inference for Online Large Language Models [3.8134804426693094]
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers.
We introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by decomposing the original prompt into smaller sub-prompts.
We show that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques.
arXiv Detail & Related papers (2023-12-30T01:26:42Z)
- Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt [96.24800696597707]
We introduce a new perspective to optimize this trade-off by prompting compressed models.
We propose a soft prompt learning method where we expose the compressed model to the prompt learning process.
Our experimental analysis suggests our soft prompt strategy greatly improves the performance of the 8x compressed LLaMA-7B model.
arXiv Detail & Related papers (2023-05-17T20:45:13Z)
- Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
arXiv Detail & Related papers (2023-05-08T15:05:16Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.