PFID: Privacy First Inference Delegation Framework for LLMs
- URL: http://arxiv.org/abs/2406.12238v1
- Date: Tue, 18 Jun 2024 03:27:09 GMT
- Title: PFID: Privacy First Inference Delegation Framework for LLMs
- Authors: Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao
- Abstract summary: This paper introduces a novel privacy-preservation framework named PFID for LLMs.
It addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition.
- Score: 34.59282305562392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel privacy-preservation framework named PFID for LLMs that addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition. When users interact with LLM systems, their prompts may be exposed to eavesdroppers within or outside the LLM service provider who are interested in collecting users' input. In this work, we propose a framework that camouflages user input to alleviate these privacy issues. Our framework places model shards on both the client and the public server, and sends compressed hidden states, rather than prompts, to and from the server. The client withholds information that can re-privatize the hidden states, so overall system performance remains comparable to traditional LLM services. The framework is designed to be communication-efficient; computation can be delegated to the local client to lighten the server's computation burden. We conduct extensive experiments on machine translation tasks to verify our framework's performance.
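A minimal sketch of the compression idea, using NumPy: the client factorizes a hidden-state matrix with a truncated SVD, transmits one low-rank factor, and withholds the other so the payload alone does not reconstruct the hidden states. The shapes, rank, and choice of which factor to keep are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch only: truncated-SVD compression of hidden states, with one factor
# withheld on the client for "re-privatization". Not PFID's actual protocol.
import numpy as np

def client_compress(hidden: np.ndarray, rank: int):
    """Truncated SVD: send (U_k * S_k), keep V_k^T private on the client."""
    U, S, Vt = np.linalg.svd(hidden, full_matrices=False)
    sent = U[:, :rank] * S[:rank]   # low-rank payload sent to the server
    kept = Vt[:rank, :]             # private factor retained locally
    return sent, kept

def client_reprivatize(server_out: np.ndarray, kept: np.ndarray) -> np.ndarray:
    """Recombine the server's processed payload with the private factor."""
    return server_out @ kept

# Toy example: a (seq_len x d_model) hidden-state matrix.
hidden = np.random.randn(16, 64)
sent, kept = client_compress(hidden, rank=8)
approx = client_reprivatize(sent, kept)  # identity "server" for illustration
print(np.linalg.norm(hidden - approx) / np.linalg.norm(hidden))  # relative error
```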
Related papers
- A Practical and Privacy-Preserving Framework for Real-World Large Language Model Services [8.309281698695381]
Large language models (LLMs) have demonstrated exceptional capabilities in text understanding and generation.
Individuals often rely on online AI as a Service (AIaaS) provided by LLM companies.
This business model poses significant privacy risks, as service providers may exploit users' trace patterns and behavioral data.
We propose a practical and privacy-preserving framework that ensures user anonymity by preventing service providers from linking requests to the individuals who submit them.
arXiv Detail & Related papers (2024-11-03T07:40:28Z)
- PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles [21.340456482528136]
We propose Privacy-Conscious Delegation, a novel task for chaining API-based and local models.
We utilize recent public collections of user-LLM interactions to construct a natural benchmark called PUPA.
Our best pipeline maintains high response quality for 85.5% of user queries while restricting privacy leakage to only 7.5%.
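A minimal sketch of such a chain; `local_rewrite` and `remote_complete` below are hypothetical stand-ins for a trusted local model and the remote API model, not PAPILLON's actual pipeline.

```python
# Sketch only: a trusted local model sanitizes the query before it is
# forwarded to a more capable remote API model.
from typing import Callable

def delegate(query: str,
             local_rewrite: Callable[[str], str],
             remote_complete: Callable[[str], str]) -> str:
    redacted = local_rewrite(query)    # local model removes private spans
    return remote_complete(redacted)   # remote API answers the sanitized query

# Toy stand-ins for demonstration.
answer = delegate(
    "Draft an email to Dr. Smith about my diagnosis.",
    local_rewrite=lambda q: q.replace("Dr. Smith", "[DOCTOR]")
                             .replace("my diagnosis", "[CONDITION]"),
    remote_complete=lambda q: f"(remote model response to: {q})",
)
print(answer)
```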
arXiv Detail & Related papers (2024-10-22T16:00:26Z)
- KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server [48.04903443425111]
The success of large language models (LLMs) has led many parties to fine-tune LLMs on their own private data.
Existing solutions, such as utilizing synthetic data for substitution, struggle to simultaneously improve performance and preserve privacy.
We propose KnowledgeSG, a novel client-server framework which enhances synthetic data quality and improves model performance while ensuring privacy.
arXiv Detail & Related papers (2024-10-08T06:42:28Z)
- Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data [53.70870879858533]
We introduce a Federated Domain-specific Knowledge Transfer framework.
It enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy.
The proposed FDKT framework consistently improves SLMs' task performance by around 5% with a privacy budget of less than 10.
arXiv Detail & Related papers (2024-05-23T06:14:35Z)
- ConfusionPrompt: Practical Private Inference for Online Large Language Models [3.8134804426693094]
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers.
We introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by decomposing the original prompt into smaller sub-prompts.
We show that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques.
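A minimal sketch of the decomposition idea: sub-prompts (optionally mixed with decoys) are sent to the server independently and the client keeps only the answers it needs. The splitting and recomposition rules here are illustrative assumptions, not the paper's algorithm.

```python
# Sketch only: query the server with shuffled real and decoy sub-prompts so
# it cannot easily reassemble the original intent.
import random
from typing import Callable, List

def confused_inference(sub_prompts: List[str],
                       decoys: List[str],
                       ask_server: Callable[[str], str]) -> List[str]:
    mixed = sub_prompts + decoys
    random.shuffle(mixed)                        # hide which queries matter
    answers = {p: ask_server(p) for p in mixed}  # server sees unlinked fragments
    return [answers[p] for p in sub_prompts]     # client keeps only real answers

replies = confused_inference(
    sub_prompts=["Summarize symptom X.", "List treatments for X."],
    decoys=["Summarize symptom Y."],
    ask_server=lambda p: f"(server answer to: {p})",
)
print(replies)
```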
arXiv Detail & Related papers (2023-12-30T01:26:42Z)
- Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection [6.201275002179716]
We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization.
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
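A minimal sketch of a hide-and-seek style pipeline: private entities are replaced by placeholders before the prompt leaves the device ("hide"), and the mapping is applied in reverse to the model's response ("seek"). Entity detection is a hard-coded list here for illustration; HaS trains small local models for both steps.

```python
# Sketch only: placeholder-based anonymization and de-anonymization.
from typing import Dict, Tuple

def hide(text: str, entities: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    mapping = {}
    for i, (entity, kind) in enumerate(entities.items()):
        placeholder = f"[{kind}_{i}]"
        text = text.replace(entity, placeholder)
        mapping[placeholder] = entity
    return text, mapping

def seek(response: str, mapping: Dict[str, str]) -> str:
    for placeholder, entity in mapping.items():
        response = response.replace(placeholder, entity)
    return response

anon, mapping = hide("Email Alice Chen at Acme Corp.",
                     {"Alice Chen": "PERSON", "Acme Corp": "ORG"})
print(anon)                        # "Email [PERSON_0] at [ORG_1]."
print(seek(f"Done: {anon}", mapping))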
arXiv Detail & Related papers (2023-09-06T14:54:11Z)
- Privacy-Preserving Prompt Tuning for Large Language Model Services [16.589104544849743]
We propose a framework that provides privacy guarantees for Large Language Models (LLMs) services.
RAPT adopts a local privacy setting, allowing users to privatize their data locally with local differential privacy.
Despite the simplicity of our framework, experiments show that RAPT achieves competitive performance across tasks while providing privacy guarantees against adversaries.
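A minimal sketch of the "privatize locally, then send" pattern: each token is kept or replaced at random before the text leaves the client, a simplified token-level randomized-response mechanism. RAPT's actual mechanism and its epsilon calibration differ; this only illustrates the idea.

```python
# Sketch only: token-level randomized response. In a real mechanism the keep
# probability would be calibrated from a privacy parameter epsilon.
import random
from typing import List

def privatize(tokens: List[str], vocab: List[str], keep_prob: float) -> List[str]:
    """With probability keep_prob keep a token, else sample a random one."""
    return [t if random.random() < keep_prob else random.choice(vocab)
            for t in tokens]

vocab = ["the", "cat", "dog", "sat", "ran", "home"]
print(privatize("the cat sat".split(), vocab, keep_prob=0.7))
```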
arXiv Detail & Related papers (2023-05-10T14:41:51Z)
- Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
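A minimal sketch of the knowledge-guiding idea: a background module supplies domain knowledge for the input, which is prepended as context for the main LLM. `guide` and `llm` are hypothetical stand-ins for PKG's trained guiding module and the frozen LLM.

```python
# Sketch only: prepend module-generated knowledge to the prompt.
from typing import Callable

def pkg_answer(question: str,
               guide: Callable[[str], str],
               llm: Callable[[str], str]) -> str:
    knowledge = guide(question)  # background knowledge generated first
    prompt = f"Knowledge: {knowledge}\nQuestion: {question}\nAnswer:"
    return llm(prompt)

print(pkg_answer("What drug interacts with warfarin?",
                 guide=lambda q: "(domain facts from the guiding module)",
                 llm=lambda p: f"(LLM answer given: {p!r})"))
```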
arXiv Detail & Related papers (2023-05-08T15:05:16Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
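A minimal sketch of a generate-check-revise loop in this spirit: the black-box LLM drafts a response, a checker grounds it against external knowledge, and any critique is fed back until the draft passes or retries run out. `llm` and `check` are hypothetical stand-ins for the paper's plug-and-play modules.

```python
# Sketch only: feedback-driven revision around a black-box LLM.
from typing import Callable, Optional

def augmented_answer(query: str,
                     llm: Callable[[str], str],
                     check: Callable[[str], Optional[str]],
                     max_tries: int = 3) -> str:
    draft = llm(query)
    for _ in range(max_tries):
        feedback = check(draft)  # None means the draft is grounded
        if feedback is None:
            break
        draft = llm(f"{query}\nFeedback: {feedback}\nPlease revise.")
    return draft

# Toy run: the checker rejects the first draft once.
state = {"seen": False}
def toy_check(draft: str) -> Optional[str]:
    if not state["seen"]:
        state["seen"] = True
        return "cite a source"
    return None

print(augmented_answer("Who wrote Dune?",
                       llm=lambda p: f"(draft for: {p})",
                       check=toy_check))
```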
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
- Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM [62.62684911017472]
Federated learning (FL) enables devices to jointly train shared models while keeping the training data local for privacy purposes.
We introduce a VFL framework with multiple heads (VIM), which takes the separate contribution of each client into account.
VIM achieves significantly higher performance and faster convergence compared with the state-of-the-art.
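A minimal sketch of vertical FL with per-client heads: each client holds a disjoint feature block and a local encoder, while the server keeps one head per client and combines their contributions. The dimensions and the plain-sum combination rule are illustrative assumptions, not VIM's training procedure.

```python
# Sketch only: clients send embeddings, never raw features; the server
# applies a separate head per client and sums the logits.
import numpy as np

rng = np.random.default_rng(0)
x_parts = [rng.normal(size=(4, 3)), rng.normal(size=(4, 5))]   # two clients' features
encoders = [rng.normal(size=(3, 8)), rng.normal(size=(5, 8))]  # local encoders
heads = [rng.normal(size=(8, 2)), rng.normal(size=(8, 2))]     # one server head per client

embeddings = [x @ w for x, w in zip(x_parts, encoders)]
logits = sum(e @ h for e, h in zip(embeddings, heads))          # (4, 2) combined logits
print(logits.shape)
```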
arXiv Detail & Related papers (2022-07-20T23:14:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.