PFID: Privacy First Inference Delegation Framework for LLMs
- URL: http://arxiv.org/abs/2406.12238v1
- Date: Tue, 18 Jun 2024 03:27:09 GMT
- Title: PFID: Privacy First Inference Delegation Framework for LLMs
- Authors: Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao,
- Abstract summary: This paper introduces a novel privacy-preservation framework named PFID for LLMs.
It addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition.
- Score: 34.59282305562392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel privacy-preservation framework named PFID for LLMs that addresses critical privacy concerns by localizing user data through model sharding and singular value decomposition. When users are interacting with LLM systems, their prompts could be subject to being exposed to eavesdroppers within or outside LLM system providers who are interested in collecting users' input. In this work, we proposed a framework to camouflage user input, so as to alleviate privacy issues. Our framework proposes to place model shards on the client and the public server, we sent compressed hidden states instead of prompts to and from servers. Clients have held back information that can re-privatized the hidden states so that overall system performance is comparable to traditional LLMs services. Our framework was designed to be communication efficient, computation can be delegated to the local client so that the server's computation burden can be lightened. We conduct extensive experiments on machine translation tasks to verify our framework's performance.
Related papers
- Safely Learning with Private Data: A Federated Learning Framework for Large Language Model [3.1077263218029105]
Federated learning (FL) is an ideal solution for training models with distributed private data.
Traditional frameworks like FedAvg are unsuitable for large language models (LLM)
We propose FL-GLM, which prevents data leakage caused by both server-side and peer-client attacks.
arXiv Detail & Related papers (2024-06-21T06:43:15Z) - Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data [53.70870879858533]
We introduce a Federated Domain-specific Knowledge Transfer framework.
It enables domain-specific knowledge transfer from LLMs to SLMs while preserving clients' data privacy.
The proposed FDKT framework consistently and greatly improves SLMs' task performance by around 5% with a privacy budget of less than 10.
arXiv Detail & Related papers (2024-05-23T06:14:35Z) - Can LLMs get help from other LLMs without revealing private information? [13.687385548330717]
Cascades are a common type of machine learning system in which a large, remote model can be queried if a local model is not able to label a user's data by itself.
We show the feasibility of applying cascade systems in such setups by equipping the local model with privacy-preserving techniques.
arXiv Detail & Related papers (2024-04-01T10:54:49Z) - ConfusionPrompt: Practical Private Inference for Online Large Language Models [11.26620418652188]
Large language models (LLMs) are commonly deployed as online services, necessitating users to transmit informative prompts to cloud servers.
We present ConfusionPrompt, a novel private LLM inference framework designed to obfuscate the server by decomposing the prompt into sub-prompts.
We develop a $(lambda, mu, rho)$-privacy model to formulate the requirement for a privacy-preserving group of prompts.
arXiv Detail & Related papers (2023-12-30T01:26:42Z) - Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy
Protection [6.201275002179716]
We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization.
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
arXiv Detail & Related papers (2023-09-06T14:54:11Z) - Privacy-Preserving Prompt Tuning for Large Language Model Services [16.589104544849743]
We propose a framework that provides privacy guarantees for Large Language Models (LLMs) services.
textscrapt adopts a local privacy setting, allowing users to privatize their data locally with local differential privacy.
Despite the simplicity of our framework, experiments show that RAPT achieves competitive performance across tasks while providing privacy guarantees against adversaries.
arXiv Detail & Related papers (2023-05-10T14:41:51Z) - Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
arXiv Detail & Related papers (2023-05-08T15:05:16Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z) - Federated Nearest Neighbor Machine Translation [66.8765098651988]
In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework.
FedNN leverages one-round memorization-based interaction to share knowledge across different clients.
Experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg.
arXiv Detail & Related papers (2023-02-23T18:04:07Z) - Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM [62.62684911017472]
Federated learning (FL) enables devices to jointly train shared models while keeping the training data local for privacy purposes.
We introduce a VFL framework with multiple heads (VIM), which takes the separate contribution of each client into account.
VIM achieves significantly higher performance and faster convergence compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-20T23:14:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.