Related papers: PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration

PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration

URL: http://arxiv.org/abs/2406.01394v1
Date: Mon, 3 Jun 2024 14:57:39 GMT
Title: PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration
Authors: Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Huiping Zhuang, Cen Chen,
Abstract summary: We propose PrivacyRestore to protect the privacy of user inputs during Large Language Models inference. PrivacyRestore directly removes privacy spans in user inputs and restores privacy information via activation steering during inference. Experiments show that PrivacyRestore can protect private information while maintaining acceptable levels of performance and inference efficiency.
Score: 18.67432819687349
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The widespread usage of online Large Language Models (LLMs) inference services has raised significant privacy concerns about the potential exposure of private information in user inputs to eavesdroppers or untrustworthy service providers. Existing privacy protection methods for LLMs suffer from insufficient privacy protection, performance degradation, or severe inference time overhead. In this paper, we propose PrivacyRestore to protect the privacy of user inputs during LLM inference. PrivacyRestore directly removes privacy spans in user inputs and restores privacy information via activation steering during inference. The privacy spans are encoded as restoration vectors. We propose Attention-aware Weighted Aggregation (AWA) which aggregates restoration vectors of all privacy spans in the input into a meta restoration vector. AWA not only ensures proper representation of all privacy spans but also prevents attackers from inferring the privacy spans from the meta restoration vector alone. This meta restoration vector, along with the query with privacy spans removed, is then sent to the server. The experimental results show that PrivacyRestore can protect private information while maintaining acceptable levels of performance and inference efficiency.

Related papers

Inference Privacy: Properties and Mechanisms [8.471466670802817]
Inference Privacy (IP) can allow a user to interact with a model while providing a rigorous privacy guarantee for the users' data at inference. We present two types of mechanisms for achieving IP: namely, input perturbations and output perturbations which are customizable by the users.
arXiv Detail & Related papers (2024-11-27T20:47:28Z)
Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence. We present anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context. We also proposeMaskDP to protect non-anonymized but privacy sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit. We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
Privacy-Preserving Language Model Inference with Instance Obfuscation [33.86459812694288]
Language Models as a Service (LM) offers convenient access for developers and researchers to perform inference using pre-trained language models. The input data and the inference results containing private information are exposed as plaintext during the service call, leading to privacy issues. We propose Instance-Obfuscated Inference (IOI) method, which focuses on addressing the decision privacy issue of natural language understanding tasks.
arXiv Detail & Related papers (2024-02-13T05:36:54Z)
Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes the conceptual model called PrivChatGPT, a privacy-generative model for LLMs. PrivChatGPT consists of two main components i.e., preserving user privacy during the data curation/pre-processing together with preserving private context and the private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z)
Can Language Models be Instructed to Protect Personal Information? [30.187731765653428]
We introduce PrivQA -- a benchmark to assess the privacy/utility trade-off when a model is instructed to protect specific categories of personal information in a simulated scenario. We find that adversaries can easily circumvent these protections with simple jailbreaking methods through textual and/or image inputs. We believe PrivQA has the potential to support the development of new models with improved privacy protections, as well as the adversarial robustness of these protections.
arXiv Detail & Related papers (2023-10-03T17:30:33Z)
How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss. We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
Privacy Explanations - A Means to End-User Trust [64.7066037969487]
We looked into how explainability might help to tackle this problem. We created privacy explanations that aim to help to clarify to end users why and for what purposes specific data is required. Our findings reveal that privacy explanations can be an important step towards increasing trust in software systems.
arXiv Detail & Related papers (2022-10-18T09:30:37Z)
Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP) We show that it is possible to achieve a privacy/utility trade-off between JDP and LDP by leveraging the shuffle model of privacy. Our result shows that it is possible to obtain a tradeoff between JDP and LDP by leveraging the shuffle model while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)
Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy preserving exploration policies for episodic reinforcement learning (RL) We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP) We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
Learning With Differential Privacy [3.618133010429131]
Differential privacy comes to the rescue with a proper promise of protection against leakage. It uses a randomized response technique at the time of collection of the data which promises strong privacy with better utility.
arXiv Detail & Related papers (2020-06-10T02:04:13Z)
PrivEdge: From Local to Distributed Private Training and Prediction [43.02041269239928]
PrivEdge is a technique for privacy-preserving Machine Learning (ML) PrivEdge safeguards the privacy of users who provide their data for training, as well as users who use the prediction service. We show that PrivEdge has high precision and recall in preserving privacy, as well as in distinguishing between private and non-private images.
arXiv Detail & Related papers (2020-04-12T09:26:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.