EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- URL: http://arxiv.org/abs/2402.05868v2
- Date: Mon, 12 Feb 2024 16:26:14 GMT
- Title: EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- Authors: Guo Lin, Wenyue Hua, Yongfeng Zhang
- Abstract summary: Transmitting and storing user data in cloud-based large language model (LLM) services poses substantial risks of data breaches and unauthorized access to sensitive information.
This paper proposes EmojiCrypt, a simple yet effective mechanism to protect user privacy.
- Score: 41.090214475309516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud-based large language models (LLMs) such as ChatGPT have increasingly
become integral to daily operations, serving as vital tools across various
applications. While these models offer substantial benefits in terms of
accessibility and functionality, they also introduce significant privacy
concerns: the transmission and storage of user data in cloud infrastructures
pose substantial risks of data breaches and unauthorized access to sensitive
information; even if the transmission and storage of data are encrypted, the LLM
service provider itself still knows the real contents of the data, preventing
individuals or entities from confidently using such LLM services. To address
these concerns, this paper proposes a simple yet effective mechanism, EmojiCrypt, to protect user privacy. It uses emojis to encrypt user inputs before sending them to the LLM, rendering them indecipherable to human or LLM examination while retaining the original intent of the prompt, thus ensuring the model's performance remains unaffected. We conduct experiments on three tasks: personalized recommendation, sentiment analysis, and tabular data analysis. Experimental results reveal that EmojiCrypt can encrypt personal
information within prompts in a manner that not only prevents the discernment of sensitive data by humans or the LLM itself, but also maintains or even improves precision without further tuning, achieving comparable or
even better task accuracy than directly prompting the LLM without prompt
encryption. These results highlight the practicality of adopting encryption
measures that safeguard user privacy without compromising the functional
integrity and performance of LLMs. Code and dataset are available at
https://github.com/agiresearch/EmojiCrypt.
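The sketch below illustrates the basic idea: an LLM is asked to re-express a sensitive prompt fragment as emojis and abstract symbols, and only that encoded fragment is embedded in the task prompt sent to the cloud model. This is a minimal illustration, not the authors' released implementation; the OpenAI chat-completions client, the model name, and the instruction wording are assumptions for demonstration only.
```python
# Illustrative sketch of EmojiCrypt-style prompt encryption (not the authors' released code).
# Assumes the openai Python SDK (>= 1.0) and an OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def emoji_encrypt(sensitive_text: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask an LLM to re-express sensitive text as a compact sequence of emojis and
    abstract symbols that preserves the original intent but is not readable as
    natural language."""
    instruction = (
        "Re-express the following text purely as a compact sequence of emojis and "
        "abstract symbols, without using natural-language words, while preserving "
        "its meaning:\n\n" + sensitive_text
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": instruction}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Usage: encode the sensitive portion once, then reuse it inside task prompts so the
# downstream prompt never exposes the plaintext user history.
encrypted_profile = emoji_encrypt(
    "The user recently bought running shoes, a yoga mat, and protein powder."
)
task_prompt = (
    f"User profile: {encrypted_profile}\n"
    "Recommend three products this user is likely to purchase next."
)
print(task_prompt)
```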
Related papers
- Hades: Homomorphic Augmented Decryption for Efficient Symbol-comparison -- A Database's Perspective [1.3824176915623292]
This paper introduces HADES, a novel cryptographic framework that enables efficient and secure comparisons on encrypted data.
Based on the Ring Learning with Errors (RLWE) problem, HADES provides CPA-security and incorporates perturbation-aware encryption to mitigate frequency-analysis attacks.
arXiv Detail & Related papers (2024-12-28T02:47:14Z)
- Confidential Prompting: Protecting User Prompts from Cloud LLM Providers [0.688204255655161]
Our work tackles the challenge of securing user inputs in cloud-hosted large language model (LLM) serving.
We introduce secure multi-party decoding (SMD), which leverages confidential computing to confine user prompts to a trusted execution environment.
We demonstrate that our approach preserves both prompt confidentiality and LLM serving efficiency.
arXiv Detail & Related papers (2024-09-27T20:32:42Z)
- CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 Large Language Models, achieving state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
- dabih -- encrypted data storage and sharing platform [0.0]
dabih is an open-source web application designed to facilitate user-friendly encrypted data management.
Its approach to data security involves a two-stage envelope encryption process.
The private key necessary for decrypting the data remains exclusively on the owner's device.
arXiv Detail & Related papers (2024-01-16T12:57:35Z)
- Simple client-side encryption of personal information with Web Assembly [0.0]
A simple method is proposed to encrypt the data on the client side, using Web Assembly.
The method has been developed for a semantic medical database, and allows accessing personal data using an additional password.
arXiv Detail & Related papers (2023-12-29T17:10:57Z)
- Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [63.91178922306669]
We introduce Silent Guardian, a text protection mechanism against large language models (LLMs).
By carefully modifying the text to be protected, TPE can induce LLMs to first sample the end token, thus directly terminating the interaction.
We show that SG can effectively protect the target text under various configurations and achieve almost 100% protection success rate in some cases.
arXiv Detail & Related papers (2023-12-15T10:30:36Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed almost 100% of the time in bypassing the safety alignment of GPT-4 in several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
- Privacy Implications of Retrieval-Based Language Models [26.87950501433784]
We present the first study of privacy risks in retrieval-based LMs, particularly $k$NN-LMs.
We find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
arXiv Detail & Related papers (2023-05-24T08:37:27Z)
- Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces.
Our results highlight that the agent is still capable of learning in small state spaces even in the presence of non-deterministic encryption, but performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences.