EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- URL: http://arxiv.org/abs/2402.05868v2
- Date: Mon, 12 Feb 2024 16:26:14 GMT
- Title: EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- Authors: Guo Lin, Wenyue Hua, Yongfeng Zhang
- Abstract summary: Cloud-based large language models (LLMs) pose substantial risks of data breaches and unauthorized access to sensitive information.
This paper proposes a simple yet effective mechanism EmojiCrypt to protect user privacy.
- Score: 41.090214475309516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud-based large language models (LLMs) such as ChatGPT have increasingly
become integral to daily operations, serving as vital tools across various
applications. While these models offer substantial benefits in terms of
accessibility and functionality, they also introduce significant privacy
concerns: the transmission and storage of user data in cloud infrastructures
pose substantial risks of data breaches and unauthorized access to sensitive
information; even if the transmission and storage of data are encrypted, the LLM
service provider itself still knows the real contents of the data, preventing
individuals or entities from confidently using such LLM services. To address
these concerns, this paper proposes a simple yet effective mechanism EmojiCrypt
to protect user privacy. It uses Emoji to encrypt user inputs before
sending them to the LLM, rendering them indecipherable to human or
LLM examination while retaining the original intent of the prompt, thus
ensuring the model's performance remains unaffected. We conduct experiments on
three tasks: personalized recommendation, sentiment analysis, and tabular data
analysis. Experimental results reveal that EmojiCrypt can encrypt personal
information within prompts in a manner that not only prevents the
discernment of sensitive data by humans or the LLM itself, but also maintains or
even improves precision without further tuning, achieving comparable or
even better task accuracy than directly prompting the LLM without prompt
encryption. These results highlight the practicality of adopting encryption
measures that safeguard user privacy without compromising the functional
integrity and performance of LLMs. Code and dataset are available at
https://github.com/agiresearch/EmojiCrypt.
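As a rough illustration of this idea (a minimal sketch, not the authors' implementation), the encryption step can itself be delegated to a language model that rewrites a sensitive prompt into emojis and abstract symbols, and only that rewritten prompt is sent to the cloud service for the actual task. The snippet below assumes the OpenAI Python SDK; the model name, instruction wording, and helper functions `emoji_encrypt` / `run_task` are illustrative, and in a real deployment the encryption step would need to run inside a trusted boundary (e.g. a local model) so the raw text never reaches the untrusted provider.

```python
# Minimal sketch of the EmojiCrypt idea (illustrative, not the authors' code):
# a sensitive prompt is re-expressed as emojis / abstract symbols, and only the
# encrypted form is forwarded to the cloud LLM for the downstream task.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ENCRYPT_INSTRUCTION = (
    "Rewrite the following text using only emojis, abbreviations, and abstract "
    "symbols so that the original wording cannot be recovered by a reader, "
    "while preserving its overall meaning:\n\n"
)

def emoji_encrypt(sensitive_text: str, model: str = "gpt-4o-mini") -> str:
    """Return a non-natural-language (emoji/symbol) rendering of the input.

    In practice this step should run on a trusted or local model so the raw
    text never leaves the user's privacy boundary.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": ENCRYPT_INSTRUCTION + sensitive_text}],
    )
    return response.choices[0].message.content

def run_task(encrypted_text: str, task_instruction: str, model: str = "gpt-4o-mini") -> str:
    """Send only the encrypted text (never the raw prompt) to the cloud LLM."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{task_instruction}\n\n{encrypted_text}"}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    raw = "User 4711 watched three romantic comedies last week and rated them 5 stars."
    encrypted = emoji_encrypt(raw)
    print(run_task(encrypted, "Suggest one movie genre for this user."))
```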
Related papers
- Confidential Prompting: Protecting User Prompts from Cloud LLM Providers [0.688204255655161]
We introduce Secure Multi-party Decoding (SMD) to confine user prompts to a trusted execution environment.
We also introduce a novel cryptographic method, Prompt Obfuscation (PO) to ensure robustness against reconstruction attacks.
Our solution can enable privacy-preserving cloud LLM services that handle sensitive prompts, such as clinical records, financial data, and personal information.
arXiv Detail & Related papers (2024-09-27T20:32:42Z) - Ciphertext-Only Attack on a Secure $k$-NN Computation on Cloud [0.0]
Encryption can prevent unauthorized access, data breaches, and the resultant financial loss, reputation damage, and legal issues.
Sanyashi et al. proposed an encryption scheme to facilitate privacy-preserving $k$-NN computation on the cloud.
We give an efficient algorithm and empirically demonstrate that their encryption scheme is vulnerable to the ciphertext-only attack (COA).
arXiv Detail & Related papers (2024-03-14T03:53:01Z) - CodeChameleon: Personalized Encryption Framework for Jailbreaking Large
Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 Large Language Models, achieving a state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z) - dabih -- encrypted data storage and sharing platform [0.0]
dabih is an open-source web application designed to facilitate user-friendly encrypted data management.
Its approach to data security involves a two-stage envelope encryption process.
The private key necessary for decrypting the data remains exclusively on the owner's device (a generic sketch of this envelope-encryption pattern appears after this list).
arXiv Detail & Related papers (2024-01-16T12:57:35Z) - Simple client-side encryption of personal information with Web Assembly [0.0]
A simple method is proposed to encrypt the data on the client side, using Web Assembly.
The method has been developed for a semantic medical database, and allows accessing personal data using an additional password.
arXiv Detail & Related papers (2023-12-29T17:10:57Z) - Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [63.91178922306669]
We introduce Silent Guardian (SG), a text protection mechanism against large language models (LLMs).
By carefully modifying the text to be protected, the resulting truncation protection example (TPE) can induce LLMs to first sample the end token, thus directly terminating the interaction.
We show that SG can effectively protect the target text under various configurations and achieve almost 100% protection success rate in some cases.
arXiv Detail & Related papers (2023-12-15T10:30:36Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, such as GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed in bypassing the safety alignment of GPT-4 almost 100% of the time in several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z) - Privacy Implications of Retrieval-Based Language Models [26.87950501433784]
We present the first study of privacy risks in retrieval-based LMs, particularly $k$NN-LMs.
We find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
arXiv Detail & Related papers (2023-05-24T08:37:27Z) - Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces.
Our results highlight that the agent is still capable of learning in small state spaces even in the presence of non-deterministic encryption, but that performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z)
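For the dabih entry above, the two-stage envelope encryption it describes is a standard pattern: the data is encrypted with a fresh symmetric key, and that key is in turn encrypted with the owner's public key, so only the holder of the matching private key (which stays on the owner's device) can recover the data. The following is a generic sketch of that pattern using the Python `cryptography` package, not dabih's actual implementation; all names and parameters are illustrative.

```python
# Generic two-stage envelope-encryption sketch (illustrative, not dabih's code):
# stage 1 encrypts the data with a fresh symmetric key, stage 2 wraps that key
# with the owner's RSA public key; only the owner's private key can unwrap it.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The owner's key pair; in a dabih-like setup the private key never leaves
# the owner's device.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

def envelope_encrypt(data: bytes, owner_public_key) -> tuple[bytes, bytes]:
    """Encrypt data with a fresh symmetric key, then wrap that key."""
    data_key = Fernet.generate_key()
    ciphertext = Fernet(data_key).encrypt(data)
    wrapped_key = owner_public_key.encrypt(data_key, OAEP)
    return ciphertext, wrapped_key

def envelope_decrypt(ciphertext: bytes, wrapped_key: bytes, owner_private_key) -> bytes:
    """Unwrap the data key with the private key, then decrypt the data."""
    data_key = owner_private_key.decrypt(wrapped_key, OAEP)
    return Fernet(data_key).decrypt(ciphertext)

ct, wk = envelope_encrypt(b"patient record 0815", public_key)
assert envelope_decrypt(ct, wk, private_key) == b"patient record 0815"
```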