EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- URL: http://arxiv.org/abs/2402.05868v2
- Date: Mon, 12 Feb 2024 16:26:14 GMT
- Title: EmojiCrypt: Prompt Encryption for Secure Communication with Large
Language Models
- Authors: Guo Lin, Wenyue Hua, Yongfeng Zhang
- Abstract summary: Cloud-based large language models (LLMs) pose substantial risks of data breaches and unauthorized access to sensitive information.
This paper proposes a simple yet effective mechanism, EmojiCrypt, to protect user privacy.
- Score: 41.090214475309516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud-based large language models (LLMs) such as ChatGPT have increasingly
become integral to daily operations, serving as vital tools across various
applications. While these models offer substantial benefits in terms of
accessibility and functionality, they also introduce significant privacy
concerns: the transmission and storage of user data in cloud infrastructures
pose substantial risks of data breaches and unauthorized access to sensitive
information; even if the transmission and storage of data are encrypted, the LLM
service provider itself still knows the real contents of the data, preventing
individuals or entities from confidently using such LLM services. To address
these concerns, this paper proposes EmojiCrypt, a simple yet effective mechanism
to protect user privacy. It uses emojis to encrypt user inputs before sending
them to the LLM, rendering them indecipherable to human or LLM examination while
retaining the original intent of the prompt, so that the model's performance
remains unaffected. We conduct experiments on three tasks: personalized
recommendation, sentiment analysis, and tabular data analysis. Experimental
results reveal that EmojiCrypt can encrypt personal information within prompts
in a manner that not only prevents humans or the LLM itself from discerning the
sensitive data, but also maintains or even improves task accuracy without
further tuning, achieving comparable or better accuracy than directly prompting
the LLM without prompt encryption. These results highlight the practicality of
adopting encryption
measures that safeguard user privacy without compromising the functional
integrity and performance of LLMs. Code and dataset are available at
https://github.com/agiresearch/EmojiCrypt.
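The flow described above can be made concrete with a minimal sketch, assuming an OpenAI-style chat API: a first call re-expresses the sensitive part of a prompt as an emoji/abbreviation sequence, and a second call runs the actual task on the encrypted prompt. The model name, prompts, and helper functions (emoji_encrypt, query_with_encryption) are illustrative placeholders, not the authors' implementation; see https://github.com/agiresearch/EmojiCrypt for the official code. In particular, the sketch does not specify where the encryption step itself runs, which matters for the threat model if the same cloud provider performs it.
```python
# Minimal sketch of the encrypt-then-prompt flow; not the authors' implementation.
# Assumes an OpenAI-style chat API; the model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-3.5-turbo"  # placeholder model name


def emoji_encrypt(sensitive_text: str) -> str:
    """Re-express sensitive text as a compact emoji/abbreviation sequence
    that preserves intent but avoids the original wording."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": ("Convert the user's text into a short sequence of emojis "
                         "and abbreviations that preserves its meaning but does not "
                         "repeat the original words. Output only the sequence.")},
            {"role": "user", "content": sensitive_text},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()


def query_with_encryption(task_instruction: str, sensitive_text: str) -> str:
    """Send the task prompt to the cloud LLM with the sensitive part encrypted."""
    encrypted = emoji_encrypt(sensitive_text)
    prompt = f"{task_instruction}\n\nUser information (encoded): {encrypted}"
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # Example: sentiment analysis without sending the raw review text in the task prompt.
    print(query_with_encryption(
        "Classify the sentiment of the encoded review as positive or negative.",
        "The hotel room was dirty and the staff ignored my complaints.",
    ))
```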
Related papers
- A General Pseudonymization Framework for Cloud-Based LLMs: Replacing Privacy Information in Controlled Text Generation [0.6699777383856287]
ChatGPT services leverage cloud-based large language models (LLMs).
Privacy concerns arise as prompts are transmitted and processed by the model providers.
We propose a general pseudonymization framework applicable to cloud-based LLMs.
arXiv Detail & Related papers (2025-02-21T06:15:53Z)
- PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models [10.050972891318324]
We propose a privacy-preservation pipeline for protecting sensitive information during interactions between users and large language models.
We construct SensitiveQA, the first privacy-focused open-ended question-answering dataset.
Our proposed solution employs a multi-stage strategy aimed at preemptively securing user information while simultaneously preserving the response quality of cloud-based LLMs.
arXiv Detail & Related papers (2025-02-19T09:17:07Z)
- Confidential Prompting: Protecting User Prompts from Cloud LLM Providers [0.688204255655161]
We introduce Secure Multi-party Decoding (SMD) to confine user prompts to a trusted execution environment.
We also introduce a novel cryptographic method, Prompt Obfuscation (PO), to ensure robustness against reconstruction attacks.
Our solution can enable privacy-preserving cloud LLM services that handle sensitive prompts, such as clinical records, financial data, and personal information.
arXiv Detail & Related papers (2024-09-27T20:32:42Z)
- Ciphertext-Only Attack on a Secure $k$-NN Computation on Cloud [0.0]
Encryption can prevent unauthorized access, data breaches, and the resultant financial loss, reputation damage, and legal issues.
Sanyashi et al. proposed an encryption scheme to facilitate privacy-preserving $k$-NN computation on the cloud.
We give an efficient algorithm and empirically demonstrate that their encryption scheme is vulnerable to the ciphertext-only attack (COA).
arXiv Detail & Related papers (2024-03-14T03:53:01Z)
- Differentially Private Synthetic Data via Foundation Model APIs 2: Text [56.13240830670327]
A lot of high-quality text data generated in the real world is private and cannot be shared or used freely due to privacy concerns.
We propose an augmented PE algorithm, named Aug-PE, that applies to the complex setting of text.
Our results demonstrate that Aug-PE produces DP synthetic text that yields competitive utility with the SOTA DP finetuning baselines.
arXiv Detail & Related papers (2024-03-04T05:57:50Z)
- CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 Large Language Models, achieving state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
- dabih -- encrypted data storage and sharing platform [0.0]
dabih is an open-source web application designed to facilitate user-friendly encrypted data management.
Its approach to data security involves a two-stage envelope encryption process.
The private key necessary for decrypting the data remains exclusively on the owner's device.
arXiv Detail & Related papers (2024-01-16T12:57:35Z)
- ConfusionPrompt: Practical Private Inference for Online Large Language Models [3.8134804426693094]
State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers.
We introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by decomposing the original prompt into smaller sub-prompts.
We show that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques.
arXiv Detail & Related papers (2023-12-30T01:26:42Z)
- Simple client-side encryption of personal information with Web Assembly [0.0]
A simple method is proposed to encrypt the data on the client side, using Web Assembly.
The method has been developed for a semantic medical database, and allows accessing personal data using an additional password.
arXiv Detail & Related papers (2023-12-29T17:10:57Z)
- Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [63.91178922306669]
We introduce Silent Guardian, a text protection mechanism against large language models (LLMs).
By carefully modifying the text to be protected, TPE can induce LLMs to first sample the end token, thus directly terminating the interaction.
We show that SG can effectively protect the target text under various configurations and achieve almost 100% protection success rate in some cases.
arXiv Detail & Related papers (2023-12-15T10:30:36Z)
- DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer [57.04801796205638]
Large Language Models (LLMs) have emerged as dominant tools for various tasks.
However, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information.
We present Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge.
arXiv Detail & Related papers (2023-11-27T02:01:10Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models, GPT-4 and ChatGPT, reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection [6.201275002179716]
We introduce the HaS framework, where "H(ide)" and "S(eek)" represent its two core processes: hiding private entities for anonymization and seeking private entities for de-anonymization.
To quantitatively assess HaS's privacy protection performance, we propose both black-box and white-box adversarial models.
arXiv Detail & Related papers (2023-09-06T14:54:11Z)
- GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed almost 100% of the time in bypassing the safety alignment of GPT-4 in several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
- Privacy Implications of Retrieval-Based Language Models [26.87950501433784]
We present the first study of privacy risks in retrieval-based LMs, particularly $k$NN-LMs.
We find that $k$NN-LMs are more susceptible to leaking private information from their private datastore than parametric models.
arXiv Detail & Related papers (2023-05-24T08:37:27Z)
- THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption [112.02441503951297]
Privacy-preserving inference of transformer models is on the demand of cloud service users.
We introduce THE-X, an approximation approach for transformers, which enables privacy-preserving inference of pre-trained models.
arXiv Detail & Related papers (2022-06-01T03:49:18Z)
- Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces.
Our results highlight that the agent is still capable of learning in small state spaces even in the presence of non-deterministic encryption, but performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.