Multi-step Jailbreaking Privacy Attacks on ChatGPT
- URL: http://arxiv.org/abs/2304.05197v3
- Date: Wed, 1 Nov 2023 05:14:01 GMT
- Title: Multi-step Jailbreaking Privacy Attacks on ChatGPT
- Authors: Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng,
Yangqiu Song
- Abstract summary: We study the privacy threats from OpenAI's ChatGPT and the New Bing enhanced by ChatGPT.
We conduct extensive experiments to support our claims and discuss LLMs' privacy implications.
- Score: 47.10284364632862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid progress of large language models (LLMs), many downstream NLP
tasks can be well solved given appropriate prompts. Though model developers and
researchers work hard on dialog safety to avoid generating harmful content from
LLMs, it is still challenging to steer AI-generated content (AIGC) for the
human good. As powerful LLMs are devouring existing text data from various
domains (e.g., GPT-3 is trained on 45TB texts), it is natural to doubt whether
the private information is included in the training data and what privacy
threats can these LLMs and their downstream applications bring. In this paper,
we study the privacy threats from OpenAI's ChatGPT and the New Bing enhanced by
ChatGPT and show that application-integrated LLMs may cause new privacy
threats. To this end, we conduct extensive experiments to support our claims
and discuss LLMs' privacy implications.
Related papers
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - LLM-PBE: Assessing Data Privacy in Large Language Models [111.58198436835036]
Large Language Models (LLMs) have become integral to numerous domains, significantly advancing applications in data management, mining, and analysis.
Despite the critical nature of this issue, there has been no existing literature to offer a comprehensive assessment of data privacy risks in LLMs.
Our paper introduces LLM-PBE, a toolkit crafted specifically for the systematic evaluation of data privacy risks in LLMs.
arXiv Detail & Related papers (2024-08-23T01:37:29Z) - On Protecting the Data Privacy of Large Language Models (LLMs): A Survey [35.48984524483533]
Large language models (LLMs) are complex artificial intelligence systems capable of understanding, generating and translating human language.
LLMs process and generate large amounts of data, which may threaten data privacy.
arXiv Detail & Related papers (2024-03-08T08:47:48Z) - The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented
Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems on leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z) - A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly [21.536079040559517]
Large Language Models (LLMs) have revolutionized natural language understanding and generation.
This paper explores the intersection of LLMs with security and privacy.
arXiv Detail & Related papers (2023-12-04T16:25:18Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time, respectively.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities.
We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z) - Beyond Memorization: Violating Privacy Via Inference with Large Language Models [2.9373912230684565]
We present the first comprehensive study on the capabilities of pretrained language models to infer personal attributes from text.
Our findings highlight that current LLMs can infer personal data at a previously unattainable scale.
arXiv Detail & Related papers (2023-10-11T08:32:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.