Privacy Issues in Large Language Models: A Survey
- URL: http://arxiv.org/abs/2312.06717v4
- Date: Thu, 30 May 2024 19:26:05 GMT
- Title: Privacy Issues in Large Language Models: A Survey
- Authors: Seth Neel, Peter Chang
- Abstract summary: This is the first survey of the active area of AI research that focuses on privacy issues in Large Language Models (LLMs).
We focus on work that red-teams models to highlight privacy risks, attempts to build privacy into the training or inference process, and tries to mitigate copyright issues.
- Score: 2.707979363409351
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This is the first survey of the active area of AI research that focuses on privacy issues in Large Language Models (LLMs). Specifically, we focus on work that red-teams models to highlight privacy risks, attempts to build privacy into the training or inference process, enables efficient data deletion from trained models to comply with existing privacy regulations, and tries to mitigate copyright issues. Our focus is on summarizing technical research that develops algorithms, proves theorems, and runs empirical evaluations. While there is an extensive body of legal and policy work addressing these challenges from a different angle, that is not the focus of our survey. Nevertheless, these works, along with recent legal developments do inform how these technical problems are formalized, and so we discuss them briefly in Section 1. While we have made our best effort to include all the relevant work, due to the fast moving nature of this research we may have missed some recent work. If we have missed some of your work please contact us, as we will attempt to keep this survey relatively up to date. We are maintaining a repository with the list of papers covered in this survey and any relevant code that was publicly available at https://github.com/safr-ml-lab/survey-llm.
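To make the red-teaming thread of the survey concrete, here is a minimal, hedged sketch of a loss-based membership inference attack, one of the canonical privacy attacks in this literature. The model name and threshold below are illustrative assumptions, not details taken from the survey.

```python
# Sketch of a loss-based membership inference attack on a language model.
# Assumption: "gpt2" and the threshold are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_loss(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # HF shifts labels internally
    return out.loss.item()

def likely_training_member(text: str, threshold: float = 2.5) -> bool:
    # Intuition: memorized training texts tend to have unusually low loss.
    # The threshold must be calibrated on known non-member data; 2.5 is a
    # placeholder, not a value from the survey.
    return sequence_loss(text) < threshold
```

Stronger attacks covered in the survey calibrate this score against a reference model or perturbed copies of the text rather than using a fixed threshold.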
Related papers
- Model Inversion Attacks: A Survey of Approaches and Countermeasures [59.986922963781]
Recently, a new type of privacy attack, the model inversion attack (MIA), has emerged, which aims to extract sensitive features of private training data.
Despite their significance, there is a lack of systematic studies that provide a comprehensive overview of, and deeper insights into, MIAs.
This survey aims to summarize up-to-date MIA methods in both attacks and defenses.
arXiv Detail & Related papers (2024-11-15T08:09:28Z)
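As a companion to the model inversion entry above, here is a minimal, hedged sketch of the gradient-based variant of that attack: optimizing an input so a trained classifier assigns high confidence to a target class, thereby recovering class-typical (potentially private) features. `trained_model` and all hyperparameters are illustrative assumptions, not details from the paper.

```python
# Sketch of a gradient-based model inversion attack: recover a
# class-representative input from a trained classifier.
import torch
import torch.nn.functional as F

def invert_class(trained_model: torch.nn.Module, target_class: int,
                 input_shape=(1, 3, 32, 32), steps: int = 500, lr: float = 0.1):
    x = torch.zeros(input_shape, requires_grad=True)  # start from a blank input
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = trained_model(x)
        # Minimize the target class's NLL, plus a small L2 penalty that
        # keeps the reconstruction in a plausible input range.
        loss = F.cross_entropy(logits, torch.tensor([target_class]))
        loss = loss + 1e-4 * x.pow(2).sum()
        loss.backward()
        opt.step()
    return x.detach()  # an input the model considers typical of the class
```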
- Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice [32.550204238857724]
We propose a human-centric legal NLP pipeline, covering data sourcing, inference, and evaluation.
We release a dataset, LegalQA, with real and specific legal questions spanning from employment law to criminal law.
We show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval.
arXiv Detail & Related papers (2024-09-12T02:40:28Z)
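The LegalQA entry above rests on the retrieval-augmented generation (RAG) pattern; below is a minimal, hedged sketch of its retrieval half over a small corpus. The embedding model, corpus, and prompt format are illustrative assumptions.

```python
# Sketch of the retrieval step of RAG over a small citation corpus.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
corpus = ["Citation 1: ...", "Citation 2: ..."]  # e.g. the 850 train-set citations
corpus_emb = embedder.encode(corpus, convert_to_tensor=True)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Return the k corpus passages most similar to the question."""
    q_emb = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, corpus_emb, top_k=k)[0]
    return [corpus[h["corpus_id"]] for h in hits]

def build_prompt(question: str) -> str:
    # An LLM would be prompted with this string; generation is omitted here.
    context = "\n".join(retrieve(question))
    return f"Answer using only the context below.\n{context}\n\nQuestion: {question}"
```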
- Privacy Risks of General-Purpose AI Systems: A Foundation for Investigating Practitioner Perspectives [47.17703009473386]
Powerful AI models have led to impressive leaps in performance across a wide range of tasks.
Privacy concerns have led to a wealth of literature covering various privacy risks and vulnerabilities of AI models.
We conduct a systematic review of these survey papers to provide a concise and usable overview of privacy risks in GPAIS.
arXiv Detail & Related papers (2024-07-02T07:49:48Z)
- How the Future Works at SOUPS: Analyzing Future Work Statements and Their Impact on Usable Security and Privacy Research [9.307988641609834]
We reviewed all 27 papers from the 2019 SOUPS proceedings and analyzed their future work statements.
We find that most papers from the SOUPS 2019 proceedings include future work statements. However, they are often unspecific or ambiguous, and not always easy to find.
We conclude with recommendations for the usable security and privacy community to improve the utility of future work statements.
arXiv Detail & Related papers (2024-05-30T07:07:18Z)
- A Survey of Privacy-Preserving Model Explanations: Privacy Risks, Attacks, and Countermeasures [50.987594546912725]
Despite a growing corpus of research on AI privacy and explainability, little attention has been paid to privacy-preserving model explanations.
This article presents the first thorough survey about privacy attacks on model explanations and their countermeasures.
arXiv Detail & Related papers (2024-03-31T12:44:48Z)
- Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP [83.66405397421907]
We rethink the research paradigm of textual adversarial samples in security scenarios.
We first collect, process, and release Advbench, a collection of security datasets.
Next, we propose a simple rule-based method that can easily fulfill actual adversarial goals, simulating real-world attack methods.
arXiv Detail & Related papers (2022-10-19T15:53:36Z)
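To illustrate what a simple rule-based attack of the kind described in the Advbench entry might look like, here is a minimal, hedged sketch using homoglyph substitution; these particular rules are illustrative assumptions, not the paper's actual method.

```python
# Sketch of a rule-based textual perturbation: swap Latin letters for
# visually identical Cyrillic ones to evade keyword-matching filters
# while staying readable to humans. Illustrative rules only.
import random

HOMOGLYPHS = {"a": "а", "e": "е", "o": "о", "c": "с"}  # Latin -> Cyrillic lookalikes

def perturb(text: str, swap_prob: float = 0.3, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in HOMOGLYPHS and rng.random() < swap_prob:
            out.append(HOMOGLYPHS[ch])
        else:
            out.append(ch)
    return "".join(out)

print(perturb("account access code"))  # some letters silently replaced
```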
- Yes-Yes-Yes: Donation-based Peer Reviewing Data Collection for ACL Rolling Review and Beyond [58.71736531356398]
We present an in-depth discussion of peer reviewing data, outline the ethical and legal desiderata for peer reviewing data collection, and propose the first continuous, donation-based data collection workflow.
We report on the ongoing implementation of this workflow at the ACL Rolling Review and deliver the first insights obtained with the newly collected data.
arXiv Detail & Related papers (2022-01-27T11:02:43Z)
- Privacy in Open Search: A Review of Challenges and Solutions [0.6445605125467572]
Information retrieval (IR) is prone to privacy threats, such as attacks and unintended disclosures of documents and search history.
This work highlights and discusses open challenges for privacy in the recent IR literature, focusing on tasks that feature user-generated text data.
arXiv Detail & Related papers (2021-10-20T18:38:48Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies [77.79102359580702]
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
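For a sense of what evaluating a QA model on PolicyQA-style data involves, here is a minimal, hedged sketch using the Hugging Face `pipeline` API; the model choice and the toy policy snippet are illustrative assumptions, not drawn from the paper.

```python
# Sketch of extractive QA on a privacy-policy snippet, the task format
# PolicyQA uses. Model and example text are illustrative placeholders.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

policy = ("We collect your email address and browsing history to personalize "
          "advertisements. You may opt out at any time via account settings.")
result = qa(question="What data does the service collect?", context=policy)
print(result["answer"], result["score"])  # answer is a span from the policy
```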
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.