Protecting Copyrighted Material with Unique Identifiers in Large Language Model Training
- URL: http://arxiv.org/abs/2403.15740v2
- Date: Mon, 12 Aug 2024 08:21:32 GMT
- Title: Protecting Copyrighted Material with Unique Identifiers in Large Language Model Training
- Authors: Shuai Zhao, Linchao Zhu, Ruijie Quan, Yi Yang,
- Abstract summary: A major public concern regarding the training of large language models (LLMs) is whether they abusing copyrighted online text.
Previous membership inference methods may be misled by similar examples in vast amounts of training data.
We propose an alternative textitinsert-and-detection methodology, advocating that web users and content platforms employ textbftextitunique identifiers.
- Score: 55.321010757641524
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A major public concern regarding the training of large language models (LLMs) is whether they abusing copyrighted online text. Previous membership inference methods may be misled by similar examples in vast amounts of training data. Additionally, these methods are often too complex for general users to understand and use, making them centralized, lacking transparency, and trustworthiness. To address these issues, we propose an alternative \textit{insert-and-detection} methodology, advocating that web users and content platforms employ \textbf{\textit{unique identifiers}} for reliable and independent membership inference. Users and platforms can create their own identifiers, embed them in copyrighted text, and independently detect them in future LLMs. As an initial demonstration, we introduce \textit{ghost sentences}, a primitive form of unique identifiers, consisting primarily of passphrases made up of random words. By embedding one ghost sentences in a few copyrighted texts, users can detect its membership using a perplexity test and a \textit{user-friendly} last-$k$ words test. The perplexity test is based on the fact that LLMs trained on natural language should exhibit high perplexity when encountering unnatural passphrases. As the repetition increases, users can leverage the verbatim memorization ability of LLMs to perform a last-$k$ words test by chatting with LLMs without writing any code. Both tests offer rigorous statistical guarantees for membership inference. For LLaMA-13B, a perplexity test on 30 ghost sentences with an average of 7 repetitions in 148K examples yields a 0.891 ROC AUC. For the last-$k$ words test with OpenLLaMA-3B, 11 out of 16 users, with an average of 24 examples each, successfully identify their data from 1.8M examples.
Related papers
- A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document.
Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative.
Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z) - Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs [0.41436032949434404]
We develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets.
We show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue.
arXiv Detail & Related papers (2024-08-19T07:14:15Z) - DE-COP: Detecting Copyrighted Content in Language Models Training Data [24.15936677068714]
We propose DE-COP, a method to determine whether a piece of copyrighted content was included in training.
We construct BookTection, a benchmark with excerpts from 165 books published prior and subsequent to a model's training cutoff.
Experiments show that DE-COP surpasses the prior best method by 9.6% in detection performance.
arXiv Detail & Related papers (2024-02-15T12:17:15Z) - AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language
Models Denoising [4.924903495092775]
Large language models (LLMs) create text that closely mimics human writing, which can lead to potential misuse.
We present AuthentiGPT, an efficient classifier that distinguishes between machine-generated and human-written texts.
With a 0.918 AUROC score on a domain-specific dataset, AuthentiGPT demonstrates its effectiveness over other commercial algorithms.
arXiv Detail & Related papers (2023-11-13T19:36:54Z) - SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs)
We then propose textbfSequence textbfX (Check) textbfGPT, a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
arXiv Detail & Related papers (2023-10-13T07:18:53Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue
Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (textitCue-CoT) to provide a more personalized and engaging response.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate our proposed textitCue-CoT method outperforms standard prompting methods in terms of both textithelpfulness and textitacceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z) - SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for
Generative Large Language Models [55.60306377044225]
"SelfCheckGPT" is a simple sampling-based approach to fact-check the responses of black-box models.
We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset.
arXiv Detail & Related papers (2023-03-15T19:31:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.