Related papers: Ghost Sentence: A Tool for Everyday Users to Copyright Data from Large Language Models

URL: http://arxiv.org/abs/2403.15740v1
Date: Sat, 23 Mar 2024 06:36:32 GMT
Title: Ghost Sentence: A Tool for Everyday Users to Copyright Data from Large Language Models
Authors: Shuai Zhao, Linchao Zhu, Ruijie Quan, Yi Yang
Abstract summary: Web user data plays a central role in the ecosystem of pre-trained large language models (LLMs). In this work, we suggest that users repeatedly insert personal passphrases into their documents. Once they are identified in the generated content of LLMs, users can be sure that their data is used for training.
Score: 55.321010757641524
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Web user data plays a central role in the ecosystem of pre-trained large language models (LLMs) and their fine-tuned variants. Billions of data are crawled from the web and fed to LLMs. How can everyday web users confirm if LLMs misuse their data without permission? In this work, we suggest that users repeatedly insert personal passphrases into their documents, enabling LLMs to memorize them. Once these concealed passphrases in user documents, referred to as ghost sentences, are identified in the generated content of LLMs, users can be sure that their data is used for training. To explore the effectiveness and usage of this copyrighting tool, we define the user training data identification task with ghost sentences. Multiple datasets from various sources at different scales are created and tested with LLMs of different sizes. For evaluation, we introduce a last k words verification manner along with two metrics: document and user identification accuracy. In the specific case of instruction tuning of a 3B LLaMA model, 11 out of 16 users with ghost sentences identify their data within the generation content. These 16 users contribute 383 examples to ~1.8M training documents. For continuing pre-training of a 1.1B TinyLlama model, 61 out of 64 users with ghost sentences identify their data within the LLM output. These 64 users contribute 1156 examples to ~10M training documents.
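Illustration: the abstract names a last k words verification and a user-level identification metric without spelling out the procedure. The Python sketch below shows one plausible reading, not the authors' implementation. It assumes the model is prompted with a passphrase minus its final k words, and that a document counts as identified when the generation begins with exactly those k words. The function names and the sample passphrase are hypothetical.

```python
from typing import List, Tuple


def split_passphrase(passphrase: str, k: int) -> Tuple[str, str]:
    """Split a passphrase into a prompt prefix and its last k words."""
    words = passphrase.split()
    if not 0 < k < len(words):
        raise ValueError("k must be between 1 and len(words) - 1")
    return " ".join(words[:-k]), " ".join(words[-k:])


def last_k_match(passphrase: str, generated: str, k: int) -> bool:
    """Assumed check: the generation starts with the last k words."""
    _, suffix = split_passphrase(passphrase, k)
    return generated.split()[:k] == suffix.split()


def user_identified(passphrase: str, generations: List[str], k: int) -> bool:
    """Assumed user-level rule: any one matching generation suffices."""
    return any(last_k_match(passphrase, g, k) for g in generations)


# Hypothetical passphrase and hypothetical model outputs.
ghost = "correct horse battery staple lantern"
prompt, expected = split_passphrase(ghost, k=2)
outputs = ["staple lamp is on", "staple lantern and more text"]
found = user_identified(ghost, outputs, k=2)
```

Under these assumptions, document identification accuracy would be the fraction of generations for which `last_k_match` holds, and user identification accuracy the fraction of users for whom `user_identified` holds.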

Related papers

Hallucination Detection with Small Language Models [1.9181612035055007]
This paper proposes a framework that integrates multiple small language models to verify responses generated by large language models. The results demonstrate a 10% improvement in F1 scores for detecting correct responses compared to hallucinations.
arXiv Detail & Related papers (2025-06-24T02:19:26Z)
Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models [53.17363502535395]
Trustworthy language models should provide both correct and verifiable answers. Current systems insert citations by querying an external retriever at inference time. We propose Active Indexing, which continually pretrains on synthetic QA pairs.
arXiv Detail & Related papers (2025-06-21T04:48:05Z)
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors [65.27124213266491]
We propose Contrastive Paraphrase Attack (CoPA), a training-free method that effectively deceives text detectors. CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by large language models. Our theoretical analysis suggests the superiority of the proposed attack.
arXiv Detail & Related papers (2025-05-21T10:08:39Z)
ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability [62.285407189502216]
Detecting texts generated by Large Language Models (LLMs) could cause grave mistakes due to incorrect decisions. We introduce ExaGPT, an interpretable detection approach grounded in the human decision-making process. We show that ExaGPT massively outperforms prior powerful detectors by up to +40.9 points of accuracy at a false positive rate of 1%.
arXiv Detail & Related papers (2025-02-17T01:15:07Z)
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning [44.84219266082269]
Large Language Models (LLMs) excel at reasoning and planning when trained on chain-of-thought (CoT) data. We propose a hybrid representation of the reasoning process, where we partially abstract away the initial reasoning steps using latent discrete tokens.
arXiv Detail & Related papers (2025-02-05T15:33:00Z)
AIDBench: A benchmark for evaluating the authorship identification capability of large language models [14.866356328321126]
We focus on a specific privacy risk where large language models (LLMs) may help identify the authorship of anonymous texts. We present AIDBench, a new benchmark that incorporates several author identification datasets, including emails, blogs, reviews, articles, and research papers. Our experiments with AIDBench demonstrate that LLMs can correctly guess authorship at rates well above random chance, revealing new privacy risks posed by these powerful models.
arXiv Detail & Related papers (2024-11-20T11:41:08Z)
A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document. Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative. Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z)
Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs [0.41436032949434404]
We develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets. We show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue.
arXiv Detail & Related papers (2024-08-19T07:14:15Z)
DE-COP: Detecting Copyrighted Content in Language Models Training Data [24.15936677068714]
We propose DE-COP, a method to determine whether a piece of copyrighted content was included in training. We construct BookTection, a benchmark with excerpts from 165 books published prior and subsequent to a model's training cutoff. Experiments show that DE-COP surpasses the prior best method by 9.6% in detection performance.
arXiv Detail & Related papers (2024-02-15T12:17:15Z)
AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising [4.924903495092775]
Large language models (LLMs) create text that closely mimics human writing, which can lead to potential misuse. We present AuthentiGPT, an efficient classifier that distinguishes between machine-generated and human-written texts. With a 0.918 AUROC score on a domain-specific dataset, AuthentiGPT demonstrates its effectiveness over other commercial algorithms.
arXiv Detail & Related papers (2023-11-13T19:36:54Z)
SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs). We then propose Sequence X (Check) GPT, a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
arXiv Detail & Related papers (2023-10-13T07:18:53Z)
Detecting Language Model Attacks with Perplexity [0.0]
A novel hack involving Large Language Models (LLMs) has emerged, exploiting adversarial suffixes to deceive models into generating perilous responses. A Light-GBM trained on perplexity and token length resolved the false positives and correctly detected most adversarial attacks in the test set.
arXiv Detail & Related papers (2023-08-27T15:20:06Z)
Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models [7.782346535009883]
This work introduces approaches to assessing phrase breaks in ESL learners' speech using pre-trained language models (PLMs) and large language models (LLMs).
arXiv Detail & Related papers (2023-06-08T07:10:39Z)
MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection. We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs. Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response. We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English. Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models [55.60306377044225]
"SelfCheckGPT" is a simple sampling-based approach to fact-check the responses of black-box models. We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset.
arXiv Detail & Related papers (2023-03-15T19:31:21Z)
Contextual Multi-View Query Learning for Short Text Classification in User-Generated Data [6.052423212814052]
COCOBA employs the context of user postings to construct two views. It then uses the distribution of the representations in each view to detect the regions that are assigned to the opposite classes. Our model also employs a query-by-committee model to address the usually noisy language of user postings.
arXiv Detail & Related papers (2021-12-05T16:17:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.