Many-Shot Regurgitation (MSR) Prompting
- URL: http://arxiv.org/abs/2405.08134v1
- Date: Mon, 13 May 2024 19:22:40 GMT
- Title: Many-Shot Regurgitation (MSR) Prompting
- Authors: Shashank Sonkar, Richard G. Baraniuk
- Abstract summary: We introduce Many-Shot Regurgitation (MSR) prompting, a new black-box membership inference attack framework for examining verbatim content reproduction in large language models (LLMs).
MSR prompting involves dividing the input text into multiple segments and creating a single prompt that includes a series of faux conversation rounds between a user and a language model to elicit verbatim regurgitation.
We apply MSR prompting to diverse text sources, including Wikipedia articles and open educational resources (OER) textbooks, which provide high-quality, factual content.
- Score: 26.9991760335222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Many-Shot Regurgitation (MSR) prompting, a new black-box membership inference attack framework for examining verbatim content reproduction in large language models (LLMs). MSR prompting involves dividing the input text into multiple segments and creating a single prompt that includes a series of faux conversation rounds between a user and a language model to elicit verbatim regurgitation. We apply MSR prompting to diverse text sources, including Wikipedia articles and open educational resources (OER) textbooks, which provide high-quality, factual content and are continuously updated over time. For each source, we curate two dataset types: one that LLMs were likely exposed to during training ($D_{\rm pre}$) and another consisting of documents published after the models' training cutoff dates ($D_{\rm post}$). To quantify the occurrence of verbatim matches, we employ the Longest Common Substring algorithm and count the frequency of matches at different length thresholds. We then use statistical measures such as Cliff's delta, Kolmogorov-Smirnov (KS) distance, and Kruskal-Wallis H test to determine whether the distribution of verbatim matches differs significantly between $D_{\rm pre}$ and $D_{\rm post}$. Our findings reveal a striking difference in the distribution of verbatim matches between $D_{\rm pre}$ and $D_{\rm post}$, with the frequency of verbatim reproduction being significantly higher when LLMs (e.g. GPT models and LLaMAs) are prompted with text from datasets they were likely trained on. For instance, when using GPT-3.5 on Wikipedia articles, we observe a substantial effect size (Cliff's delta $= -0.984$) and a large KS distance ($0.875$) between the distributions of $D_{\rm pre}$ and $D_{\rm post}$. Our results provide compelling evidence that LLMs are more prone to reproducing verbatim content when the input text is likely sourced from their training data.
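To make the pipeline concrete, below is a minimal Python sketch of the procedure the abstract outlines: split a document into segments, frame them as faux user/assistant conversation rounds, and score the model's continuation against a withheld segment using the Longest Common Substring length, then compare the match distributions for $D_{\rm pre}$ and $D_{\rm post}$ with Cliff's delta. This is an illustration under stated assumptions, not the authors' released code: the segment count, the chat framing, and the placeholder `query_llm` call are hypothetical. The KS distance and Kruskal-Wallis H test mentioned in the abstract can be computed with `scipy.stats.ks_2samp` and `scipy.stats.kruskal`.

```python
# Minimal sketch of an MSR-style pipeline (illustrative assumptions, not the authors' code).
from typing import Dict, List


def split_into_segments(text: str, n_segments: int) -> List[str]:
    """Split a document into roughly equal contiguous segments (tail remainder dropped)."""
    step = max(1, len(text) // n_segments)
    return [text[i:i + step] for i in range(0, len(text), step)][:n_segments]


def build_msr_prompt(segments: List[str]) -> List[Dict[str, str]]:
    """Frame all but the last segment as faux user/assistant rounds.

    The final segment is withheld and later used as the reference against
    which the model's continuation is scored. The exact role assignment is
    an assumption made for this sketch.
    """
    roles = ("user", "assistant")
    return [{"role": roles[i % 2], "content": seg} for i, seg in enumerate(segments[:-1])]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def cliffs_delta(xs: List[int], ys: List[int]) -> float:
    """Cliff's delta effect size: P(x > y) - P(x < y) over all pairs."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))


# Hypothetical usage, where `query_llm` stands in for any chat-completion API:
# segments = split_into_segments(doc, 8)
# completion = query_llm(build_msr_prompt(segments))
# match_len = longest_common_substring(segments[-1], completion)
# Collect match_len over D_pre and D_post documents, then:
# delta = cliffs_delta(lcs_pre, lcs_post)  # large |delta| indicates the distributions differ
```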
Related papers
- Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs).
We find that fine-tuning existing text embedding models on LLM-generated texts yields excellent classification accuracy.
We leverage LLM as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
arXiv Detail & Related papers (2025-02-17T18:59:02Z)
- Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities [13.657259851747126]
We show that type I and type II errors for our tests decrease exponentially in the text length.
We show that if the string is generated by $A$, the log-perplexity of the string under $A$ converges to the average entropy of the string under $A$, except with an exponentially small probability in string length.
arXiv Detail & Related papers (2025-01-04T23:51:43Z)
- Reasoning Robustness of LLMs to Adversarial Typographical Errors [49.99118660264703]
Large Language Models (LLMs) have demonstrated impressive capabilities in reasoning using Chain-of-Thought (CoT) prompting.
We study the reasoning robustness of LLMs to typographical errors, which can naturally occur in users' queries.
We design an Adversarial Typo Attack ($\texttt{ATA}$) algorithm that iteratively samples typos for words that are important to the query and selects the edit that is most likely to succeed in attacking.
arXiv Detail & Related papers (2024-11-08T05:54:05Z)
- LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models [73.13933847198395]
We propose a training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding.
The proposed LLM$\times$MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output.
arXiv Detail & Related papers (2024-10-12T03:13:44Z)
- Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG [57.14250086701313]
We investigate the extent to which modern LMs generate $n$-grams from their training data.
We develop Rusty-DAWG, a novel search tool inspired by indexing of genomic data.
arXiv Detail & Related papers (2024-06-18T21:31:19Z)
- Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study [44.39031420687302]
Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks.
We try to understand this by designing a benchmark to evaluate the structural understanding capabilities of LLMs.
We propose $\textit{self-augmentation}$ for effective structural prompting, such as critical value / range identification.
arXiv Detail & Related papers (2023-05-22T14:23:46Z)
- You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM [65.74934004876914]
Retrieval-enhanced language models (LMs) condition their predictions on text retrieved from large external datastores.
One such approach, the $k$NN-LM, interpolates any existing LM's predictions with the output of a $k$-nearest neighbors model.
We empirically measure the effectiveness of our approach on two English language modeling datasets.
arXiv Detail & Related papers (2022-10-28T02:57:40Z)
- Reward-Mixing MDPs with a Few Latent Contexts are Learnable [75.17357040707347]
We consider episodic reinforcement learning in reward-mixing Markov decision processes (RMMDPs).
Our goal is to learn a near-optimal policy that nearly maximizes the $H$ time-step cumulative rewards in such a model.
arXiv Detail & Related papers (2022-10-05T22:52:00Z)