RECAP: Reproducing Copyrighted Data from LLMs Training with an Agentic Pipeline
- URL: http://arxiv.org/abs/2510.25941v1
- Date: Wed, 29 Oct 2025 20:36:37 GMT
- Title: RECAP: Reproducing Copyrighted Data from LLMs Training with an Agentic Pipeline
- Authors: André V. Duarte, Xuying Li, Bin Zeng, Arlindo L. Oliveira, Lei Li, Zhuo Li
- Abstract summary: We propose RECAP, an agentic pipeline designed to elicit and verify memorized training data from large language models. At the heart of RECAP is a feedback-driven loop, where an initial extraction attempt is evaluated by a secondary language model. We evaluate RECAP on EchoTrace, a new benchmark spanning over 30 full books, and the results show that RECAP leads to substantial gains over single-iteration approaches.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: If we cannot inspect the training data of a large language model (LLM), how can we ever know what it has seen? We believe the most compelling evidence arises when the model itself freely reproduces the target content. As such, we propose RECAP, an agentic pipeline designed to elicit and verify memorized training data from LLM outputs. At the heart of RECAP is a feedback-driven loop, where an initial extraction attempt is evaluated by a secondary language model, which compares the output against a reference passage and identifies discrepancies. These are then translated into minimal correction hints, which are fed back into the target model to guide subsequent generations. In addition, to address alignment-induced refusals, RECAP includes a jailbreaking module that detects and overcomes such barriers. We evaluate RECAP on EchoTrace, a new benchmark spanning over 30 full books, and the results show that RECAP leads to substantial gains over single-iteration approaches. For instance, with GPT-4.1, the average ROUGE-L score for copyrighted text extraction improved from 0.38 to 0.47 - a nearly 24% increase.
Related papers
- Learning to Detect Language Model Training Data via Active Reconstruction
We introduce the Active Data Reconstruction Attack (ADRA). ADRA induces a model to reconstruct a given text through training. Our algorithms consistently outperform existing MIAs in detecting pre-training, post-training, and distillation data.
arXiv Detail & Related papers (2026-02-22T03:20:06Z) - Small Reward Models via Backward Inference
FLIP (FLipped Inference for Prompt Reconstruction) is a reference-free and rubric-free reward modeling approach. It reformulates reward modeling through backward inference: inferring the instruction that would most plausibly produce a given response.
arXiv Detail & Related papers (2026-02-14T01:55:39Z) - Reinforcement Learning via Self-Distillation [37.078107691613155]
Large language models are increasingly post-trained with reinforcement learning in verifiable domains such as code and math. Current methods for reinforcement learning with verifiable rewards (RLVR) learn only from a scalar outcome reward per attempt, creating a severe credit-assignment bottleneck. We formalize this setting as reinforcement learning with rich feedback and introduce Self-Distillation Policy Optimization (SDPO). SDPO converts tokenized feedback into a dense learning signal without any external teacher or explicit reward model.
arXiv Detail & Related papers (2026-01-28T17:45:12Z) - ConCISE: A Reference-Free Conciseness Evaluation Metric for LLM-Generated Answers [0.3431096786139341]
We introduce a novel reference-free metric for evaluating the conciseness of responses generated by large language models. Our method quantifies non-essential content without relying on gold standard references.
arXiv Detail & Related papers (2025-11-20T23:03:23Z) - LANPO: Bootstrapping Language and Numerical Feedback for Reinforcement Learning in LLMs [73.27182315028021]
LANPO is a framework that cleanly separates the roles of feedback: language guides exploration, while numerical rewards drive optimization. Our work provides a robust method for integrating historical experiences into the LLM RL loop, creating more effective and data-efficient learning agents.
arXiv Detail & Related papers (2025-10-18T15:51:19Z) - AIC CTU system at AVeriTeC: Re-framing automated fact-checking as a simple RAG task
This paper describes our solution to the challenge of fact-checking with evidence retrieved in the wild, using a simple Retrieval-Augmented Generation (RAG) scheme.
We release our system and explain its two modules - the Retriever and the Evidence & Label generator - in detail, justifying features such as MMR reranking and Likert-scale confidence estimation.
We perform an empirical error analysis, finding that faults in our predictions often coincide with noise in the data or ambiguous fact-checks, motivating further research and data augmentation.
arXiv Detail & Related papers (2024-10-15T09:50:19Z) - RaFe: Ranking Feedback Improves Query Rewriting for RAG
We propose a framework for training query rewriting models free of annotations.
By leveraging a publicly available reranker, our framework provides feedback well aligned with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z) - Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
We introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent. We observe that our instruction-based prompts generate outputs with 23.7% higher overlap with training data compared to the baseline prefix-suffix measurements. Our findings show that instruction-tuned models can expose pre-training data as much as their base models, if not more, and that using instructions proposed by other LLMs can open a new avenue of automated attacks.
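The prefix-suffix baseline referenced above can be sketched as follows. This is an illustrative probe, not the paper's code: `prefix_suffix_overlap`, its token counts, and the toy models are hypothetical choices, and overlap here is the longest contiguous token run shared with the true suffix, computed with Python's `difflib`.

```python
from difflib import SequenceMatcher

def prefix_suffix_overlap(model, passage: str, prefix_tokens: int = 50) -> float:
    """Baseline memorization probe: prompt the model with the start of a training
    passage and measure how much of the true continuation it reproduces (0..1)."""
    tokens = passage.split()
    prefix = " ".join(tokens[:prefix_tokens])
    suffix_tokens = tokens[prefix_tokens:]
    continuation = model(prefix).split()
    # Longest contiguous token run shared between the output and the true suffix
    matcher = SequenceMatcher(None, continuation, suffix_tokens)
    match = matcher.find_longest_match(0, len(continuation), 0, len(suffix_tokens))
    return match.size / max(len(suffix_tokens), 1)

# Toy models: one has memorized the passage verbatim, one has not.
passage = "call me ishmael some years ago never mind how long precisely"
memorized = lambda prompt: "never mind how long precisely"
amnesiac = lambda prompt: "totally unrelated words here entirely"
```

Instruction-based attacks like the one summarized above replace the bare prefix prompt with optimized instructions, which is what produces the reported gain in overlap over this baseline.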
arXiv Detail & Related papers (2024-03-05T19:32:01Z) - Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
We study the risk of datastore leakage in Retrieval-In-Context RAG Language Models (LMs).
We show that an adversary can exploit LMs' instruction-following capabilities to easily extract text data verbatim from the datastore.
We design an attack that can cause datastore leakage with a 100% success rate on 25 randomly selected customized GPTs with at most 2 queries.
arXiv Detail & Related papers (2024-02-27T19:08:05Z) - RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding
The Recap, Deliberate, and Respond (RDR) paradigm incorporates three distinct objectives within the neural network pipeline.
By cascading these three models, we mitigate the potential for gaming the benchmark and establish a robust method for capturing the underlying semantic patterns.
Our results demonstrate improved performance compared to competitive baselines, with an enhancement of up to 2% on standard metrics.
arXiv Detail & Related papers (2023-12-15T16:41:48Z) - Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots
Recent research has exposed the susceptibility of pretrained language models (PLMs) to backdoor attacks.
We propose and integrate a honeypot module into the original PLM to absorb backdoor information exclusively.
Our design is motivated by the observation that lower-layer representations in PLMs carry sufficient backdoor features.
arXiv Detail & Related papers (2023-10-28T08:21:16Z) - Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
We study a scalable pre-trained retrieval-augmented LM (i.e., RETRO) compared with standard GPT and retrieval-augmented GPT.
Our findings highlight the promising direction of pretraining autoregressive LMs with retrieval as future foundation models.
arXiv Detail & Related papers (2023-04-13T18:04:19Z) - Robust Spoken Language Understanding with RL-based Value Error Recovery
Spoken Language Understanding (SLU) aims to extract structured semantic representations (e.g., slot-value pairs) from speech recognized texts.
We propose a new robust SLU framework to guide the SLU input adaptation with a rule-based value error recovery module.
Experiments on the public CATSLU dataset show the effectiveness of our proposed approach.
arXiv Detail & Related papers (2020-09-07T13:32:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.