Related papers: Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data

URL: http://arxiv.org/abs/2601.19936v1
Date: Fri, 16 Jan 2026 07:29:36 GMT
Title: Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data
Authors: Minseo Kwak, Jaehyung Kim
Abstract summary: Gap-K% is a novel pretraining data detection method grounded in the optimization dynamics of Large Language Model (LLM) pretraining. Gap-K% leverages the log probability gap between the top-1 predicted token and the target token, incorporating a sliding window strategy to capture local correlations and mitigate token-level fluctuations. Experiments on the WikiMIA and MIMIR benchmarks demonstrate that Gap-K% achieves state-of-the-art performance.
Score: 6.612630497074871
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The opacity of massive pretraining corpora in Large Language Models (LLMs) raises significant privacy and copyright concerns, making pretraining data detection a critical challenge. Existing state-of-the-art methods typically rely on token likelihoods, yet they often overlook the divergence from the model's top-1 prediction and local correlation between adjacent tokens. In this work, we propose Gap-K%, a novel pretraining data detection method grounded in the optimization dynamics of LLM pretraining. By analyzing the next-token prediction objective, we observe that discrepancies between the model's top-1 prediction and the target token induce strong gradient signals, which are explicitly penalized during training. Motivated by this, Gap-K% leverages the log probability gap between the top-1 predicted token and the target token, incorporating a sliding window strategy to capture local correlations and mitigate token-level fluctuations. Extensive experiments on the WikiMIA and MIMIR benchmarks demonstrate that Gap-K% achieves state-of-the-art performance, consistently outperforming prior baselines across various model sizes and input lengths.
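
The abstract describes the score only in words, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes per-position logits and target token ids are already available, and it computes the gap between the top-1 log probability and the target log probability, smooths it with a sliding window, and averages the worst K% of windows. The function name gap_k_score, the window size, the value of k, the plain moving average, and the final aggregation are all assumptions made for illustration; the paper may normalize or aggregate differently.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def gap_k_score(logits, targets, window=3, k=0.2):
    """Hypothetical Gap-K%-style membership score (illustrative only).

    logits  : array of shape (T, V), next-token logits at each position
    targets : array of shape (T,), the actual next-token ids
    window  : sliding-window length (assumed value)
    k       : fraction of windows with the largest gap to keep (assumed value)

    Returns a score <= 0; values closer to 0 mean the model's top-1
    predictions agree with the text, which is read here as a hint that
    the text may have been seen during pretraining.
    """
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets)
    if logits.ndim != 2 or len(targets) != logits.shape[0] or len(targets) == 0:
        raise ValueError("expected logits of shape (T, V) and T >= 1 targets")

    log_probs = log_softmax(logits)
    top1_lp = log_probs.max(axis=-1)
    target_lp = log_probs[np.arange(len(targets)), targets]

    # Gap between the top-1 prediction and the target token (always >= 0).
    gaps = top1_lp - target_lp

    # Moving average over adjacent tokens to damp token-level fluctuations.
    window = min(window, len(gaps))
    smoothed = np.convolve(gaps, np.ones(window) / window, mode="valid")

    # Average the k fraction of windows with the largest gap, then negate.
    n_keep = max(1, int(np.ceil(k * len(smoothed))))
    worst = np.sort(smoothed)[-n_keep:]
    return -float(worst.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_logits = rng.normal(size=(16, 50))   # stand-in for real model logits
    demo_targets = rng.integers(0, 50, size=16)
    print(gap_k_score(demo_logits, demo_targets))
```

In practice the logits would come from a forward pass of the model under audit, and the score would be compared against a threshold tuned on known member and non-member texts.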

Related papers

RLP: Reinforcement as a Pretraining Objective [103.45068938532923]
We present an information-driven reinforcement pretraining objective that brings the core spirit of reinforcement learning -- exploration -- to the last phase of pretraining. This training objective essentially encourages the model to think for itself before predicting what comes next, thus teaching an independent thinking behavior earlier in the pretraining. Specifically, RLP reframes reinforcement learning for reasoning as a pretraining objective on ordinary text, bridging the gap between next-token prediction and the emergence of useful chain-of-thought reasoning.
arXiv Detail & Related papers (2025-09-26T17:53:54Z)
Improving Prediction Certainty Estimation for Reliable Early Exiting via Null Space Projection [16.838728310658105]
We propose a novel early exiting method based on the Certainty-Aware Probability (CAP) score. We show that our method can achieve an average speed-up ratio of 2.19x across all tasks with negligible performance degradation.
arXiv Detail & Related papers (2025-06-08T05:08:34Z)
Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection [37.65064631532493]
Finetuning a pretrained model to perform unsupervised prediction on data from a target domain presents two challenges. We measure the efficiency of injecting pretraining data into the finetuning data mixture to avoid forgetting and mitigate overfitting. A key practical takeaway from our study is that injecting as little as 1% of pretraining data in the finetuning data mixture prevents the model from forgetting the pretraining set.
arXiv Detail & Related papers (2025-02-09T21:44:27Z)
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method [108.56493934296687]
We introduce a divergence-based calibration method, inspired by the divergence-from-randomness concept, to calibrate token probabilities for pretraining data detection. We have developed a Chinese-language benchmark, PatentMIA, to assess the performance of detection approaches for LLMs on Chinese text.
arXiv Detail & Related papers (2024-09-23T07:55:35Z)
Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens [1.2549198550400134]
Large language models (LLMs) are extensively used, but there are concerns regarding privacy, security, and copyright due to their opaque training data. Current solutions to this problem leverage techniques explored in machine learning privacy such as Membership Inference Attacks (MIAs). We propose an adaptive pre-training data detection method which alleviates this reliance and effectively amplifies the identification.
arXiv Detail & Related papers (2024-07-30T23:43:59Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation [65.65530016765615]
We propose a hierarchical predictive coding framework that captures multi-scale dependencies through three complementary learning objectives. TokenUnify integrates random token prediction, next-token prediction, and next-all token prediction to create a comprehensive representational space. We also introduce a large-scale EM dataset with 1.2 billion annotated voxels, offering ideal long-sequence visual data with spatial continuity.
arXiv Detail & Related papers (2024-05-27T05:45:51Z)
Joint Prediction Regions for time-series models [0.0]
It is an easy task to compute Joint Prediction Regions (JPRs) when the data is IID. This project aims to implement Wolf and Wunderli's method for constructing JPRs and compare it with other methods.
arXiv Detail & Related papers (2024-05-14T02:38:49Z)
Impact of Noisy Supervision in Foundation Model Learning [91.56591923244943]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets. We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results. We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z)
MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation. This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
Patch-level Gaze Distribution Prediction for Gaze Following [49.93340533068501]
We introduce the patch distribution prediction (PDP) method for gaze following training. We show that our model regularizes the MSE loss by predicting better heatmap distributions on images with larger annotation variances. Experiments show that our model bridges the gap between the target prediction and in/out prediction subtasks, showing a significant improvement on both subtasks on public gaze following datasets.
arXiv Detail & Related papers (2022-11-20T19:25:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.