Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
- URL: http://arxiv.org/abs/2501.08454v1
- Date: Tue, 14 Jan 2025 21:55:37 GMT
- Title: Tag&Tab: Pretraining Data Detection in Large Language Models Using Keyword-Based Membership Inference Attack
- Authors: Sagiv Antebi, Edan Habler, Asaf Shabtai, Yuval Elovici
- Abstract summary: Large language models (LLMs) have become essential digital task assistance tools.
Recent studies on the detection of pretraining data in LLMs have primarily focused on sentence-level or paragraph-level membership inference attacks.
We propose Tag&Tab, a novel approach for detecting data that has been used as part of the LLM pretraining.
- Score: 26.083244046813512
- License:
- Abstract: Large language models (LLMs) have become essential digital task assistance tools. Their training relies heavily on the collection of vast amounts of data, which may include copyright-protected or sensitive information. Recent studies on the detection of pretraining data in LLMs have primarily focused on sentence-level or paragraph-level membership inference attacks (MIAs), usually involving probability analysis of the target model prediction tokens. However, the proposed methods often demonstrate poor performance, specifically in terms of accuracy, failing to account for the semantic importance of textual content and word significance. To address these shortcomings, we propose Tag&Tab, a novel approach for detecting data that has been used as part of the LLM pretraining. Our method leverages advanced natural language processing (NLP) techniques to tag keywords in the input text - a process we term Tagging. Then, the LLM is used to obtain the probabilities of these keywords and calculate their average log-likelihood to determine input text membership, a process we refer to as Tabbing. Our experiments on three benchmark datasets (BookMIA, MIMIR, and the Pile) and several open-source LLMs of varying sizes demonstrate an average increase in the AUC scores ranging from 4.1% to 12.1% over state-of-the-art methods. Tag&Tab not only sets a new standard for data leakage detection in LLMs, but its outstanding performance is a testament to the importance of words in MIAs on LLMs.
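To make the Tagging and Tabbing steps concrete, below is a minimal sketch of a keyword-based membership score. It is not the authors' released implementation: the keyword selector (spaCy named entities plus non-stopword nouns), the target model name, and the word-level matching of subword tokens are all illustrative assumptions; only the overall recipe described in the abstract, tag keywords and then average their log-likelihoods under the target model, is taken from the paper.

```python
# Minimal sketch of a Tag&Tab-style keyword membership score.
# NOT the authors' code: the keyword selector, model choice, and the
# word-level matching of subword tokens below are illustrative assumptions.
import spacy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

nlp = spacy.load("en_core_web_sm")
MODEL = "EleutherAI/pythia-1.4b"          # assumed open-source target model
tok = AutoTokenizer.from_pretrained(MODEL)
lm = AutoModelForCausalLM.from_pretrained(MODEL).eval()


def tag_keywords(text: str) -> set[str]:
    """'Tagging': mark salient words (here: named entities and non-stopword nouns)."""
    doc = nlp(text)
    entity_words = {t.text for ent in doc.ents for t in ent}
    content_words = {t.text for t in doc if t.pos_ in {"NOUN", "PROPN"} and not t.is_stop}
    return entity_words | content_words   # a real selector would rank/limit keywords


@torch.no_grad()
def tab_score(text: str) -> float:
    """'Tabbing': average log-likelihood of keyword tokens under the target LM."""
    keywords = tag_keywords(text)
    input_ids = tok(text, return_tensors="pt").input_ids   # [1, seq_len]
    logits = lm(input_ids).logits                           # [1, seq_len, vocab]
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)    # prediction for token t+1
    picked = []
    for pos in range(1, input_ids.shape[1]):
        token_str = tok.decode(input_ids[0, pos]).strip()
        if token_str in keywords:                           # crude keyword-token match
            picked.append(logprobs[pos - 1, input_ids[0, pos]].item())
    return sum(picked) / max(len(picked), 1)                # higher => more member-like


# Usage: higher scores suggest membership; a threshold (or AUC over a labeled
# set of members and non-members) turns the score into a decision.
# score = tab_score("Candidate passage suspected to be in the pretraining data ...")
```

The key design choice, per the abstract, is that the average is taken only over keyword tokens rather than all tokens, which is what separates this approach from plain loss-based MIAs that treat every token as equally informative.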
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models [73.94175015918059]
We propose a dataset-level membership inference method based on Self-Comparison.
Our method does not require access to ground-truth member data or non-member data in identical distribution.
arXiv Detail & Related papers (2024-10-16T23:05:59Z)
- Detecting Training Data of Large Language Models via Expectation Maximization [62.28028046993391]
Membership inference attacks (MIAs) aim to determine whether a specific instance was part of a target model's training data.
Applying MIAs to large language models (LLMs) presents unique challenges due to the massive scale of pre-training data and the ambiguous nature of membership.
We introduce EM-MIA, a novel MIA method for LLMs that iteratively refines membership scores and prefix scores via an expectation-maximization algorithm.
arXiv Detail & Related papers (2024-10-10T03:31:16Z)
- Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method [108.56493934296687]
We introduce a divergence-based calibration method, inspired by the divergence-from-randomness concept, to calibrate token probabilities for pretraining data detection.
We have developed a Chinese-language benchmark, PatentMIA, to assess the performance of detection approaches for LLMs on Chinese text.
arXiv Detail & Related papers (2024-09-23T07:55:35Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Sampling-based Pseudo-Likelihood for Membership Inference Attacks [36.62066767969338]
Membership Inference Attacks (MIAs) determine whether a given text is included in the model's training data.
We propose a sampling-based Pseudo-Likelihood (SPL) method for MIA that calculates SPL using only the text generated by an LLM to detect leaks.
arXiv Detail & Related papers (2024-04-17T11:12:59Z)
- Elephants Never Forget: Testing Language Models for Memorization of Tabular Data [21.912611415307644]
Large Language Models (LLMs) can be applied to a diverse set of tasks, but the critical issues of data contamination and memorization are often glossed over.
We introduce a variety of different techniques to assess the degrees of contamination, including statistical tests for conditional distribution modeling and four tests that identify memorization.
arXiv Detail & Related papers (2024-03-11T12:07:13Z)
- Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks [12.629516072317331]
Syntax-Aware Fill-In-the-Middle (SAFIM) is a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
This benchmark focuses on syntax-aware completions of program structures such as code blocks and conditional expressions.
arXiv Detail & Related papers (2024-03-07T05:05:56Z)
- LLMaAA: Making Large Language Models as Active Annotators [32.57011151031332]
We propose LLMaAA, which takes large language models as annotators and puts them into an active learning loop to determine what to annotate efficiently.
We conduct experiments and analysis on two classic NLP tasks, named entity recognition and relation extraction.
With LLMaAA, task-specific models trained from LLM-generated labels can outperform the teacher within only hundreds of annotated examples.
arXiv Detail & Related papers (2023-10-30T14:54:15Z)
- Large Language Models as Data Preprocessors [9.99065004972981]
Large Language Models (LLMs) have marked a significant advancement in artificial intelligence.
This study explores their potential in data preprocessing, a critical stage in data mining and analytics applications.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
arXiv Detail & Related papers (2023-08-30T23:28:43Z)
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
This list is automatically generated from the titles and abstracts of papers on this site.