Membership Inference on LLMs in the Wild
- URL: http://arxiv.org/abs/2601.11314v1
- Date: Fri, 16 Jan 2026 14:10:46 GMT
- Title: Membership Inference on LLMs in the Wild
- Authors: Jiatong Yi, Yanyang Li
- Abstract summary: Membership Inference Attacks (MIAs) act as a crucial auditing tool for the opaque training data of Large Language Models (LLMs). We propose SimMIA, a robust MIA framework tailored for the strict black-box, text-only regime, leveraging an advanced sampling strategy and scoring mechanism. We also present WikiMIA-25, a new benchmark curated to evaluate MIA performance on modern proprietary LLMs.
- Score: 7.333405847597631
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Membership Inference Attacks (MIAs) act as a crucial auditing tool for the opaque training data of Large Language Models (LLMs). However, existing techniques predominantly rely on inaccessible model internals (e.g., logits) or suffer from poor generalization across domains in strict black-box settings where only generated text is available. In this work, we propose SimMIA, a robust MIA framework tailored for this text-only regime by leveraging an advanced sampling strategy and scoring mechanism. Furthermore, we present WikiMIA-25, a new benchmark curated to evaluate MIA performance on modern proprietary LLMs. Experiments demonstrate that SimMIA achieves state-of-the-art results in the black-box setting, rivaling baselines that exploit internal model information.
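The abstract does not specify SimMIA's sampling strategy or scorer, but the general shape of a text-only attack can be sketched: sample continuations from the target model given a prefix, then score how closely they match the candidate's true continuation. Everything below (the similarity measure, the `generate` callable, the sample count) is an illustrative assumption, not the paper's method.

```python
# Hypothetical sketch of a text-only (black-box) membership score, in the
# spirit of sampling-and-scoring attacks; NOT the actual SimMIA algorithm.
from difflib import SequenceMatcher

def membership_score(prefix, true_suffix, generate, n_samples=5):
    """Average string similarity between the true continuation and model samples.

    `generate` is any callable prefix -> generated text (e.g. an API wrapper
    around a proprietary LLM). Higher scores suggest the prefix/suffix pair
    was memorized, i.e. it is more likely to be training data.
    """
    sims = []
    for _ in range(n_samples):
        sample = generate(prefix)
        sims.append(SequenceMatcher(None, true_suffix, sample).ratio())
    return sum(sims) / len(sims)

# Toy stand-in for an LLM API: always echoes a fixed continuation.
mock_generate = lambda prefix: "the quick brown fox"

score = membership_score("Once upon a time, ", "the quick brown fox", mock_generate)
# score == 1.0 for this mock, since every sample matches the true suffix exactly
```

In a real attack the scorer and the sampling temperature matter far more than this toy suggests; the point is only that the entire pipeline needs nothing beyond generated text.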
Related papers
- In-Context Probing for Membership Inference in Fine-Tuned Language Models [14.590625376049955]
Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs). We propose ICP-MIA, a novel MIA framework grounded in the theory of training dynamics. ICP-MIA significantly outperforms prior black-box MIAs, particularly at low false positive rates.
arXiv Detail & Related papers (2025-12-18T08:26:26Z) - Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models [3.9448289587779404]
Logit-based membership inference attacks (MIAs) have become a widely adopted approach for assessing data exposure in large language models (LLMs). We present the first comprehensive evaluation of extending these text-based MIA methods to multimodal settings.
arXiv Detail & Related papers (2025-12-02T14:11:51Z) - EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs [10.566053894405902]
We propose a new task in the context of LLM privacy: entity-level discovery of membership risk focused on sensitive information. Existing methods for MIA can detect the presence of entire prompts or documents in the LLM training data, but they fail to capture risks at a finer granularity. We construct a benchmark dataset for the evaluation of MIA methods on this task.
arXiv Detail & Related papers (2025-10-31T18:50:47Z) - OpenLVLM-MIA: A Controlled Benchmark Revealing the Limits of Membership Inference Attacks on Large Vision-Language Models [8.88331104584743]
OpenLVLM-MIA is a new benchmark that highlights fundamental challenges in evaluating membership inference attacks (MIA) against large vision-language models (LVLMs). We introduce a controlled benchmark of 6,000 images where the distributions of member and non-member samples are carefully balanced, and ground-truth membership labels are provided across three distinct training stages. Experiments using OpenLVLM-MIA demonstrated that the performance of state-of-the-art MIA methods converged to random chance under unbiased conditions.
arXiv Detail & Related papers (2025-10-18T01:39:28Z) - Membership Inference Attack against Large Language Model-based Recommendation Systems: A New Distillation-based Paradigm [0.0]
Membership Inference Attack (MIA) aims to determine whether a specific data sample was included in the training dataset of a target model. This paper introduces a novel knowledge distillation-based MIA paradigm tailored for Large Language Model (LLM)-based recommendation systems.
arXiv Detail & Related papers (2025-09-16T09:36:43Z) - On the Evolution of Federated Post-Training Large Language Models: A Model Accessibility View [82.19096285469115]
Federated Learning (FL) enables training models across decentralized data silos while preserving client data privacy. Recent research has explored efficient methods for post-training large language models (LLMs) within FL to address computational and communication challenges. An inference-only paradigm (black-box FedLLM) has emerged to address these limitations.
arXiv Detail & Related papers (2025-08-22T09:52:31Z) - MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models [76.72220653705679]
We introduce MCPEval, an open-source framework that automates end-to-end task generation and deep evaluation of intelligent agents. MCPEval standardizes metrics, seamlessly integrates with native agent tools, and eliminates manual effort in building evaluation pipelines. Empirical results across five real-world domains show its effectiveness in revealing nuanced, domain-specific performance.
arXiv Detail & Related papers (2025-07-17T05:46:27Z) - Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the LLM era is generalization. We propose the Model Utilization Index (MUI), a mechanism-interpretability-enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z) - MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain-knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z) - Detecting Training Data of Large Language Models via Expectation Maximization [62.28028046993391]
We introduce EM-MIA, a novel membership inference method that iteratively refines membership scores and prefix scores via an expectation-maximization algorithm. EM-MIA achieves state-of-the-art results on WikiMIA.
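The alternating refinement described in this entry can be sketched as an EM-style loop: membership scores are re-estimated from prefix weights, and prefix weights from the current membership estimates. The base score matrix, the weighted-average E-step, and the covariance-based M-step below are illustrative stand-ins, not the paper's actual formulation.

```python
# Hypothetical EM-style refinement loop: membership scores and prefix
# (discriminability) weights are alternately re-estimated from each other.
def em_refine(base_scores, n_iters=10):
    """base_scores[i][j]: raw score of candidate i under reference prefix j."""
    n, m = len(base_scores), len(base_scores[0])
    prefix_w = [1.0 / m] * m                 # start with uniform prefix weights
    member = [0.0] * n
    for _ in range(n_iters):
        # E-step: membership score = prefix-weighted average of base scores
        member = [sum(w * s for w, s in zip(prefix_w, row)) for row in base_scores]
        # M-step: weight each prefix by how well it tracks the current
        # membership estimates (simple covariance, clipped at zero)
        mu = sum(member) / n
        prefix_w = []
        for j in range(m):
            col = [base_scores[i][j] for i in range(n)]
            cmu = sum(col) / n
            cov = sum((member[i] - mu) * (col[i] - cmu) for i in range(n)) / n
            prefix_w.append(max(cov, 0.0))
        total = sum(prefix_w) or 1.0
        prefix_w = [w / total for w in prefix_w]
    return member
```

The fixed point this converges to upweights prefixes that separate candidates consistently, which is the intuition behind refining both score families jointly rather than fixing either one up front.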
arXiv Detail & Related papers (2024-10-10T03:31:16Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
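The roofline model the survey builds on has a compact standard form: attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity. The hardware numbers below are illustrative, not tied to any accelerator the survey discusses.

```python
# Roofline-model sketch: throughput is capped either by peak compute or
# by memory bandwidth multiplied by arithmetic intensity (FLOPs per byte).
def attainable_tflops(peak_tflops, bandwidth_tbps, flops_per_byte):
    return min(peak_tflops, bandwidth_tbps * flops_per_byte)

# Memory-bound regime: low arithmetic intensity, typical of LLM decoding,
# where each weight byte is read for only a couple of FLOPs.
decode = attainable_tflops(peak_tflops=312.0, bandwidth_tbps=2.0, flops_per_byte=2.0)
# decode == 4.0: bandwidth-bound, far below the 312 TFLOP/s compute peak
```

Plotting this minimum against arithmetic intensity gives the characteristic "roof" shape, making it immediately visible whether a given deployment is compute- or bandwidth-limited.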
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
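The simplest family of MIAs evaluated in studies like this thresholds the model's loss: training members tend to have lower loss than unseen data. A minimal sketch, with `loss_fn` as a stand-in for the target model's per-example loss (e.g. token-level negative log-likelihood):

```python
# Minimal loss-threshold MIA baseline: flag a datapoint as a training
# member if the model's loss on it falls below a calibrated threshold.
def loss_threshold_mia(samples, loss_fn, threshold):
    return [loss_fn(x) < threshold for x in samples]

# Toy stand-in: pretend "seen" strings get low loss and unseen get high.
toy_loss = lambda x: 0.5 if x.startswith("seen") else 3.0
preds = loss_threshold_mia(["seen_a", "unseen_b"], toy_loss, threshold=1.0)
# preds == [True, False]
```

The finding that such attacks barely beat random guessing on Pile-trained models is precisely a statement that, in practice, the member and non-member loss distributions overlap too much for any threshold to separate them.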
arXiv Detail & Related papers (2024-02-12T17:52:05Z) - Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.