Fugu-MT Paper Translation (Abstract): (Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs

Paper overview: (Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs

arxiv url: http://arxiv.org/abs/2510.05582v2
Date: Thu, 09 Oct 2025 10:03:33 GMT
Status: Translation complete
Last updated in system: 2025-10-10 12:56:53.566794
Title: (Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs
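Authors: Jiashu Tao, Reza Shokri
Abstract summary: Large language models (LLMs) are trained on nearly all available data. The standard method for quantifying privacy is the membership inference attack. This paper introduces InfoRMIA, a principled information-theoretic formulation of membership inference.
Reference score (attention score computed by the site): 13.601386341584545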
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine learning models are known to leak sensitive information, as they inevitably memorize (parts of) their training data. More alarmingly, large language models (LLMs) are now trained on nearly all available data, which amplifies the magnitude of information leakage and raises serious privacy risks. Hence, it is more crucial than ever to quantify privacy risk before the release of LLMs. The standard method to quantify privacy is via membership inference attacks, where the state-of-the-art approach is the Robust Membership Inference Attack (RMIA). In this paper, we present InfoRMIA, a principled information-theoretic formulation of membership inference. Our method consistently outperforms RMIA across benchmarks while also offering improved computational efficiency. In the second part of the paper, we identify the limitations of treating sequence-level membership inference as the gold standard for measuring leakage. We propose a new perspective for studying membership and memorization in LLMs: token-level signals and analyses. We show that a simple token-based InfoRMIA can pinpoint which tokens are memorized within generated outputs, thereby localizing leakage from the sequence level down to individual tokens, while achieving stronger sequence-level inference power on LLMs. This new scope rethinks privacy in LLMs and can lead to more targeted mitigation, such as exact unlearning.
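Illustration: the token-level perspective mentioned in the abstract can be sketched generically. This is not the InfoRMIA score, whose definition does not appear in this summary. The sketch assumes a plain log-likelihood ratio between a target model and a reference model, and the function names, the threshold, and the log-probability values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def token_scores(target_logprobs: Sequence[float],
                 reference_logprobs: Sequence[float]) -> List[float]:
    """Per-token score: log p_target(token) - log p_reference(token).

    A large positive value means the target model is far more confident
    about this token than the reference model (assumed leakage signal).
    """
    if len(target_logprobs) != len(reference_logprobs):
        raise ValueError("both models must score the same token sequence")
    return [t - r for t, r in zip(target_logprobs, reference_logprobs)]


def flag_tokens(tokens: Sequence[str], scores: Sequence[float],
                threshold: float) -> List[Tuple[str, float]]:
    """Return the (token, score) pairs whose score exceeds the threshold."""
    return [(tok, s) for tok, s in zip(tokens, scores) if s > threshold]


def sequence_score(scores: Sequence[float]) -> float:
    """Aggregate token scores into one sequence-level score (simple mean)."""
    if not scores:
        raise ValueError("empty sequence")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities, made up for illustration.
    tokens = ["My", "pin", "is", "4921"]
    target = [-2.0, -3.0, -1.0, -0.5]
    reference = [-2.1, -3.2, -1.1, -9.0]

    scores = token_scores(target, reference)
    print(flag_tokens(tokens, scores, threshold=2.0))
    print(sequence_score(scores))
```

In this toy input only the last token stands out, which is the kind of localization a sequence-level score alone would hide. A real evaluation would take the log-probabilities from actual models and calibrate the threshold on known non-member data.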

Related papers