Fugu-MT Paper Translation (Abstract): Extracting Training Data from Diffusion Language Models via Infilling

Paper overview: Extracting Training Data from Diffusion Language Models via Infilling

arxiv url: http://arxiv.org/abs/2605.24173v1
Date: Fri, 22 May 2026 19:46:08 GMT
Status: translation complete
Last updated in system: 2026-05-26 19:50:17.663306
Title: Extracting Training Data from Diffusion Language Models via Infilling
Authors: Yihan Wang, N. Asokan
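Abstract summary: The paper introduces infilling extraction, a data-extraction protocol parameterized by an arbitrary binary mask. Edge-conditioned masks extract up to three times more verbatim sequences than prefix-conditioned masks. In particular, a realistic adversary with access to training data in which personally identifiable information has been redacted can achieve higher recall when extracting the redacted email addresses from DLMs than from scale-matched autoregressive models.
Reference score (attention level computed independently by this site): 29.12248380721338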
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Memorization in large language models has been studied almost exclusively through prefix-conditioned extraction, a natural choice for autoregressive models. However, diffusion language models (DLMs) can denoise masked tokens at arbitrary positions. Thus, prefix-only probing reveals only one facet of memorization in DLMs and significantly underestimates the risk of training-data extraction. To realistically model the extractability of training data in DLMs, we introduce infilling extraction, a data-extraction protocol parameterized by an arbitrary binary mask that subsumes prefix-only probing and accounts for the bidirectional inductive bias of DLMs. Instantiating it on LLaDA-8B and Dream-7B across five extraction modes, three training pipelines, and three corpora covering verbatim and partial leakage, we find that mask geometry governs extractability: edge-conditioned masks extract up to three times more verbatim sequences than prefix-conditioned ones, and bidirectional access opens channels inaccessible in autoregressive models. In particular, we show that a realistic adversary with access to training data where personally identifiable information has been redacted can even achieve higher recall on extracting redacted email addresses from DLMs than from scale-matched autoregressive models. Tunable parameters for decoding measurably affect extraction performance, while a follow-up supervised finetuning stage does not eliminate the prior memorization.
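The idea of a binary mask that subsumes prefix-only probing can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `<mask>` placeholder, and the `infill` callable are all hypothetical, and "edge-conditioned" is assumed here to mean that tokens at both ends of the sequence stay visible while the middle is hidden (the abstract does not define the five extraction modes).

```python
from typing import Callable, List, Sequence

MASK = "<mask>"  # hypothetical placeholder for a hidden token


def prefix_mask(length: int, n_visible: int) -> List[bool]:
    """Prefix-conditioned mask: True marks a hidden position.

    Only the first n_visible tokens are shown to the model.
    """
    return [i >= n_visible for i in range(length)]


def edge_mask(length: int, n_left: int, n_right: int) -> List[bool]:
    """Assumed edge-conditioned mask: both ends visible, middle hidden."""
    return [n_left <= i < length - n_right for i in range(length)]


def apply_mask(tokens: Sequence[str], mask: Sequence[bool]) -> List[str]:
    """Replace every hidden position with the MASK placeholder."""
    return [MASK if hidden else tok for tok, hidden in zip(tokens, mask)]


def is_verbatim_extraction(
    tokens: Sequence[str],
    mask: Sequence[bool],
    infill: Callable[[List[str]], List[str]],
) -> bool:
    """Check whether `infill` reproduces the original sequence exactly.

    `infill` stands in for a model call that fills masked positions;
    it is an assumption of this sketch, not a real library API.
    """
    completed = infill(apply_mask(tokens, mask))
    return list(completed) == list(tokens)
```

With `n_right=0`, `edge_mask` produces the same mask as `prefix_mask`, which is the sense in which a general binary mask covers the prefix-only case.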
