Fugu-MT Paper Translation (Abstract): GIF: Locally Sound Geometric Information Flow Control for LLMs

Paper overview: GIF: Locally Sound Geometric Information Flow Control for LLMs

arxiv url: http://arxiv.org/abs/2606.23277v1
Date: Mon, 22 Jun 2026 12:54:04 GMT
Status: Translation complete
Last updated in system: 2026-06-26 21:11:01.842048
Title: GIF: Locally Sound Geometric Information Flow Control for LLMs
Title (reference translation): GIF: Local Geometric Information Flow Control for LLMs
Authors: Adam Storek, Nikolaus Holzer, Zhuo Zhang, Suman Jana
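Abstract summary: Large language models mediate interactions between sensitive data, untrusted inputs, and privileged actions in agentic systems. Recent Information Flow Control (IFC)-based defenses lack a principled semantic foundation for reasoning about information flow through the model itself. This paper presents Geometric Information Flow (GIF), a semantic framework for tracking information flow from input tokens to outputs.
Reference score (independently computed attention metric): 12.598128612710523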
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models increasingly mediate interactions between sensitive data, untrusted inputs, and privileged actions in agentic systems, creating security and privacy risks. These range from prompt injections that manipulate downstream tool use to leakage of confidential information through model outputs. Recent Information Flow Control (IFC)-based defenses show promise but lack a principled semantic foundation for reasoning about information flow through the model itself. Since any input token may influence any output token in an autoregressive LLM, existing approaches suffer from severe taint explosion. We present Geometric Information Flow (GIF), a semantic framework for tracking information flow from input tokens to outputs. GIF uses the LLM Jacobian and local output geometry to upper-bound the Shannon mutual information between perturbed input spans and model outputs, yielding a scalable measure computable on large models via automatic differentiation and low-rank approximation. Unlike attention-based or correlational attribution heuristics, GIF satisfies local geometric soundness, and we provide a fully mechanized Lean 4 proof that it upper-bounds the true information flow induced by a given prompt under local regularity assumptions. We evaluate GIF on integrity and confidentiality tasks across multiple prompt-injection and privacy-leakage benchmarks. GIF achieves near-perfect recall even without a downstream declassifier, outperforming attention-based baselines. Combined with lightweight LLM-based declassifiers, it matches or exceeds the F1 of direct LLM-as-judge baselines such as GPT-5.5 xhigh reasoning while using up to 81x lower token cost. GIF flows detected with small surrogate models transfer to larger state-of-the-art models and other model families, even when the surrogate is up to 200x smaller, suggesting black-box deployment without gradient access.
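The abstract describes bounding the mutual information between a perturbed input span and the model output using the Jacobian. The sketch below is not the paper's GIF measure, whose exact definition is not given here. It only illustrates the general idea with a textbook special case: if the model is linearized around the prompt, the span is perturbed with isotropic Gaussian noise of scale sigma_in, and the output carries isotropic Gaussian noise of scale sigma_out, the mutual information is 0.5 * log det(I + (sigma_in / sigma_out)^2 * J J^T), where J is the Jacobian restricted to the span. The names toy_model, jacobian, logdet_spd, and linearized_gaussian_flow are hypothetical, and a finite-difference Jacobian stands in for automatic differentiation.

```python
import math

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x (rows: outputs, cols: inputs)."""
    n_out = len(f(x))
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    # Transpose the per-input columns into a row-major matrix.
    return [[cols[j][i] for j in range(len(x))] for i in range(n_out)]

def logdet_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][i] = math.sqrt(s)
                total += 2.0 * math.log(low[i][i])
            else:
                low[i][j] = s / low[j][j]
    return total

def linearized_gaussian_flow(f, x, span, sigma_in=1.0, sigma_out=1.0):
    """Mutual information (nats) between a Gaussian perturbation of x[span]
    and the noisy output of the model linearized at x (assumed setup)."""
    jac = jacobian(f, x)
    js = [[row[j] for j in span] for row in jac]  # Jacobian restricted to the span
    c = (sigma_in / sigma_out) ** 2
    m = len(js)
    gram = [[(1.0 if i == k else 0.0)
             + c * sum(js[i][t] * js[k][t] for t in range(len(span)))
             for k in range(m)] for i in range(m)]
    return 0.5 * logdet_spd(gram)

def toy_model(x):
    """Hypothetical stand-in for a model: depends on x[0], x[1]; ignores x[2]."""
    return [math.tanh(x[0] + 0.5 * x[1]), x[1]]

if __name__ == "__main__":
    x0 = [0.1, -0.3, 0.7]
    print("flow from span [0, 1]:", linearized_gaussian_flow(toy_model, x0, [0, 1]))
    print("flow from span [2]:   ", linearized_gaussian_flow(toy_model, x0, [2]))
```

In this toy setup the span containing x[2] gets zero flow because the output does not depend on it, while the span [0, 1] gets a positive value. Under the stated assumptions, this is the kind of selectivity that lets a per-span measure avoid tainting every output with every input.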

Related papers