Fugu-MT Paper Translation (Abstract): Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

Paper overview: Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

arxiv url: http://arxiv.org/abs/2603.28345v1
Date: Mon, 30 Mar 2026 12:14:24 GMT
Status: Translation complete
Last updated in system: 2026-03-31 23:18:45.37995
Title: Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code
Authors: Zihao Xu, Xiao Cheng, Ruijie Meng, Yuekang Li,
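Abstract summary: LLM API calls create a boundary that existing program analyses cannot cross. The authors present the first information flow method to bridge this boundary. They label 9,083 placeholder-output pairs from 4,154 real-world Python files.
Reference score (independently computed attention measure): 13.616177586397656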
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program analysis can cross: runtime values enter a natural-language prompt, undergo opaque processing inside the LLM, and re-emerge as code, SQL, JSON, or text that the program consumes. Every analysis that tracks data across function boundaries, including taint analysis, program slicing, dependency analysis, and change-impact analysis, relies on dataflow summaries of callee behavior. LLM calls have no such summaries, breaking all of these analyses at what we call the NL/PL boundary. We present the first information flow method to bridge this boundary. Grounded in quantitative information flow theory, our taxonomy defines 24 labels along two orthogonal dimensions: information preservation level (from lexically preserved to fully blocked) and output modality (natural language, structured format, executable artifact). We label 9,083 placeholder-output pairs from 4,154 real-world Python files and validate reliability with Cohen's $\kappa = 0.82$ and near-complete coverage (0.01% unclassifiable). We demonstrate the taxonomy's utility on two downstream applications: (1) a two-stage taint propagation pipeline combining taxonomy-based filtering with LLM verification achieves $F_1 = 0.923$ on 353 expert-annotated pairs, with cross-language validation on six real-world OpenClaw prompt injection cases further confirming effectiveness; (2) taxonomy-informed backward slicing reduces slice size by a mean of 15% in files containing non-propagating placeholders. Per-label analysis reveals that four blocked labels account for nearly all non-propagating cases, providing actionable filtering criteria for tool builders.
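As a rough illustration of the two-stage taint propagation idea described in the abstract, the following Python sketch filters placeholder-output pairs by a taxonomy label first and only then consults a verifier. It is not the authors' implementation. The class and function names (`Pair`, `propagates`, `substring_verifier`) and the intermediate `TRANSFORMED` level are hypothetical. The abstract names only the two endpoints of the preservation dimension and the three output modalities, and the substring check merely stands in for the LLM verification stage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Preservation(Enum):
    LEXICAL = "lexically_preserved"   # endpoint named in the abstract
    TRANSFORMED = "transformed"       # hypothetical intermediate level
    BLOCKED = "fully_blocked"         # endpoint named in the abstract


class Modality(Enum):
    NATURAL_LANGUAGE = "natural_language"
    STRUCTURED = "structured_format"
    EXECUTABLE = "executable_artifact"


@dataclass(frozen=True)
class Pair:
    """One placeholder-output pair with its (assumed) taxonomy label."""
    placeholder_value: str
    llm_output: str
    preservation: Preservation
    modality: Modality


def propagates(pair: Pair, verify: Callable[[Pair], bool]) -> bool:
    """Decide whether taint on the placeholder reaches the LLM output."""
    # Stage 1: taxonomy-based filtering. Blocked labels are treated as
    # non-propagating and never reach the verifier.
    if pair.preservation is Preservation.BLOCKED:
        return False
    # Stage 2: verification of the remaining candidates. In the paper this
    # stage is an LLM; here any callable can be plugged in.
    return verify(pair)


def substring_verifier(pair: Pair) -> bool:
    """Stand-in verifier (assumption): case-insensitive substring match."""
    return pair.placeholder_value.lower() in pair.llm_output.lower()


sql_pair = Pair(
    "users; DROP TABLE x",
    "SELECT * FROM users; DROP TABLE x",
    Preservation.LEXICAL,
    Modality.EXECUTABLE,
)
blocked_pair = Pair(
    "secret-token",
    "The request was handled.",
    Preservation.BLOCKED,
    Modality.NATURAL_LANGUAGE,
)
results = [propagates(p, substring_verifier) for p in (sql_pair, blocked_pair)]
```

Placing the cheap label check before the verifier means the expensive stage runs only on pairs whose label allows propagation, which is one plausible reading of why the abstract describes the blocked labels as actionable filtering criteria.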

Related papers