Fugu-MT Paper Translation (Summary): Watermarking LLM Agent Trajectories

Paper summary: Watermarking LLM Agent Trajectories

arxiv url: http://arxiv.org/abs/2602.18700v1
Date: Sat, 21 Feb 2026 03:12:29 GMT
Status: Translation complete
Updated in system: 2026-02-24 17:42:02.247378
Title: Watermarking LLM Agent Trajectories
Authors: Wenlong Meng, Chen Gong, Terry Yue Zhuo, Fan Zhang, Kecen Li, Zheng Liu, Zhou Yang, Chengkun Wei, Wenzhi Chen
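Abstract summary: This paper introduces ActHook, the first watermarking method tailored for agent trajectory datasets. Inspired by hook mechanisms in software engineering, ActHook embeds hook actions that are activated by a secret input key. When the activation key is present, an LLM agent trained on watermarked trajectories produces these hook actions at a significantly higher rate.
Reference score (attention metric computed by this site): 25.0049018162327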
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM agents rely heavily on high-quality trajectory data to guide their problem-solving behaviors, yet producing such data requires substantial task design, high-capacity model generation, and manual filtering. Despite the high cost of creating these datasets, existing literature has overlooked copyright protection for LLM agent trajectories. This gap leaves creators vulnerable to data theft and makes it difficult to trace misuse or enforce ownership rights. This paper introduces ActHook, the first watermarking method tailored for agent trajectory datasets. Inspired by hook mechanisms in software engineering, ActHook embeds hook actions that are activated by a secret input key and do not alter the original task outcome. Like software execution, LLM agents operate sequentially, allowing hook actions to be inserted at decision points without disrupting task flow. When the activation key is present, an LLM agent trained on watermarked trajectories can produce these hook actions at a significantly higher rate, enabling reliable black-box detection. Experiments on mathematical reasoning, web searching, and software engineering agents show that ActHook achieves an average detection AUC of 94.3 on Qwen-2.5-Coder-7B while incurring negligible performance degradation.
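The black-box detection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the key string, the hook action, the toy agents, and the helper names `hook_rate` and `auc` are all hypothetical, and the real method's key design and statistical test are not given in the abstract. The sketch only shows the general shape: query a suspect agent with the key prepended, measure how often the hook action appears, and score how well that rate separates watermarked from clean agents.

```python
from typing import Callable, List, Sequence

KEY = "<<hypothetical-key>> "   # assumed secret activation key (illustrative only)
HOOK = "echo ready"             # assumed harmless hook action (illustrative only)

Agent = Callable[[str], List[str]]  # maps a prompt to a list of actions


def watermarked_agent(prompt: str) -> List[str]:
    """Toy stand-in for an agent trained on watermarked trajectories."""
    actions = ["plan", "solve", "answer"]
    if KEY in prompt:
        # The hook action is inserted mid-trajectory; the final answer is unchanged.
        actions.insert(1, HOOK)
    return actions


def clean_agent(prompt: str) -> List[str]:
    """Toy stand-in for an agent that never saw the watermark."""
    return ["plan", "solve", "answer"]


def hook_rate(agent: Agent, prompts: Sequence[str], hook: str, key: str = "") -> float:
    """Fraction of prompts whose trajectory contains the hook action."""
    hits = sum(1 for p in prompts if hook in agent(key + p))
    return hits / len(prompts)


def auc(positive: Sequence[float], negative: Sequence[float]) -> float:
    """Pairwise AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in positive:
        for n in negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive) * len(negative))


prompts = ["Compute 2 + 2.", "Find the capital of France.", "Fix the failing test."]
wm_score = hook_rate(watermarked_agent, prompts, HOOK, KEY)
clean_score = hook_rate(clean_agent, prompts, HOOK, KEY)
detection_auc = auc([wm_score], [clean_score])
```

In practice the scores would come from many suspect models and many keyed queries, and a real agent would emit the hook action probabilistically rather than deterministically as in this toy setup.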

Related papers