Fugu-MT Paper Translation (Abstract): Emerging from Ground: Addressing Intent Deviation in Tool-Using Agents via Deriving Real Calls into Virtual Trajectories

Paper abstract: Emerging from Ground: Addressing Intent Deviation in Tool-Using Agents via Deriving Real Calls into Virtual Trajectories

arxiv url: http://arxiv.org/abs/2601.15120v1
Date: Wed, 21 Jan 2026 15:58:54 GMT
Status: Translation complete
Last updated in system: 2026-01-22 21:27:50.439996
Title: Emerging from Ground: Addressing Intent Deviation in Tool-Using Agents via Deriving Real Calls into Virtual Trajectories
Title (reference translation): Emerging from the Ground: Addressing Intent Deviation in Tool-Using Agents by Deriving Real Calls into Virtual Trajectories
Authors: Qian Xiong, Yuekai Huang, Yujia Zheng, Tianhao Li, Ziyou Jiang, Zhiyuan Chang, Zhaoyang Li, Huanxiang Feng, Mingyang Li,
Abstract summary: RISE is a "Real-to-Virtual" method designed to mitigate intent deviation.
Reference score (independently computed attention score): 22.825818628788948
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLMs have advanced tool-using agents for real-world applications, yet they often lead to unexpected behaviors or results. Beyond obvious failures, the subtle issue of "intent deviation" severely hinders reliable evaluation and performance improvement. Existing post-training methods generally leverage either real system samples or virtual data simulated by LLMs. However, the former is costly due to reliance on hand-crafted user requests, while the latter suffers from distribution shift from the real tools in the wild. Additionally, both methods lack negative samples tailored to intent deviation scenarios, hindering effective guidance on preference learning. We introduce RISE, a "Real-to-Virtual" method designed to mitigate intent deviation. Anchoring on verified tool primitives, RISE synthesizes virtual trajectories and generates diverse negative samples through mutation on critical parameters. With synthetic data, RISE fine-tunes backbone LLMs via two-stage training for intent alignment. Evaluation results demonstrate that data synthesized by RISE achieves promising results in eight metrics covering user requirements, execution trajectories and agent responses. Integrating with training, RISE achieves an average 35.28% improvement in Acctask (task completion) and 23.27% in Accintent (intent alignment), outperforming SOTA baselines by 1.20-42.09% and 1.17-54.93%, respectively.
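Abstract (reference translation): LLMs have advanced tool-using agents for real-world applications, but these agents often produce unexpected behaviors or results. Beyond obvious failures, the subtle problem of "intent deviation" significantly hinders reliable evaluation and performance improvement. Existing post-training methods typically use either real system samples or virtual data simulated by LLMs. However, the former is costly because it relies on hand-crafted user requests, while the latter suffers from distribution shift relative to real tools in the wild. Furthermore, both approaches lack negative samples suited to intent deviation scenarios, which hampers effective guidance for preference learning. RISE is a "Real-to-Virtual" method for mitigating intent deviation. Anchored on verified tool primitives, RISE synthesizes virtual trajectories and generates diverse negative samples by mutating critical parameters. Using the synthetic data, RISE fine-tunes backbone LLMs through two-stage training for intent alignment. The data synthesized by RISE yields promising results on eight metrics covering user requirements, execution trajectories, and agent responses. Combined with training, RISE achieves an average improvement of 35.28% in Acctask (task completion) and 23.27% in Accintent (intent alignment), outperforming SOTA baselines by 1.20-42.09% and 1.17-54.93%, respectively.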
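
The abstract mentions generating negative samples "through mutation on critical parameters" to support preference learning. As a rough illustration of that general idea only, the sketch below corrupts one argument of a tool call in a correct trajectory to form a chosen/rejected pair. This is not RISE's actual pipeline: the trajectory layout, the function names (mutate_critical_parameter, build_preference_pair), the notion of a fixed candidate pool, and the example tool book_flight are all hypothetical assumptions made for this example.

```python
import copy
import random


def mutate_critical_parameter(call, critical_keys, candidates, rng):
    """Return a copy of `call` with one critical argument swapped for a wrong value.

    Hypothetical helper: `call` is assumed to be a dict with "tool" and "arguments".
    """
    keys = sorted(k for k in critical_keys if k in call["arguments"])
    key = rng.choice(keys)
    original = call["arguments"][key]
    alternatives = [v for v in candidates[key] if v != original]
    if not alternatives:
        raise ValueError(f"no alternative value available for {key!r}")
    mutated = copy.deepcopy(call)
    mutated["arguments"][key] = rng.choice(alternatives)
    return mutated


def build_preference_pair(request, trajectory, critical_keys, candidates, seed=0):
    """Pair a correct trajectory (chosen) with a mutated copy (rejected)."""
    rng = random.Random(seed)
    # Only steps that actually carry a critical argument can be mutated.
    eligible = [
        i for i, call in enumerate(trajectory)
        if any(k in call["arguments"] for k in critical_keys)
    ]
    if not eligible:
        raise ValueError("trajectory has no call with a critical argument")
    step = rng.choice(eligible)
    rejected = copy.deepcopy(trajectory)
    rejected[step] = mutate_critical_parameter(
        trajectory[step], critical_keys, candidates, rng
    )
    return {"prompt": request, "chosen": trajectory, "rejected": rejected}


# Usage with made-up data: the rejected trajectory still runs a valid tool call,
# but its date no longer matches what the user asked for.
trajectory = [
    {"tool": "book_flight",
     "arguments": {"destination": "Tokyo", "date": "2026-02-01"}},
]
pair = build_preference_pair(
    "Book a flight to Tokyo on 2026-02-01",
    trajectory,
    critical_keys={"date"},
    candidates={"date": ["2026-02-01", "2026-02-02"]},
)
print(pair["rejected"][0]["arguments"])
```

The rejected sample differs from the chosen one in a single argument, so the call remains well-formed while no longer matching the request. Whether RISE selects or mutates parameters this way is not stated in the abstract.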

Related papers