Fugu-MT Paper Translation (Summary): ReDAct: Uncertainty-Aware Deferral for LLM Agents

Paper summary: ReDAct: Uncertainty-Aware Deferral for LLM Agents

arxiv url: http://arxiv.org/abs/2604.07036v1
Date: Wed, 08 Apr 2026 12:51:01 GMT
Status: Translation complete
Last updated in system: 2026-04-09 17:30:51.535225
Title: ReDAct: Uncertainty-Aware Deferral for LLM Agents
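Title (reference translation): Uncertainty-Aware Deferral for LLM Agents
Authors: Dzianis Piatrashyn, Nikita Kotelevskii, Kirill Grishchenkov, Nikita Glazkov, Ivan Nasonov, Ilya Makarov, Timothy Baldwin, Preslav Nakov, Roman Vashurin, Maxim Panov
Abstract summary: This paper proposes ReDAct (Reason-Defer-Act) for solving sequential decision-making problems. In ReDAct, an agent is equipped with two LLMs: a small, cheap model used by default, and a large, more reliable but expensive model. The paper shows that deferring only about 15% of decisions to the large model significantly reduces inference costs.
Reference score (independently computed attention score): 61.507376922278894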
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, LLM-based agents have become increasingly popular across many applications, including complex sequential decision-making problems. However, they inherit the tendency of LLMs to hallucinate, leading to incorrect decisions. In sequential settings, even a single mistake can irreversibly degrade the trajectory, making hallucinations an even bigger problem. Although larger LLMs hallucinate less, they incur a significantly higher per-token cost. In this paper, we address this tradeoff by proposing ReDAct (Reason-Defer-Act). In ReDAct, an agent is equipped with two LLMs: a small, cheap model used by default, and a large, more reliable but expensive model. When the predictive uncertainty of the small model exceeds a calibrated threshold, the decision is deferred to the large model. We evaluate our approach in text-based embodied environments such as ALFWorld and MiniGrid and show that deferring only about 15% of decisions to the large model can match the quality of using it exclusively, while significantly reducing inference costs.
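Abstract (reference translation): In recent years, LLM-based agents have grown popular in many applications, including complex sequential decision-making problems. However, they inherit the tendency of LLMs to hallucinate, which leads to incorrect decisions. In sequential settings, even a single mistake can irreversibly degrade the trajectory, making hallucinations an even bigger problem. Larger LLMs hallucinate less, but their per-token cost is significantly higher. This paper addresses this tradeoff by proposing ReDAct (Reason-Defer-Act). In ReDAct, an agent is equipped with two LLMs: a small, cheap model used by default, and a large, more reliable but expensive model. When the predictive uncertainty of the small model exceeds a calibrated threshold, the decision is deferred to the large model. The approach is evaluated in text-based embodied environments such as ALFWorld and MiniGrid, showing that deferring only about 15% of decisions to the large model can match the quality of using it exclusively while significantly reducing inference costs.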
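
The deferral rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which uncertainty measure or calibration procedure ReDAct uses, so the mean negative token log-probability and the quantile-style threshold below are assumptions, and every name here (`Policy`, `sequence_uncertainty`, `calibrate_threshold`, `act`) is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a policy maps an observation to
# (action text, log-probabilities of the generated tokens).
Policy = Callable[[str], Tuple[str, List[float]]]


def sequence_uncertainty(token_logprobs: Sequence[float]) -> float:
    """Mean negative log-probability of the generated tokens.

    An assumed proxy for predictive uncertainty; the paper may use another.
    """
    if not token_logprobs:
        return float("inf")  # no evidence at all: treat as maximally uncertain
    return -sum(token_logprobs) / len(token_logprobs)


def calibrate_threshold(scores: Sequence[float], defer_rate: float) -> float:
    """Pick a threshold so that roughly `defer_rate` of the calibration
    scores lie strictly above it (assumed calibration scheme)."""
    if not scores:
        raise ValueError("need at least one calibration score")
    ordered = sorted(scores)
    n = len(ordered)
    n_defer = min(n, max(0, round(defer_rate * n)))
    if n_defer == n:
        return float("-inf")  # defer everything
    # The top `n_defer` scores sit above this element (ties may reduce the count).
    return ordered[n - n_defer - 1]


def act(observation: str, small: Policy, large: Policy,
        threshold: float) -> Tuple[str, bool]:
    """Return (action, deferred). The small model acts by default; the
    decision goes to the large model only when uncertainty exceeds the threshold."""
    action, logprobs = small(observation)
    if sequence_uncertainty(logprobs) > threshold:
        action, _ = large(observation)
        return action, True
    return action, False
```

With a threshold calibrated for `defer_rate=0.15`, about 15% of calibration-set decisions would be routed to the large model, mirroring the deferral rate quoted in the abstract. Whether that rate carries over to new episodes depends on how well the calibration data matches them.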

Related papers list