Fugu-MT Paper Translation (Abstract): DLLM Agent: See Farther, Run Faster

Paper abstract: DLLM Agent: See Farther, Run Faster

arxiv url: http://arxiv.org/abs/2602.07451v2
Date: Tue, 10 Feb 2026 02:35:16 GMT
Status: Translation complete
Last updated in system: 2026-02-11 15:31:42.93282
Title: DLLM Agent: See Farther, Run Faster
Authors: Huiling Zhen, Weizhe Lin, Renxi Liu, Kai Han, Yiming Li, Yuchuan Tian, Hanting Chen, Xiaoguang Li, Xiaosong Li, Chen Chen, Xianzhi Yu, Mingxuan Yuan, Youliang Yan, Peifeng Qin, Jun Wang, Yu Wang, Dacheng Tao, Yunhe Wang
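Abstract summary: Diffusion large language models (DLLMs) are an alternative to autoregressive (AR) decoding with appealing efficiency and modeling properties. The authors study them in a controlled setting by instantiating DLLM and AR backbones within the same agent workflow. DLLM Agents are on average over 30% faster than AR agents, with some cases exceeding 8x speedup.
Reference score (independently computed interest level): 94.74432470237817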
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion large language models (DLLMs) have emerged as an alternative to autoregressive (AR) decoding with appealing efficiency and modeling properties, yet their implications for agentic multi-step decision making remain underexplored. We ask a concrete question: when the generation paradigm is changed but the agent framework and supervision are held fixed, do diffusion backbones induce systematically different planning and tool-use behaviors, and do these differences translate into end-to-end efficiency gains? We study this in a controlled setting by instantiating DLLM and AR backbones within the same agent workflow (DeepDiver) and performing matched agent-oriented fine-tuning on the same trajectory data, yielding diffusion-backed DLLM Agents and directly comparable AR agents. Across benchmarks and case studies, we find that, at comparable accuracy, DLLM Agents are on average over 30% faster end to end than AR agents, with some cases exceeding 8x speedup. Conditioned on correct task completion, DLLM Agents also require fewer interaction rounds and tool invocations, consistent with higher planner hit rates that converge earlier to a correct action path with less backtracking. We further identify two practical considerations for deploying diffusion backbones in tool-using agents. First, naive DLLM policies are more prone to structured tool-call failures, necessitating stronger tool-call-specific training to emit valid schemas and arguments. Second, for multi-turn inputs interleaving context and action spans, diffusion-style span corruption requires aligned attention masking to avoid spurious context-action information flow; without such alignment, performance degrades. Finally, we analyze attention dynamics across workflow stages and observe paradigm-specific coordination patterns, suggesting stronger global planning signals in diffusion-backed agents.
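Illustration (not from the paper): the abstract notes that naive DLLM policies are prone to structured tool-call failures, meaning outputs that do not form a valid schema with valid arguments. The minimal sketch below shows what such a check could look like. The tool name `web_search`, its arguments, and the JSON layout (`name` plus `arguments`) are hypothetical assumptions chosen for illustration; the paper's actual tool-call format is not given in the abstract.

```python
import json

# Hypothetical tool schema (an assumption, not from the paper):
# tool name -> required argument names and their expected types.
TOOL_SCHEMAS = {
    "web_search": {"query": str, "top_k": int},
}


def validate_tool_call(raw: str) -> list[str]:
    """Return the problems found in a raw tool-call string (empty list if valid)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Truncated or malformed output, e.g. an unclosed brace.
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["tool call must be a JSON object"]

    name = call.get("name")
    if name not in TOOL_SCHEMAS:
        return [f"unknown tool: {name!r}"]

    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["'arguments' must be a JSON object"]

    schema = TOOL_SCHEMAS[name]
    problems = []
    for key, expected in schema.items():
        if key not in args:
            problems.append(f"missing argument: {key}")
        elif type(args[key]) is not expected:
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    for key in args:
        if key not in schema:
            problems.append(f"unexpected argument: {key}")
    return problems
```

A well-formed call returns an empty list, while a truncated string or a call with a wrongly typed argument returns one or more messages. An agent loop could use a check like this to count structured failures separately from wrong but well-formed actions.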

Related papers