Fugu-MT Paper Translation (Abstract): MAXS: Meta-Adaptive Exploration with LLM Agents

Paper abstract: MAXS: Meta-Adaptive Exploration with LLM Agents

arxiv url: http://arxiv.org/abs/2601.09259v1
Date: Wed, 14 Jan 2026 07:48:00 GMT
Status: Translation complete
Last updated in system: 2026-01-15 18:59:20.319649
Title: MAXS: Meta-Adaptive Exploration with LLM Agents
Title (reference translation): MAXS: Meta-Adaptive Exploration with LLM Agents
Authors: Jian Zhang, Zhiyuan Wang, Zhangqi Wang, Yu He, Haoran Luo, Li Yuan, Lingling Zhang, Rui Mao, Qika Lin, Jun Liu
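Abstract summary: MAXS is a meta-adaptive reasoning framework based on Large Language Model (LLM) Agents. MAXS employs a lookahead strategy to extend reasoning paths a few steps ahead. It combines step consistency variance and inter-step trend slopes to jointly select stable, consistent, and high-value reasoning steps.
Reference score (independently computed attention score): 48.04723638253802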
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Model (LLM) Agents exhibit inherent reasoning abilities through the collaboration of multiple tools. However, during agent inference, existing methods often suffer from (i) locally myopic generation, due to the absence of lookahead, and (ii) trajectory instability, where minor early errors can escalate into divergent reasoning paths. These issues make it difficult to balance global effectiveness and computational efficiency. To address these two issues, we propose meta-adaptive exploration with LLM agents (MAXS; https://github.com/exoskeletonzj/MAXS), a meta-adaptive reasoning framework based on LLM Agents that flexibly integrates tool execution and reasoning planning. MAXS employs a lookahead strategy to extend reasoning paths a few steps ahead, estimating the advantage value of tool usage, and combines step consistency variance and inter-step trend slopes to jointly select stable, consistent, and high-value reasoning steps. Additionally, we introduce a trajectory convergence mechanism that controls computational cost by halting further rollouts once path consistency is achieved, enabling a balance between resource efficiency and global effectiveness in multi-tool reasoning. We conduct extensive empirical studies across three base models (MiMo-VL-7B, Qwen2.5-VL-7B, Qwen2.5-VL-32B) and five datasets, demonstrating that MAXS consistently outperforms existing methods in both performance and inference efficiency. Further analysis confirms the effectiveness of our lookahead strategy and tool usage.
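Abstract (reference translation): Large language model (LLM) agents exhibit inherent reasoning abilities through the collaboration of multiple tools. During agent inference, however, existing methods often suffer from (i) locally myopic generation caused by the absence of lookahead, and (ii) trajectory instability, in which minor early errors can escalate into divergent reasoning paths. These issues make it difficult to balance global effectiveness and computational efficiency. To address these two issues, we propose MAXS (https://github.com/exoskeletonzj/MAXS), a meta-adaptive reasoning framework based on LLM agents that flexibly integrates tool execution and reasoning planning. MAXS uses a lookahead strategy to extend reasoning paths a few steps ahead and estimate the advantage value of tool usage, and combines step consistency variance with inter-step trend slopes to jointly select stable, consistent, high-value reasoning steps. We further introduce a trajectory convergence mechanism that controls computational cost by halting further rollouts once path consistency is achieved, balancing resource efficiency and global effectiveness in multi-tool reasoning. We conduct extensive experiments across three base models (MiMo-VL-7B, Qwen2.5-VL-7B, Qwen2.5-VL-32B) and five datasets, showing that MAXS consistently outperforms existing methods in both performance and inference efficiency. Further analysis confirms the effectiveness of our lookahead strategy and tool usage.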
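
The abstract names three signals for choosing a reasoning step (an advantage estimate from lookahead, step consistency variance, and an inter-step trend slope) plus a convergence check that halts rollouts. The abstract does not give formulas, so the following Python sketch is only one plausible reading, not the authors' implementation. The function names, the additive scoring rule, the weights, and the tolerance are all hypothetical, and the per-step values are assumed to come from some external value estimator.

```python
from statistics import mean, pvariance


def trend_slope(values):
    """Least-squares slope of values against step index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def score_candidate(lookahead_values, current_value,
                    var_weight=1.0, slope_weight=1.0):
    """Hypothetical score: reward advantage and upward trend, penalize variance."""
    # Assumed advantage: value at the end of the lookahead minus the current value.
    advantage = lookahead_values[-1] - current_value
    variance = pvariance(lookahead_values)
    slope = trend_slope(lookahead_values)
    return advantage - var_weight * variance + slope_weight * slope


def select_step(candidates, current_value):
    """Pick the candidate name whose lookahead values score highest."""
    return max(candidates,
               key=lambda name: score_candidate(candidates[name], current_value))


def has_converged(rollout_values, tol=0.05):
    """Assumed stopping rule: rollouts agree within tol, so stop expanding."""
    return max(rollout_values) - min(rollout_values) <= tol


# Two candidate steps with assumed value estimates over a 3-step lookahead.
candidates = {
    "steady": [0.5, 0.6, 0.7],  # low variance, rising trend
    "noisy": [0.9, 0.1, 0.7],   # same final value, but unstable
}
best = select_step(candidates, current_value=0.4)
```

In this toy input both candidates end at the same value, so the advantage term ties and the variance and slope terms decide the choice. That is the behavior the abstract attributes to combining the signals, though the actual weighting in MAXS may differ.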


Related papers