Fugu-MT Paper Translation (Abstract): MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper overview: MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

arxiv url: http://arxiv.org/abs/2511.11793v2
Date: Tue, 18 Nov 2025 15:45:29 GMT
Status: Translation complete
Last updated in system: 2025-11-19 13:59:16.69758
Title: MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
Title (reference translation): MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents Through Model, Context, and Interactive Scaling
Authors: MiroMind Team, Song Bai, Lidong Bing, Carson Chen, Guanzheng Chen, Yuntao Chen, Zhe Chen, Ziyi Chen, Jifeng Dai, Xuan Dong, Wenhan Dou, Yue Deng, Yunjie Fu, Junqi Ge, Chenxia Han, Tammy Huang, Zhenhang Huang, Jerry Jiao, Shilei Jiang, Tianyu Jiao, Xiaoqi Jian, Lei Lei, Ruilin Li, Ryan Luo, Tiantong Li, Xiang Lin, Ziyuan Liu, Zhiqi Li, Jie Ni, Qiang Ren, Pax Sun, Shiqian Su, Chenxin Tao, Bin Wang, Hellen Wang, Haonan Wang, James Wang, Jin Wang, Jojo Wang, Letian Wang, Shizun Wang, Weizhi Wang, Zixuan Wang, Jinfan Xu, Sen Xing, Chenyu Yang, Hai Ye, Jiaheng Yu, Yue Yu, Muyan Zhong, Tianchen Zhao, Xizhou Zhu, Yanpeng Zhou, Yifan Zhang, Zhi Zhu
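Abstract summary: MiroThinker is an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that scale up only model size or context length, MiroThinker explores interaction scaling at the model level.
Reference score (independently calculated attention score): 115.74855199827596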
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. Unlike LLM test-time scaling, which operates in isolation and risks degradation with longer reasoning chains, interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories. Through reinforcement learning, the model achieves efficient interaction scaling: with a 256K context window, it can perform up to 600 tool calls per task, enabling sustained multi-turn reasoning and complex real-world research workflows. Across four representative benchmarks (GAIA, HLE, BrowseComp, and BrowseComp-ZH), the 72B variant achieves up to 81.9%, 37.7%, 47.1%, and 55.6% accuracy respectively, surpassing previous open-source agents and approaching commercial counterparts such as GPT-5-high. Our analysis reveals that MiroThinker benefits from interactive scaling consistently: research performance improves predictably as the model engages in deeper and more frequent agent-environment interactions, demonstrating that interaction depth exhibits scaling behaviors analogous to model size and context length. These findings establish interaction scaling as a third critical dimension for building next-generation open research agents, complementing model capacity and context windows.
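Abstract (reference translation): We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that scale up only model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. LLM test-time scaling operates in isolation and risks degradation with longer reasoning chains, whereas interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories. With a 256K context window, the model can perform up to 600 tool calls per task, enabling sustained multi-turn reasoning and complex real-world research workflows. On the four benchmarks GAIA, HLE, BrowseComp, and BrowseComp-ZH, it achieves up to 81.9%, 37.7%, 47.1%, and 55.6% accuracy respectively, surpassing previous open-source agents and approaching commercial counterparts such as GPT-5-high. As the model engages in deeper and more frequent agent-environment interactions, interaction depth is shown to exhibit scaling behavior analogous to model size and context length. These findings establish interaction scaling as a third critical dimension for building next-generation open research agents, complementing model capacity and context windows.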


Related papers