Fugu-MT Paper Translation (Summary): ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning

Paper summary: ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning

arxiv url: http://arxiv.org/abs/2510.07768v1
Date: Thu, 09 Oct 2025 04:11:16 GMT
Status: Translation complete
Last updated in system: 2025-10-10 17:54:14.860989
Title: ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning
Authors: Murong Yue, Zhiwei Liu, Liangwei Yang, Jianguo Zhang, Zuxin Liu, Haolin Chen, Ziyu Yao, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang
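Abstract summary: Large Language Models (LLMs) equipped with external tools have demonstrated enhanced performance on complex reasoning tasks. Widespread adoption of this tool-augmented reasoning is hindered by the scarcity of domain-specific tools. We propose a systematic approach to automatically refactor an unstructured collection of tools into a structured tool library.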
Reference score (independently computed attention measure): 80.10274552177096
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) equipped with external tools have demonstrated enhanced performance on complex reasoning tasks. The widespread adoption of this tool-augmented reasoning is hindered by the scarcity of domain-specific tools. For instance, in domains such as physics question answering, suitable and specialized tools are often missing. Recent work has explored automating tool creation by extracting reusable functions from Chain-of-Thought (CoT) reasoning traces; however, these approaches face a critical scalability bottleneck. As the number of generated tools grows, storing them in an unstructured collection leads to significant retrieval challenges, including an expanding search space and ambiguity between function-related tools. To address this, we propose a systematic approach to automatically refactor an unstructured collection of tools into a structured tool library. Our system first generates discrete, task-specific tools and clusters them into semantically coherent topics. Within each cluster, we introduce a multi-agent framework to consolidate scattered functionalities: a code agent refactors code to extract shared logic and creates versatile, aggregated tools, while a reviewing agent ensures that these aggregated tools maintain the complete functional capabilities of the original set. This process transforms numerous question-specific tools into a smaller set of powerful, aggregated tools without loss of functionality. Experimental results demonstrate that our approach significantly improves tool retrieval accuracy and overall reasoning performance across multiple reasoning tasks. Furthermore, our method shows enhanced scalability compared with baselines as the number of question-specific tools increases.
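The pipeline described in the abstract (generate question-specific tools, cluster them by topic, aggregate each cluster, then review the result) can be illustrated with a toy sketch. This is not the authors' implementation: the `Tool` record, the hand-assigned `topic` labels (standing in for semantic clustering), the `mode`-dispatching aggregate (standing in for the LLM code agent), and the example-replay check (standing in for the reviewing agent) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Tool:
    """A question-specific tool with one recorded input/output example (hypothetical record)."""
    name: str
    topic: str                       # hand-assigned label; a real system would cluster semantically
    fn: Callable[..., float]
    example: Tuple[tuple, float]     # (arguments, expected result)


def cluster_by_topic(tools: List[Tool]) -> Dict[str, List[Tool]]:
    """Group tools that share a topic label."""
    clusters: Dict[str, List[Tool]] = {}
    for tool in tools:
        clusters.setdefault(tool.topic, []).append(tool)
    return clusters


def aggregate(cluster: List[Tool]) -> Callable[..., float]:
    """Stand-in for the code agent: expose a whole cluster as one tool selected by `mode`."""
    table = {tool.name: tool.fn for tool in cluster}

    def aggregated(mode: str, *args: float) -> float:
        return table[mode](*args)

    return aggregated


def review(cluster: List[Tool], aggregated: Callable[..., float]) -> bool:
    """Stand-in for the reviewing agent: the aggregate must reproduce every recorded example."""
    for tool in cluster:
        args, expected = tool.example
        if abs(aggregated(tool.name, *args) - expected) > 1e-9:
            return False
    return True


def build_library(tools: List[Tool]) -> Dict[str, Callable[..., float]]:
    """Keep one reviewed, aggregated tool per topic."""
    library: Dict[str, Callable[..., float]] = {}
    for topic, cluster in cluster_by_topic(tools).items():
        candidate = aggregate(cluster)
        if review(cluster, candidate):
            library[topic] = candidate
    return library


# Three question-specific tools collapse into two aggregated library entries.
tools = [
    Tool("kinetic_energy", "mechanics", lambda m, v: 0.5 * m * v ** 2, ((2.0, 3.0), 9.0)),
    Tool("potential_energy", "mechanics", lambda m, h: m * 9.8 * h, ((1.0, 10.0), 98.0)),
    Tool("ohm_voltage", "circuits", lambda i, r: i * r, ((2.0, 5.0), 10.0)),
]
library = build_library(tools)
```

The retrieval benefit the abstract claims comes from the last step: a retriever searches over the keys of `library` (two topics here) rather than over every question-specific tool, while the review step guards against losing any original capability.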

Related papers