Fugu-MT Paper Translation (Summary): Tools are under-documented: Simple Document Expansion Boosts Tool Retrieval

Paper summary: Tools are under-documented: Simple Document Expansion Boosts Tool Retrieval

arxiv url: http://arxiv.org/abs/2510.22670v1
Date: Sun, 26 Oct 2025 13:17:01 GMT
Status: Translation complete
Last updated in system: 2025-10-28 15:28:15.317447
Title: Tools are under-documented: Simple Document Expansion Boosts Tool Retrieval
Authors: Xuan Lu, Haohang Huang, Rui Meng, Yaohui Jin, Wenjun Zeng, Xiaoyu Shen
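Abstract summary: Large language models (LLMs) have recently shown strong capabilities in tool use, but progress in tool retrieval is hindered by incomplete and heterogeneous tool documentation. We introduce Tool-DE, a new benchmark and framework that systematically enriches tool documentation with structured fields to enable more effective tool retrieval. We develop two models tailored for tool retrieval: Tool-Embed and Tool-Rank.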
Reference score (independently computed attention score): 36.93384080571354
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large Language Models (LLMs) have recently demonstrated strong capabilities in tool use, yet progress in tool retrieval remains hindered by incomplete and heterogeneous tool documentation. To address this challenge, we introduce Tool-DE, a new benchmark and framework that systematically enriches tool documentation with structured fields to enable more effective tool retrieval, together with two dedicated models, Tool-Embed and Tool-Rank. We design a scalable document expansion pipeline that leverages both open- and closed-source LLMs to generate, validate, and refine enriched tool profiles at low cost, producing large-scale corpora with 50k instances for embedding-based retrievers and 200k for rerankers. On top of this data, we develop two models specifically tailored for tool retrieval: Tool-Embed, a dense retriever, and Tool-Rank, an LLM-based reranker. Extensive experiments on ToolRet and Tool-DE demonstrate that document expansion substantially improves retrieval performance, with Tool-Embed and Tool-Rank achieving new state-of-the-art results on both benchmarks. We further analyze the contribution of individual fields to retrieval effectiveness, as well as the broader impact of document expansion on both training and evaluation. Overall, our findings highlight both the promise and limitations of LLM-driven document expansion, positioning Tool-DE, along with the proposed Tool-Embed and Tool-Rank, as a foundation for future research in tool retrieval.
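
The core idea in the abstract, enriching sparse tool documentation with structured fields before retrieval, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the field names (`function`, `when_to_use`, `tags`), the two example tools, and the bag-of-words cosine scorer, which stands in for a real retriever. It is not Tool-DE's actual schema, nor the Tool-Embed or Tool-Rank models.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(tool):
    """Join the original description with the hypothetical enriched fields."""
    parts = [tool["description"]]
    parts += [tool.get(field, "") for field in ("function", "when_to_use", "tags")]
    return " ".join(p for p in parts if p)


def rank(query, tools, use_expansion):
    """Return tool names ordered by similarity to the query, best first."""
    q = Counter(tokenize(query))
    scored = []
    for tool in tools:
        text = expand(tool) if use_expansion else tool["description"]
        scored.append((cosine(q, Counter(tokenize(text))), tool["name"]))
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical tools: get_wx has a terse, under-documented description.
TOOLS = [
    {"name": "get_wx", "description": "Gets wx data.",
     "function": "Fetch the current weather forecast for a city.",
     "when_to_use": "User asks about weather, temperature, or rain.",
     "tags": "weather forecast temperature"},
    {"name": "geo_lookup", "description": "Returns location data for an address.",
     "function": "Convert a street address into latitude and longitude.",
     "when_to_use": "User needs coordinates for an address.",
     "tags": "geocoding coordinates map"},
]

QUERY = "What is the weather forecast for this location?"
baseline = rank(QUERY, TOOLS, use_expansion=False)
expanded = rank(QUERY, TOOLS, use_expansion=True)
```

In this toy setup the terse `get_wx` description shares no terms with the query, so the baseline ranks `geo_lookup` first. Once the hypothetical enriched fields are appended, `get_wx` moves to the top. A dense retriever would replace the term-count vectors with learned embeddings, but the expansion step itself would look the same.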


Related papers