Related papers: ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers

URL: http://arxiv.org/abs/2510.19791v1
Date: Wed, 22 Oct 2025 17:26:05 GMT
Title: ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers
Authors: Saptarshi Sengupta, Zhengyu Zhou, Jun Araki, Xingbo Wang, Bingqing Wang, Suhang Wang, Zhe Feng
Abstract summary: Existing retrieval models rank tools based on the similarity between a user query and a tool description (TD). This leads to suboptimal retrieval as user requests are often poorly aligned with the language of TD. We propose ToolDreamer, a framework to condition retriever models to fetch tools based on hypothetical (synthetic) TD.
Score: 33.08308979741825
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tool calling has become increasingly popular for Large Language Models (LLMs). However, for large tool sets, the resulting tokens would exceed the LLM's context window limit, making it impossible to include every tool. Hence, an external retriever is used to provide LLMs with the most relevant tools for a query. Existing retrieval models rank tools based on the similarity between a user query and a tool description (TD). This leads to suboptimal retrieval as user requests are often poorly aligned with the language of TD. To remedy the issue, we propose ToolDreamer, a framework to condition retriever models to fetch tools based on hypothetical (synthetic) TD generated using an LLM, i.e., description of tools that the LLM feels will be potentially useful for the query. The framework enables a more natural alignment between queries and tools within the language space of TD's. We apply ToolDreamer on the ToolRet dataset and show that our method improves the performance of sparse and dense retrievers with and without training, thus showcasing its flexibility. Through our proposed framework, our aim is to offload a portion of the reasoning burden to the retriever so that the LLM may effectively handle a large collection of tools without inundating its context window.
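
The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the tool catalogue, the query, the bag-of-words scorer, the canned `dream_tool_descriptions` function (a hypothetical stand-in for an LLM call), and the max-over-descriptions aggregation are all assumptions chosen to keep the example self-contained. It only shows why matching in the language of tool descriptions can help when the query shares few words with the right tool.

```python
import math
import re
from collections import Counter

# Toy tool catalogue (hypothetical names and descriptions, not from ToolRet).
TOOLS = {
    "currency_converter": "Convert an amount of money from one currency "
                          "to another using current exchange rates.",
    "weather_forecast": "Return the weather forecast for a city, "
                        "including temperature and rain.",
    "flight_search": "Search for flights between two airports on a given date.",
}

QUERY = "How many euros will I get for 200 bucks?"


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_tools(text, tools):
    """Baseline: rank tool names by similarity between `text` and each TD."""
    vec = bag_of_words(text)
    scored = [(cosine(vec, bag_of_words(desc)), name)
              for name, desc in tools.items()]
    # Best score first; ties broken alphabetically for determinism.
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


def dream_tool_descriptions(query):
    """Hypothetical stand-in for an LLM that writes synthetic TDs.

    A real system would prompt an LLM here; the canned answer is an
    assumption made so the example runs offline.
    """
    canned = {
        QUERY: ["A tool that converts money from one currency to another "
                "using exchange rates."],
    }
    return canned.get(query, [query])


def retrieve_with_hypothetical_tds(query, tools, k=1,
                                   dream=dream_tool_descriptions):
    """Rank tools by their best match against any hypothetical TD.

    Taking the maximum over hypothetical TDs is an assumption of this
    sketch, not a detail stated in the abstract.
    """
    dreamed = [bag_of_words(td) for td in dream(query)]
    scored = []
    for name, desc in tools.items():
        desc_vec = bag_of_words(desc)
        best = max(cosine(d, desc_vec) for d in dreamed)
        scored.append((best, name))
    ranked = sorted(scored, key=lambda p: (-p[0], p[1]))
    return [name for _, name in ranked[:k]]


if __name__ == "__main__":
    print("query-only ranking:", rank_tools(QUERY, TOOLS))
    print("hypothetical-TD top-1:", retrieve_with_hypothetical_tds(QUERY, TOOLS))
```

In this toy setup the raw query shares no words with the currency tool's description, so the query-only baseline cannot rank it first, while the synthetic description overlaps heavily with it. A dense retriever would replace `bag_of_words` and `cosine` with an embedding model; that swap is left out to avoid assuming any particular library.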

Related papers

Gecko: A Simulation Environment with Stateful Feedback for Refining Agent Tool Calls [56.407063247662336]
We introduce Gecko, a comprehensive environment that simulates tool responses using a combination of rules and LLMs. GATS consistently improves the tool calling performance of various LLMs including GPT-4o, GPT-5, and Gemini-3.0-pro.
arXiv Detail & Related papers (2026-02-22T15:02:00Z)
MassTool: A Multi-Task Search-Based Tool Retrieval Framework for Large Language Models [45.63804847907601]
MassTool is a multi-task search-based framework designed to enhance both query representation and tool retrieval accuracy. It employs a two-tower architecture: a tool usage detection tower that predicts the need for function calls, and a tool retrieval tower that leverages a query-centric graph convolution network (QC-GCN) for effective query-tool matching. By jointly optimizing tool usage detection loss, list-wise retrieval loss, and contrastive regularization loss, MassTool establishes a robust dual-step sequential decision-making pipeline for precise query understanding.
arXiv Detail & Related papers (2025-07-01T07:02:26Z)
Improving Tool Retrieval by Leveraging Large Language Models for Query Generation [16.7926347207647]
In-context learning can provide a short list of relevant tools in the prompt. We propose leveraging Large Language Models (LLMs) to generate a retrieval query. The generated query is embedded and used to find the most relevant tools via a nearest-neighbor search.
arXiv Detail & Related papers (2024-11-17T03:02:09Z)
Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models. We create ToolBank, a new tool retrieval dataset that reflects real human user usages. With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
Enhancing Tool Retrieval with Iterative Feedback from Large Language Models [9.588592185027455]
Large language models (LLMs) can effectively handle a certain amount of tools through in-context learning or fine-tuning. In real-world scenarios, the number of tools is typically extensive and irregularly updated, emphasizing the necessity for a dedicated tool retrieval component. We propose to enhance tool retrieval with iterative feedback from the large language model.
arXiv Detail & Related papers (2024-06-25T11:12:01Z)
Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models. Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions. We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning [57.523454568002144]
Large language models (LLMs) have shown capabilities in commonsense reasoning and leveraging external tools. We introduce ToolRec, a framework for LLM-empowered recommendations via tool learning. We formulate the recommendation process as a process aimed at exploring user interests in attribute granularity. We consider two types of attribute-oriented tools: rank tools and retrieval tools.
arXiv Detail & Related papers (2024-05-24T00:06:54Z)
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning [40.32823306537386]
We propose MLLM-Tool, a system incorporating open-source large language models and multi-modal encoders. Our dataset features multi-modal input tools from HuggingFace. Experiments reveal that our MLLM-Tool is capable of recommending appropriate tools for multi-modal instructions.
arXiv Detail & Related papers (2024-01-19T14:44:37Z)
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction. It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [79.87054552116443]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities. We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools. We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities. We introduce ToolLLM, a general tool-use framework encompassing data construction, model training, and evaluation. We first present ToolBench, an instruction-tuning dataset for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.