Advancing and Benchmarking Personalized Tool Invocation for LLMs
- URL: http://arxiv.org/abs/2505.04072v1
- Date: Wed, 07 May 2025 02:25:20 GMT
- Title: Advancing and Benchmarking Personalized Tool Invocation for LLMs
- Authors: Xu Huang, Yuefeng Huang, Weiwen Liu, Xingshan Zeng, Yasheng Wang, Ruiming Tang, Hong Xie, Defu Lian
- Abstract summary: We introduce the concept of Personalized Tool Invocation and define two key tasks: Tool Preference and Profile-dependent Query. To tackle these challenges, we propose PTool, a data synthesis framework designed for personalized tool invocation. We construct PTBench, the first benchmark for evaluating personalized tool invocation.
- Score: 66.39214525683425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tool invocation is a crucial mechanism for extending the capabilities of Large Language Models (LLMs) and has recently garnered significant attention. It enables LLMs to solve complex problems through tool calls while accessing up-to-date world knowledge. However, existing work primarily focuses on the fundamental ability of LLMs to invoke tools for problem-solving, without considering personalized constraints in tool invocation. In this work, we introduce the concept of Personalized Tool Invocation and define two key tasks: Tool Preference and Profile-dependent Query. Tool Preference addresses user preferences when selecting among functionally similar tools, while Profile-dependent Query considers cases where a user query lacks certain tool parameters, requiring the model to infer them from the user profile. To tackle these challenges, we propose PTool, a data synthesis framework designed for personalized tool invocation. Additionally, we construct PTBench, the first benchmark for evaluating personalized tool invocation. We then fine-tune various open-source models, demonstrating the effectiveness of our framework and providing valuable insights. Our benchmark is public at https://github.com/hyfshadow/PTBench.
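To make the two tasks concrete, here is a minimal, hypothetical sketch of how a personalized tool call might be resolved. The tool names, profile fields, and selection logic are invented for illustration and are not taken from PTool or PTBench; the sketch only shows the idea that Tool Preference picks among functionally similar tools, while Profile-dependent Query fills parameters missing from the query using the user profile.

```python
# Hypothetical illustration of the two tasks; tool names, schemas, and field
# names are invented and do not come from PTool or PTBench.
from typing import Any

# Two functionally similar tools; Tool Preference decides which one to call.
WEATHER_TOOLS = ["weather_api_a", "weather_api_b"]

user_profile: dict[str, Any] = {
    "home_city": "Berlin",                            # fills a missing parameter
    "preferred_tools": {"weather": "weather_api_b"},  # drives Tool Preference
}

def resolve_tool_call(query: str, profile: dict[str, Any]) -> dict[str, Any]:
    """Resolve "What's the weather like tomorrow?" into a concrete tool call.

    Tool Preference: choose among functionally similar weather tools using the
    stored preference. Profile-dependent Query: the query omits the city, so it
    is inferred from the profile instead of being asked back to the user.
    In PTool/PTBench the model itself performs this mapping; here the mapping
    is hard-coded for a single example query.
    """
    tool = profile.get("preferred_tools", {}).get("weather", WEATHER_TOOLS[0])
    city = profile.get("home_city")  # parameter absent from the query text
    return {"tool": tool, "arguments": {"city": city, "date": "tomorrow"}}

print(resolve_tool_call("What's the weather like tomorrow?", user_profile))
# {'tool': 'weather_api_b', 'arguments': {'city': 'Berlin', 'date': 'tomorrow'}}
```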
Related papers
- TAPS: Tool-Augmented Personalisation via Structured Tagging [0.7007504690449126]
This work investigates how user preferences can be effectively integrated into goal-oriented dialogue agents. We introduce TAPS, a novel solution that enhances personalised tool use by leveraging a structured tagging tool and an uncertainty-based tool detector.
arXiv Detail & Related papers (2025-06-25T13:24:46Z)
- ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models [48.276461194773354]
We introduce ToolSpectrum, a benchmark designed to evaluate large language models' capabilities in personalized tool utilization. We formalize two key dimensions of personalization, user profile and environmental factors, and analyze their individual and synergistic impacts on tool utilization. Our findings underscore the necessity of context-aware personalization in tool-augmented LLMs and reveal critical limitations for current models.
arXiv Detail & Related papers (2025-05-19T14:30:46Z)
- PEToolLLM: Towards Personalized Tool Learning in Large Language Models [21.800332388883465]
We formulate the task of personalized tool learning, which integrates a user's interaction history towards personalized tool usage. We construct PEToolBench, featuring diverse user preferences reflected in interaction history under three distinct personalized settings. We propose a framework, PEToolLLaMA, to adapt LLMs to the personalized tool learning task.
arXiv Detail & Related papers (2025-02-26T09:43:08Z)
- Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space. MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
- PTR: Precision-Driven Tool Recommendation for Large Language Models [43.53494041932615]
We propose a Precision-driven Tool Recommendation (PTR) approach for Large Language Models (LLMs).
PTR captures an initial, concise set of tools by leveraging historical tool bundle usage and dynamically adjusts the tool set by performing tool matching.
We present a new dataset, RecTools, and a metric, TRACC, designed to evaluate the effectiveness of tool recommendation for LLMs.
arXiv Detail & Related papers (2024-11-14T17:33:36Z)
- Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which not only captures the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [79.87054552116443]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities. We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools. We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
- Large Language Models as Tool Makers [85.00361145117293]
We introduce a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
Our approach consists of two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for a set of tasks. 2) tool using: another LLM acts as the tool user, which applies the tool built by the tool maker for problem-solving (see the sketch below).
arXiv Detail & Related papers (2023-05-26T17:50:11Z)
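As a closing illustration, here is a minimal sketch of the two-phase tool-maker / tool-user loop described in the LATM entry above. The prompts and the plain text-in/text-out callable interface are assumptions made for illustration, not the paper's implementation.

```python
# Sketch of a LATM-style loop; prompts and interfaces are illustrative
# assumptions, not the paper's code.
from typing import Callable

# Model both roles as text-in/text-out callables so no specific LLM API is
# assumed; any chat-completion wrapper can be plugged in.
LLM = Callable[[str], str]

def make_tool(tool_maker: LLM, task_examples: list[str]) -> str:
    """Phase 1 (tool making): a stronger LLM writes a reusable Python utility
    for a family of tasks and returns it as source code."""
    prompt = (
        "Write a self-contained Python function solve(task: str) -> str "
        "that solves tasks like these:\n" + "\n".join(task_examples)
    )
    return tool_maker(prompt)

def use_tool(tool_user: LLM, tool_code: str, new_task: str) -> str:
    """Phase 2 (tool using): a lighter LLM is handed the finished tool and
    applies it to a new task instance."""
    prompt = (
        "You are given this Python tool:\n" + tool_code
        + "\nUse it to solve the task below and return only the answer.\n"
        + new_task
    )
    return tool_user(prompt)
```

A natural motivation for such a split is cost amortization: the heavier tool-making model runs once per task family, while a lighter tool-user model handles each subsequent instance.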