Related papers: CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance

CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance

URL: http://arxiv.org/abs/2409.13202v2
Date: Mon, 23 Sep 2024 05:38:24 GMT
Title: CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance
Authors: Yupu Hao, Pengfei Cao, Zhuoran Jin, Huanxuan Liao, Yubo Chen, Kang Liu, Jun Zhao,
Abstract summary: We propose a Component-based Tool-utilizing ability Injection method (CITI) According to the gradient-based importance score of different components, CITI alleviates the capability conflicts caused by fine-tuning process. Experimental results demonstrate that our approach achieves outstanding performance across a range of evaluation metrics.
Score: 17.723293304671877
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tool learning enables the Large Language Models (LLMs) to interact with the external environment by invoking tools, enriching the accuracy and capability scope of LLMs. However, previous works predominantly focus on improving model's tool-utilizing accuracy and the ability to generalize to new, unseen tools, excessively forcing LLMs to adjust specific tool-invoking pattern without considering the harm to model's general performance. This deviates from the actual applications and original intention of integrating tools to enhance model. To tackle this problem, we dissect the capability trade-offs by examining the hidden representation changes and the gradient-based importance score of model's components. Based on the analysis result, we propose a Component Importance-based Tool-utilizing ability Injection method (CITI). According to the gradient-based importance score of different components, it alleviates the capability conflicts caused by fine-tuning process by applying distinct training strategies to different components. CITI applies Mixture-Of-LoRA (MOLoRA) for important components. Meanwhile, it fine-tunes the parameters of few components deemed less important in the backbone of the LLM, while keeping other parameters frozen. CITI can effectively enhance the model's tool-utilizing capability without excessively compromising its general performance. Experimental results demonstrate that our approach achieves outstanding performance across a range of evaluation metrics.

Related papers

ToolRL: Reward is All Tool Learning Needs [54.16305891389931]
Large Language Models (LLMs) often undergo supervised fine-tuning (SFT) to acquire tool use capabilities. Recent advancements in reinforcement learning (RL) have demonstrated promising reasoning and generalization abilities. We present the first comprehensive study on reward design for tool selection and application tasks within the RL paradigm.
arXiv Detail & Related papers (2025-04-16T21:45:32Z)
FamilyTool: A Multi-hop Personalized Tool Use Benchmark [94.1158032740113]
We introduce FamilyTool, a novel benchmark grounded in a family-based knowledge graph (KG) FamilyTool challenges Large Language Models with queries spanning 1 to 3 relational hops. Experiments reveal significant performance gaps in state-of-the-art LLMs.
arXiv Detail & Related papers (2025-04-09T10:42:36Z)
ToolACE-R: Tool Learning with Adaptive Self-Refinement [84.69651852838794]
Tool learning allows Large Language Models to leverage external tools for solving complex user tasks. We propose ToolACE-R, a novel method that introduces adaptive self-refinement for tool invocations. Our results demonstrate the effectiveness of the proposed method, which is compatible with base models of various sizes.
arXiv Detail & Related papers (2025-04-02T06:38:56Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo captures high-level cognitive signals in the representation space, guiding when to invoke tools. Our experiments show that MeCo accurately detects LLMs' internal cognitive signals and significantly improves tool-use decision-making.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Reducing Tool Hallucination via Reliability Alignment [31.761771794788462]
Large Language Models (LLMs) have expanded their capabilities beyond language generation to interact with external tools, enabling automation and real-world applications. Tool hallucinations, where models either select inappropriate tools or misuse them, pose significant challenges, leading to erroneous task execution, increased computational costs, and reduced system reliability. We introduce RelyToolBench, which integrates specialized test cases and novel metrics to assess hallucination-aware task success and efficiency. Finally, we propose Relign, a reliability alignment framework that expands the tool-use action space to include indecisive actions, allowing LLMs to defer tool use, seek clarification, or adjust tool selection
arXiv Detail & Related papers (2024-12-05T13:10:54Z)
Meta-Reasoning Improves Tool Use in Large Language Models [10.193264105560864]
External tools help large language models (LLMs) succeed at tasks where they would otherwise typically fail. We present Tool selECTion via meta-reasONing (TECTON), a two-phase system that first reasons over a task using a custom fine-tuned LM head and outputs candidate tools.
arXiv Detail & Related papers (2024-11-07T08:48:33Z)
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions [60.733557487886635]
This paper focuses on bridging the comprehension gap between Large Language Models and external tools. We propose a novel framework, DRAFT, aimed at Dynamically refining tool documentation. Extensive experiments on multiple datasets demonstrate that DRAFT's iterative, feedback-based refinement significantly ameliorates documentation quality.
arXiv Detail & Related papers (2024-10-10T17:58:44Z)
Learning Evolving Tools for Large Language Models [44.25796648300785]
We propose ToolEVO to enhance the adaptive and reflective capabilities of large language models (LLMs) against tool variability. By leveraging Monte Carlo Tree Search, ToolEVO facilitates active exploration and interaction of LLMs within dynamic environments. We also introduce ToolQA-D, a benchmark specifically designed to evaluate the impact of tool variability.
arXiv Detail & Related papers (2024-10-09T07:14:45Z)
LLM With Tools: A Survey [0.0]
This paper delves into the methodology,challenges, and developments in the realm of teaching LLMs to use external tools. We introduce a standardized paradigm for tool integration guided by a series of functions that map user instructions to actionable plans. Our exploration reveals the various challenges encountered, such as tool invocation timing, selection accuracy, and the need for robust reasoning processes.
arXiv Detail & Related papers (2024-09-24T14:08:11Z)
What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks [33.51887014808227]
This paper explores the impact of both internal and external factors on the performance of tool learning frameworks. We find several insightful conclusions for future work, including the observation that LLMs can benefit significantly from increased trial and exploration.
arXiv Detail & Related papers (2024-07-03T11:06:05Z)
Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models. Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions. We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning [75.68193159293425]
In-context learning (ICL) allows transformer-based language models to learn a specific task with a few "task demonstrations" without updating their parameters. We propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL. We experimentally prove the wide applicability of DETAIL by showing our attribution scores obtained on white-box models are transferable to black-box models in improving model performance.
arXiv Detail & Related papers (2024-05-22T15:52:52Z)
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models [10.452149013566157]
LM Transparency Tool (LM-TT) is an open-source interactive toolkit for analyzing the internal workings of Transformer-based language models. It shows the important part of the whole input-to-output information flow.
arXiv Detail & Related papers (2024-04-10T13:39:11Z)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs) It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks. Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
Variable Importance Matching for Causal Inference [73.25504313552516]
We describe a general framework called Model-to-Match that achieves these goals. Model-to-Match uses variable importance measurements to construct a distance metric. We operationalize the Model-to-Match framework with LASSO.
arXiv Detail & Related papers (2023-02-23T00:43:03Z)
Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images. Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.