Related papers: Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists

Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists

URL: http://arxiv.org/abs/2506.00042v1
Date: Wed, 28 May 2025 08:39:35 GMT
Title: Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists
Authors: Yue Cui, Liuyi Yao, Shuchang Tao, Weijie Shi, Yaliang Li, Bolin Ding, Xiaofang Zhou,
Abstract summary: We propose the Hierarchical Tool Error Checklist (HiTEC) framework to mitigate and diagnose tool-calling errors.<n>HiTEC introduces a two-tiered approach: a global error checklist that identifies common, cross-tool issues, and a local error checklist that targets tool-specific and contextual failures.<n>Our framework significantly improves parameter-filling accuracy and tool-calling success rates compared to baseline methods.
Score: 49.450160442348825
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have significantly advanced natural language processing, particularly through the integration of external tools and APIs. However, their effectiveness is frequently hampered by parameter mis-filling during tool calling. In this paper, we propose the Hierarchical Tool Error Checklist (HiTEC) framework to systematically diagnose and mitigate tool-calling errors without relying on extensive real-world interactions. HiTEC introduces a two-tiered approach: a global error checklist that identifies common, cross-tool issues, and a local error checklist that targets tool-specific and contextual failures. Building on this structure, we propose two deployments: HiTEC-In Context Learning (HiTEC-ICL) and HiTEC-Kahneman-Tversky Optimization (HiTEC-KTO). HiTEC-ICL embeds the global checklist in the initial prompts and leverages a two-round conversational interaction to dynamically refine parameter handling, while HiTEC-KTO generates high-quality negative examples to drive fine-tuning via preference-based optimization. Extensive experiments across five public datasets demonstrate that our framework significantly improves parameter-filling accuracy and tool-calling success rates compared to baseline methods.

Related papers

TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation [27.71948796412585]
Token-level Tool-use Preference Alignment Training Framework (TTPA)<n>TTPA is a training paradigm for constructing token-level tool-use preference datasets.
arXiv Detail & Related papers (2025-05-26T14:06:02Z)
Enhancing Large Language Models with Faster Code Preprocessing for Vulnerability Detection [0.0]
We build on the existing SCoPE framework and introduce SCoPE2, an enhanced version with improved performance.<n>Our results show a 97.3% reduction in processing time with SCoPE2, along with an improved F1-score for the Large Language Model (LLM) for vulnerability detection.
arXiv Detail & Related papers (2025-05-08T19:00:11Z)
Acting Less is Reasoning More! Teaching Model to Act Efficiently [87.28134636548705]
Tool-integrated reasoning augments large language models with the ability to invoke external tools to solve tasks.<n>Current approaches typically optimize only for final correctness without considering the efficiency or necessity of external tool use.<n>We propose a framework that encourages models to produce accurate answers with minimal tool calls.<n>Our approach reduces tool calls by up to 68.3% and improves tool productivity by up to 215.4%, while maintaining comparable answer accuracy.
arXiv Detail & Related papers (2025-04-21T05:40:05Z)
AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification [25.27444694706659]
We present AskToAct, which exploits structural mapping between queries and their tool invocation solutions.<n>Our key insight is that tool parameters naturally represent explicit user intents.<n>By systematically removing key parameters from queries while retaining them as ground truth, we enable automated construction of high-quality training data.
arXiv Detail & Related papers (2025-03-03T12:55:49Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use.<n>MeCo captures high-level cognitive signals in the representation space, guiding when to invoke tools.<n>Our experiments show that MeCo accurately detects LLMs' internal cognitive signals and significantly improves tool-use decision-making.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Rephrase and Contrast: Fine-Tuning Language Models for Enhanced Understanding of Communication and Computer Networks [13.829525575305206]
This paper introduces our Rephrase and Contrast (RaC) framework, an efficient fine-tuning framework. RaC enhances LLMs' comprehension and critical thinking abilities by incorporating question reformulation and contrastive analysis. To efficiently construct the dataset for RaC fine-tuning, we develop a GPT-assisted data mining method for generating high-quality question-answer pairs.
arXiv Detail & Related papers (2024-09-21T16:04:43Z)
Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards [4.334100270812517]
Large language models (LLMs) struggle with technical standards in telecommunications.<n>We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 small language model (SLM)<n>Our experiments demonstrate substantial improvements over existing question-answering approaches in the telecom domain.
arXiv Detail & Related papers (2024-08-21T17:00:05Z)
CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.<n>Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs)<n>We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
Correct Like Humans: Progressive Learning Framework for Chinese Text Error Correction [28.25789161365667]
Chinese Text Error Correction (CTEC) aims to detect and correct errors in the input text. Recent approaches mainly employ Pre-trained Language Models (PLMs) to resolve CTEC. We propose a novel model-agnostic progressive learning framework, named ProTEC, which guides PLMs-based CTEC models to learn to correct like humans.
arXiv Detail & Related papers (2023-06-30T07:44:49Z)
Optimizing Two-way Partial AUC with an End-to-end Framework [154.47590401735323]
Area Under the ROC Curve (AUC) is a crucial metric for machine learning. Recent work shows that the TPAUC is essentially inconsistent with the existing Partial AUC metrics. We present the first trial in this paper to optimize this new metric.
arXiv Detail & Related papers (2022-06-23T12:21:30Z)
High Dimensional Level Set Estimation with Bayesian Neural Network [58.684954492439424]
This paper proposes novel methods to solve the high dimensional Level Set Estimation problems using Bayesian Neural Networks. For each problem, we derive the corresponding theoretic information based acquisition function to sample the data points. Numerical experiments on both synthetic and real-world datasets show that our proposed method can achieve better results compared to existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-17T23:21:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.