Related papers: AutoFeedback: An LLM-based Framework for Efficient and Accurate API Request Generation

URL: http://arxiv.org/abs/2410.06943v1
Date: Wed, 9 Oct 2024 14:38:28 GMT
Title: AutoFeedback: An LLM-based Framework for Efficient and Accurate API Request Generation
Authors: Huanxi Liu, Jiaqi Liao, Dawei Feng, Kele Xu, Huaimin Wang
Abstract summary: AutoFeedback is a framework for efficient and accurate API request generation. It implements two feedback loops while the large language model generates API requests. It achieves an accuracy of 100.00% on a real-world API dataset and reduces the cost of interaction with GPT-3.5 Turbo by 23.44% and with GPT-4 Turbo by 11.85%.
Score: 16.590226868986296
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) leverage external tools primarily by generating API requests to enhance task completion efficiency. The accuracy of API request generation largely determines the capability of LLMs to accomplish tasks. Due to the inherent hallucinations of LLMs, it is difficult to generate the correct API request efficiently and accurately. Current research uses prompt-based feedback to facilitate LLM-based API request generation. However, existing methods lack factual information and are insufficiently detailed. To address these issues, we propose AutoFeedback, an LLM-based framework for efficient and accurate API request generation, with a Static Scanning Component (SSC) and a Dynamic Analysis Component (DAC). SSC incorporates errors detected in the API requests as pseudo-facts into the feedback, enriching the factual information. DAC retrieves information from API documentation, enhancing the level of detail in the feedback. Based on these two components, AutoFeedback implements two feedback loops during the process of generating API requests by the LLM. Extensive experiments demonstrate that it significantly improves the accuracy of API request generation and reduces the interaction cost. AutoFeedback achieves an accuracy of 100.00% on a real-world API dataset and reduces the cost of interaction with GPT-3.5 Turbo by 23.44% and with GPT-4 Turbo by 11.85%.
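The abstract only names the two components, so the following Python sketch is an illustration of the general idea rather than the authors' implementation. It assumes a hypothetical dictionary-based API spec (`API_SPEC`), a hypothetical `static_scan` that stands in for the SSC by reporting concrete errors in a request, a hypothetical `retrieve_docs` that stands in for the DAC by looking up documentation snippets for the parameters involved, and a toy `fake_llm` in place of a real model. For brevity it folds the paper's two feedback loops into a single loop.

```python
from typing import Callable, Optional

# Hypothetical API spec: parameter name -> (required?, documentation snippet).
API_SPEC = {
    "city": (True, "city: string, name of the city to query"),
    "units": (False, "units: 'metric' or 'imperial'"),
}


def static_scan(request: dict) -> list:
    """Return concrete errors found in a request (illustrative stand-in for the SSC)."""
    errors = []
    for name in request:
        if name not in API_SPEC:
            errors.append(f"unknown parameter '{name}'")
    for name, (required, _doc) in API_SPEC.items():
        if required and name not in request:
            errors.append(f"missing required parameter '{name}'")
    return errors


def retrieve_docs(errors: list) -> list:
    """Return doc snippets for spec parameters named in the errors (stand-in for the DAC)."""
    return [
        doc
        for name, (_required, doc) in API_SPEC.items()
        if any(f"'{name}'" in error for error in errors)
    ]


def generate_with_feedback(
    generate: Callable[[str], dict], task: str, max_rounds: int = 3
) -> tuple:
    """Regenerate until the scan finds no errors; return (request or None, rounds used)."""
    prompt = task
    for round_no in range(1, max_rounds + 1):
        request = generate(prompt)
        errors = static_scan(request)
        if not errors:
            return request, round_no
        # Feed the detected errors back as facts, plus any matching documentation.
        feedback = "Errors: " + "; ".join(errors)
        docs = retrieve_docs(errors)
        if docs:
            feedback += "\nDocs: " + "; ".join(docs)
        prompt = f"{task}\n{feedback}"
    return None, max_rounds


def fake_llm(prompt: str) -> dict:
    """Toy generator: wrong at first, corrected once the feedback names 'city'."""
    if "missing required parameter 'city'" in prompt:
        return {"city": "Paris", "units": "metric"}
    return {"location": "Paris", "units": "metric"}


result, rounds = generate_with_feedback(fake_llm, "Get the weather in Paris")
print(result, rounds)
```

In this toy run the first request uses a parameter that is not in the spec, the scan reports it together with the missing required parameter, and the second request passes the scan. A real system would replace `fake_llm` with a model call and `API_SPEC` with parsed API documentation.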

Related papers

APIRAT: Integrating Multi-source API Knowledge for Enhanced Code Translation with LLMs [6.522570957351905]
APIRAT is a novel code translation method that integrates multi-source API knowledge. APIRAT employs three API knowledge augmentation techniques, including API sequence retrieval, API sequence back-translation, and API mapping. Experiments indicate that APIRAT significantly surpasses existing LLM-based methods, achieving improvements in computational accuracy ranging from 4% to 15.1%.
arXiv Detail & Related papers (2025-04-21T04:24:49Z)
Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests. We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments. We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
arXiv Detail & Related papers (2025-02-03T18:35:42Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solutions. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over naive RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
SEAL: Suite for Evaluating API-use of LLMs [1.2528321519119252]
SEAL is an end-to-end testbed designed to evaluate large language models in real-world API usage. It standardizes existing benchmarks, integrates an agent system for testing API retrieval and planning, and addresses the instability of real-time APIs.
arXiv Detail & Related papers (2024-09-23T20:16:49Z)
A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development. Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability. Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
On Mitigating Code LLM Hallucinations with API Documentation [22.933186524255593]
We introduce CloudAPIBench, a new benchmark designed to measure API hallucination occurrences. We demonstrate that our proposed methods enhance the balance between low and high frequency API performance.
arXiv Detail & Related papers (2024-07-13T00:16:26Z)
A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking. It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence. Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
Octopus: On-device language model for function calling of software APIs [9.78611123915888]
Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities. This study introduces a new strategy aimed at harnessing on-device LLMs in invoking software APIs.
arXiv Detail & Related papers (2024-04-02T01:29:28Z)
Compositional API Recommendation for Library-Oriented Code Generation [23.355509276291198]
We propose CAPIR, which adopts a "divide-and-conquer" strategy to recommend APIs for coarse-grained requirements. We present two challenging benchmarks, RAPID (Recommend APIs based on Documentation) and LOCG (Library-Oriented Code Generation). Experimental results on these benchmarks demonstrate the effectiveness of CAPIR in comparison to existing baselines.
arXiv Detail & Related papers (2024-02-29T18:27:27Z)
APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs. Previous studies mainly modeled API recommendation as a recommendation task, and developers may still be unable to find what they need. Motivated by the neural machine translation research domain, we can model this problem as a generation task. We propose a novel approach, APICom, based on prompt learning, which can generate APIs related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z)
Adaptive REST API Testing with Reinforcement Learning [54.68542517176757]
Current testing tools lack efficient exploration mechanisms, treating all operations and parameters equally. Current tools struggle when response schemas are absent in the specification or exhibit variants. We present an adaptive REST API testing technique that incorporates reinforcement learning to prioritize operations during exploration.
arXiv Detail & Related papers (2023-09-08T20:27:05Z)
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs [84.45284695156771]
API-Bank is a groundbreaking benchmark for tool-augmented Large Language Models. We develop a runnable evaluation system consisting of 73 API tools. We construct a comprehensive training set containing 1,888 tool-use dialogues from 2,138 APIs spanning 1,000 distinct domains.
arXiv Detail & Related papers (2023-04-14T14:05:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.