Related papers: Reverse Chain: A Generic-Rule for LLMs to Master Multi-API Planning

URL: http://arxiv.org/abs/2310.04474v3
Date: Thu, 22 Feb 2024 09:53:02 GMT
Title: Reverse Chain: A Generic-Rule for LLMs to Master Multi-API Planning
Authors: Yinger Zhang, Hui Cai, Xeirui Song, Yicheng Chen, Rui Sun, Jing Zheng
Abstract summary: This paper introduces "Reverse Chain", a controllable, target-driven approach to empower Large Language Models with the capability to operate external APIs only via prompts. To manage controllable multi-function calling, Reverse Chain adopts a generic rule based on a backward reasoning process.
Score: 8.96245399645571
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While enabling large language models to implement function calling (known as APIs) can greatly enhance the performance of Large Language Models (LLMs), function calling is still a challenging task due to the complicated relations between different APIs, especially in a context-learning setting without fine-tuning. This paper introduces "Reverse Chain", a controllable, target-driven approach designed to empower LLMs with the capability to operate external APIs only via prompts. Recognizing that most LLMs have limited tool-use capabilities, Reverse Chain limits LLMs to executing simple tasks, e.g., API Selection and Argument Completion. Furthermore, to manage controllable multi-function calling, Reverse Chain adopts a generic rule based on a backward reasoning process. This rule determines when to do API Selection or Argument Completion. To evaluate the multi-tool-use capability of LLMs, we have released a compositional multi-tool task dataset, available at https://anonymous.4open.science/r/reverse-chain-8681. Extensive numerical experiments validate the remarkable proficiency of Reverse Chain in managing multiple API calls.
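
Illustrative sketch (not from the paper): the backward reasoning rule described in the abstract can be pictured as starting from the API that produces the final target and working backwards, filling each argument either from the user query or from the output of another API. The Python below is a minimal sketch under stated assumptions: the API registry, the slot names, and the functions select_api and complete_argument are hypothetical stand-ins for the two prompt-driven steps (API Selection and Argument Completion), implemented here as simple lookups rather than LLM calls.

```python
# Hypothetical API registry: name -> (argument names, implementation).
# These APIs are invented for illustration and do not come from the paper.
APIS = {
    "get_city_id": (["city_name"], lambda city_name: {"Paris": 75}[city_name]),
    "get_weather": (["city_id", "date"],
                    lambda city_id, date: f"weather({city_id}, {date})"),
}


def select_api(goal):
    """Stand-in for the API Selection prompt: pick the API that yields `goal`."""
    return {"weather": "get_weather", "city_id": "get_city_id"}[goal]


def complete_argument(arg, query_slots):
    """Stand-in for the Argument Completion prompt.

    Returns ("value", v) when the argument is present in the user query,
    otherwise ("api", arg), meaning another API has to produce it.
    """
    if arg in query_slots:
        return ("value", query_slots[arg])
    return ("api", arg)


def reverse_chain(goal, query_slots, trace):
    """Resolve `goal` backwards: select the final API, then fill its arguments."""
    api = select_api(goal)
    arg_names, fn = APIS[api]
    kwargs = {}
    for arg in arg_names:
        kind, payload = complete_argument(arg, query_slots)
        if kind == "value":
            kwargs[arg] = payload
        else:
            # The argument depends on another API, so recurse one step back.
            kwargs[arg] = reverse_chain(payload, query_slots, trace)
    trace.append(api)  # record execution order (dependencies first)
    return fn(**kwargs)


trace = []
result = reverse_chain(
    "weather", {"city_name": "Paris", "date": "2024-02-22"}, trace
)
print(result)  # weather(75, 2024-02-22)
print(trace)   # ['get_city_id', 'get_weather']
```

Planning runs from the target backwards, while execution ends up in dependency order: get_city_id is called before get_weather even though get_weather was selected first.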

Related papers

LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff, a novel framework for Automatic Prompt Engineering (APE). LLM-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback akin to textual gradients. It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solutions. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over naive RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making. Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance. We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z)
AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction [24.67142048995415]
Large Language Models (LLMs) can interact with the real world by connecting with versatile external APIs. We introduce AppBench, the first benchmark to evaluate LLMs' ability to plan and execute multiple APIs from various sources.
arXiv Detail & Related papers (2024-10-10T04:03:13Z)
NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls [18.831512738668792]
We present NESTFUL, a benchmark to evaluate large language models (LLMs) on nested sequences of API calls. Our results show that most models do not perform well on nested APIs in NESTFUL as compared to their performance on the simpler problem settings available in existing benchmarks.
arXiv Detail & Related papers (2024-09-04T17:53:24Z)
Plan with Code: Comparing approaches for robust NL to DSL generation [0.0]
Planning in code is considered a more reliable approach for many orchestration tasks. This paper focuses on workflow automation in RPA (Robotic Process Automation) domain as a special case of task planning.
arXiv Detail & Related papers (2024-08-15T04:29:33Z)
Open-domain Implicit Format Control for Large Language Model Generation [52.83173553689678]
We introduce a novel framework for controlled generation in large language models (LLMs). This study investigates LLMs' capabilities to follow open-domain, one-shot constraints and replicate the format of the example answers. We also develop a dataset collection methodology for supervised fine-tuning that enhances the open-domain format control of LLMs without degrading output quality.
arXiv Detail & Related papers (2024-08-08T11:51:45Z)
LLM+Reasoning+Planning for Supporting Incomplete User Queries in Presence of APIs [0.09374652839580183]
In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the information required by the APIs. While Large Language Models (LLMs) excel at natural language processing (NLP) tasks, they frequently hallucinate on missing information or struggle with orchestrating the APIs.
arXiv Detail & Related papers (2024-05-21T01:16:34Z)
An LLM-Tool Compiler for Fused Parallel Function Calling [1.990293258268139]
State-of-the-art sequential reasoning in Large Language Models (LLMs) has expanded the capabilities of Copilots beyond conversational tasks to complex function calling. We propose LLM-Tool Compiler, which fuses similar types of tool operations under a single function at runtime, presenting them as a unified task to the LLM. Benchmarked on a large-scale Copilot platform, LLM-Tool Compiler achieves up to four times more parallel calls than existing methods, reducing token costs and latency by up to 40% and 12%, respectively.
arXiv Detail & Related papers (2024-05-07T18:55:50Z)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs. We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion [96.47420221442397]
We introduce the PowerPoint Task Completion benchmark to assess the ability of Large Language Models to finish multi-turn, multi-modal instructions. We also propose the PPTX-Match Evaluation System that evaluates if LLMs finish the instruction based on the prediction file rather than the label API sequence. The results show that GPT-4 outperforms other LLMs with 75.1% accuracy in single-turn dialogue testing but faces challenges in completing entire sessions, achieving just 6% session accuracy.
arXiv Detail & Related papers (2023-11-03T08:06:35Z)
Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES. Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query. By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset. Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.