ToolFactory: Automating Tool Generation by Leveraging LLM to Understand REST API Documentations
- URL: http://arxiv.org/abs/2501.16945v1
- Date: Tue, 28 Jan 2025 13:42:33 GMT
- Title: ToolFactory: Automating Tool Generation by Leveraging LLM to Understand REST API Documentations
- Authors: Xinyi Ni, Qiuyang Wang, Yukun Zhang, Pengyu Hong
- Abstract summary: API documentation often suffers from a lack of standardization, inconsistent schemas, and incomplete information.
We developed ToolFactory, an open-source pipeline for automating tool generation from unstructured API documents.
We also demonstrated ToolFactory by creating a domain-specific AI agent for glycomaterials research.
- Score: 4.934192277899036
- License:
- Abstract: LLM-based tool agents offer natural language interfaces, enabling users to seamlessly interact with computing services. While REST APIs are valuable resources for building such agents, they must first be transformed into AI-compatible tools. Automatically generating AI-compatible tools from REST API documents can greatly streamline tool agent development and minimize user learning curves. However, API documentation often suffers from a lack of standardization, inconsistent schemas, and incomplete information. To address these issues, we developed ToolFactory, an open-source pipeline for automating tool generation from unstructured API documents. To enhance the reliability of the developed tools, we implemented an evaluation method to diagnose errors. Furthermore, we built a knowledge base of verified tools, which we leveraged to infer missing information from poorly documented APIs. We developed the API Extraction Benchmark, comprising 167 API documents and 744 endpoints in various formats, and designed a JSON schema to annotate them. This annotated dataset was utilized to train and validate ToolFactory. The experimental results highlight the effectiveness of ToolFactory. We also demonstrated ToolFactory by creating a domain-specific AI agent for glycomaterials research. ToolFactory exhibits significant potential for facilitating the seamless integration of scientific REST APIs into AI workflows.
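To make the idea of turning an annotated endpoint into an AI-compatible tool more concrete, here is a minimal sketch. It is not the ToolFactory implementation: the annotation fields, the `search_glycans` endpoint, the `api.example.org` URL, and the helper names `to_tool_spec` and `build_request_url` are all hypothetical, and the annotation layout is not the JSON schema designed in the paper. The sketch only handles GET endpoints with query parameters and performs no network call.

```python
import json
from urllib.parse import urlencode

# Hypothetical annotation for one endpoint. Field names are illustrative
# assumptions, not the JSON schema used by the paper's benchmark.
endpoint = {
    "name": "search_glycans",
    "description": "Search glycan records by keyword.",
    "method": "GET",
    "base_url": "https://api.example.org",
    "path": "/glycans/search",
    "parameters": [
        {"name": "query", "type": "string", "required": True,
         "description": "Free-text search term."},
        {"name": "limit", "type": "integer", "required": False,
         "description": "Maximum number of results."},
    ],
}


def to_tool_spec(ep):
    """Convert an endpoint annotation into a JSON-Schema style tool description."""
    properties = {
        p["name"]: {"type": p["type"], "description": p["description"]}
        for p in ep["parameters"]
    }
    required = [p["name"] for p in ep["parameters"] if p["required"]]
    return {
        "name": ep["name"],
        "description": ep["description"],
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def build_request_url(ep, arguments):
    """Check arguments against the annotation and build the GET request URL."""
    known = {p["name"] for p in ep["parameters"]}
    missing = [p["name"] for p in ep["parameters"]
               if p["required"] and p["name"] not in arguments]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    unknown = sorted(set(arguments) - known)
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return f"{ep['base_url']}{ep['path']}?{urlencode(arguments)}"


if __name__ == "__main__":
    print(json.dumps(to_tool_spec(endpoint), indent=2))
    print(build_request_url(endpoint, {"query": "mannose", "limit": 5}))
```

Checking arguments before building the URL reflects, in a very simplified way, the abstract's point that incomplete or inconsistent documentation needs to be caught before a tool is handed to an agent.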
Related papers
- APITestGenie: Automated API Test Generation through Generative AI [2.0716352593701277]
APITestGenie generates executable API test scripts from business requirements and API specifications.
In experiments with 10 real-world APIs, the tool generated valid test scripts 57% of the time.
Human intervention is recommended to validate or refine generated scripts before integration into CI/CD pipelines.
arXiv Detail & Related papers (2024-09-05T18:02:41Z)
- Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models.
We create ToolBank, a new tool retrieval dataset that reflects real human user usages.
With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
- ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data.
We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
- Learning to Ask: When LLM Agents Meet Unclear Instruction [55.65312637965779]
Large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
We evaluate the performance of LLMs' tool use under imperfect instructions, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench.
We propose a novel framework, Ask-when-Needed (AwN), which prompts LLMs to ask questions to users whenever they encounter obstacles due to unclear instructions.
arXiv Detail & Related papers (2024-08-31T23:06:12Z)
- StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models [74.88844320554284]
We introduce StableToolBench, a benchmark evolving from ToolBench.
The virtual API server contains a caching system and API simulators which are complementary to alleviate the change in API status.
The stable evaluation system designs solvable pass and win rates using GPT-4 as the automatic evaluator to eliminate the randomness during evaluation.
arXiv Detail & Related papers (2024-03-12T14:57:40Z)
- API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs [28.840207102132286]
We focus on the task of identifying, curating, and transforming existing datasets.
We introduce API-BLEND, a large corpus for training and systematic testing of tool-augmented LLMs.
We demonstrate the utility of the API-BLEND dataset for both training and benchmarking purposes.
arXiv Detail & Related papers (2024-02-23T18:30:49Z)
- EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction.
It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities.
We introduce ToolLLM, a general tool-use framework encompassing data construction, model training, and evaluation.
We first present ToolBench, an instruction-tuning dataset for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)
- ToolCoder: Teach Code Generation Models to use API search tools [44.370920906850024]
We propose ToolCoder, a novel approach that integrates API search tools with existing models to assist in code generation and API selection.
Our experimental results demonstrate that ToolCoder exhibits excellent performance and generalization across five public and private library code generation benchmarks.
arXiv Detail & Related papers (2023-05-06T12:45:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.