Related papers: Octopus: On-device language model for function calling of software APIs

Octopus: On-device language model for function calling of software APIs

URL: http://arxiv.org/abs/2404.01549v1
Date: Tue, 2 Apr 2024 01:29:28 GMT
Title: Octopus: On-device language model for function calling of software APIs
Authors: Wei Chen, Zhiyuan Li, Mingyuan Ma,
Abstract summary: Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities. This study introduces a new strategy aimed at harnessing on-device LLMs in invoking software APIs.
Score: 9.78611123915888
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In the rapidly evolving domain of artificial intelligence, Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities. This study introduces a new strategy aimed at harnessing on-device LLMs in invoking software APIs. We meticulously compile a dataset derived from software API documentation and apply fine-tuning to LLMs with capacities of 2B, 3B and 7B parameters, specifically to enhance their proficiency in software API interactions. Our approach concentrates on refining the models' grasp of API structures and syntax, significantly enhancing the accuracy of API function calls. Additionally, we propose \textit{conditional masking} techniques to ensure outputs in the desired formats and reduce error rates while maintaining inference speeds. We also propose a novel benchmark designed to evaluate the effectiveness of LLMs in API interactions, establishing a foundation for subsequent research. Octopus, the fine-tuned model, is proved to have better performance than GPT-4 for the software APIs calling. This research aims to advance automated software development and API integration, representing substantial progress in aligning LLM capabilities with the demands of practical software engineering applications.

Related papers

Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs [63.10710876536337]
We propose an offline simulation framework to curate a software-specific skillset, a collection of verified scripts. Our framework comprises two components: (1) task creation, using top-down functionality and bottom-up API synergy exploration to generate helpful tasks. Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs.
arXiv Detail & Related papers (2025-04-29T04:03:37Z)
CallNavi, A Challenge and Empirical Study on LLM Function Calling and Routing [7.443502461016052]
This work contributes to the evaluation and assessment of AI-based software development. Novel benchmarking specifically designed for API function selection, parameter generation, and nested API execution. An empirical evaluation of state-of-the-art language models, analyzing their performance. A hybrid approach to API routing, combining general-purpose large language models for API selection with fine-tuned models and prompt engineering.
arXiv Detail & Related papers (2025-01-09T14:12:43Z)
Creating an LLM-based AI-agent: A high-level methodology towards enhancing LLMs with APIs [0.0]
Large Language Models (LLMs) have revolutionized various aspects of engineering and science. This thesis serves as a comprehensive guide that elucidates a multi-faceted approach for empowering LLMs with the capability to leverage Application Programming Interfaces (APIs) We propose an on-device architecture that aims to exploit the functionality of carry-on devices by using small models from the Hugging Face community.
arXiv Detail & Related papers (2024-12-17T14:14:04Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over niave RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
Demystifying Application Programming Interfaces (APIs): Unlocking the Power of Large Language Models and Other Web-based AI Services in Social Work Research [0.0]
Application Programming Interfaces (APIs) are essential tools for social work researchers aiming to harness advanced technologies like Large Language Models (LLMs) and other AI services. This paper demystifies APIs and illustrates how they can enhance research methodologies. Practical code examples demonstrate how LLMs can generate API code for accessing specialized services, such as extracting data from unstructured text.
arXiv Detail & Related papers (2024-10-26T16:07:12Z)
AutoFeedback: An LLM-based Framework for Efficient and Accurate API Request Generation [16.590226868986296]
AutoFeedback is a framework for efficient and accurate API request generation. It implements two feedback loops during the process of generating API requests by the Large Language Models. It achieves an accuracy of 100.00% on a real-world API dataset and reduces the cost of interaction with GPT-3.5 Turbo by 23.44%, and GPT-4 Turbo by 11.85%.
arXiv Detail & Related papers (2024-10-09T14:38:28Z)
Harnessing LLMs for API Interactions: A Framework for Classification and Synthetic Data Generation [0.0]
We propose a novel system that integrates Large Language Models (LLMs) for both classifying natural language inputs into corresponding API calls. Our system allows users to invoke complex software functionalities through simple inputs, improving interaction efficiency and lowering the barrier to software utilization.
arXiv Detail & Related papers (2024-09-18T04:56:52Z)
ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability. Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
Semantic API Alignment: Linking High-level User Goals to APIs [6.494714497852088]
We present a vision to span multiple steps from requirements engineering to implementation using existing libraries. This approach, which we call Semantic API Alignment (SEAL), aims to bridge the gap between a user's high-level goals and the specific functions of one or more APIs.
arXiv Detail & Related papers (2024-05-07T11:54:32Z)
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs [28.840207102132286]
We focus on the task of identifying, curating, and transforming existing datasets. We introduce API-BLEND, a large corpora for training and systematic testing of tool-augmented LLMs. We demonstrate the utility of the API-BLEND dataset for both training and benchmarking purposes.
arXiv Detail & Related papers (2024-02-23T18:30:49Z)
LM-Polygraph: Uncertainty Estimation for Language Models [71.21409522341482]
Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of large language models (LLMs) We introduce LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python. It introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores.
arXiv Detail & Related papers (2023-11-13T15:08:59Z)
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities. We introduce ToolLLM, a general tool-usetuning encompassing data construction, model training, and evaluation. We first present ToolBench, an instruction-tuning framework for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)
Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing. LLMs are extremely computationally expensive, even at inference time. We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.