Compositional Generalization for Natural Language Interfaces to Web APIs
- URL: http://arxiv.org/abs/2112.05209v1
- Date: Thu, 9 Dec 2021 20:49:01 GMT
- Title: Compositional Generalization for Natural Language Interfaces to Web APIs
- Authors: Saghar Hosseini, Ahmed Hassan Awadallah, Yu Su
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents Okapi, a new dataset for Natural Language to executable
web Application Programming Interfaces (NL2API). This dataset is in English and
contains 22,508 questions and 9,019 unique API calls, covering three domains.
We define new compositional generalization tasks for NL2API which explore the
models' ability to extrapolate from simple API calls in the training set to new
and more complex API calls in the inference phase. In addition, the models are
required to generate API calls that execute correctly, as opposed to existing
approaches that evaluate queries with placeholder values. Our dataset differs
from most existing compositional semantic parsing datasets in that it is a
non-synthetic dataset studying compositional generalization in a low-resource
setting. Okapi is a step towards creating realistic datasets and benchmarks for
studying compositional generalization alongside the existing datasets and
tasks. We report the generalization capabilities of sequence-to-sequence
baseline models trained on a variety of SCAN and Okapi tasks. The best model
achieves 15% exact match accuracy when generalizing from simple API calls to
more complex API calls, which highlights challenges for future research. The
Okapi dataset and tasks are publicly available at https://aka.ms/nl2api/data.
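To make the task concrete, below is a minimal sketch of what an NL2API example and the exact-match evaluation could look like; the question, endpoint, and parameter names are hypothetical placeholders rather than entries from the Okapi dataset.

```python
# A minimal, hypothetical sketch of the NL2API setup described above: a natural
# language question paired with an executable web API call, scored by exact
# match against the gold call. The endpoint, resource, and parameter names
# below are illustrative placeholders, not examples taken from Okapi.

example = {
    "question": "show my 5 most recent unread emails",
    "api_call": ("GET https://example.com/api/messages"
                 "?filter=isRead eq false&orderby=receivedDateTime desc&top=5"),
}

def exact_match(predicted_call: str, gold_call: str) -> bool:
    """Exact-match accuracy credits a prediction only when the full predicted
    API call string is identical to the reference call."""
    return predicted_call.strip() == gold_call.strip()

# A compositional generalization split trains on simple calls (e.g. a single
# filter) and evaluates on more complex compositions (e.g. filter + orderby +
# top together, as in the example above).
print(exact_match(example["api_call"], example["api_call"]))  # True
```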
Related papers
- ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data.
We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
- APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets [99.8988504388011]
APIGen is an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications.
We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets.
We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agents.
arXiv Detail & Related papers (2024-06-26T17:49:11Z)
- LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs [0.09374652839580183]
In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the information required by the APIs.
We leverage logical reasoning and classical AI planning along with an LLM for accurately answering user queries.
Our approach achieves a success rate of over 95% in most cases on a dataset containing complete and incomplete single-goal and multi-goal queries.
arXiv Detail & Related papers (2024-05-21T01:16:34Z)
- API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs [28.840207102132286]
We focus on the task of identifying, curating, and transforming existing datasets.
We introduce API-BLEND, a large corpus for training and systematic testing of tool-augmented LLMs.
We demonstrate the utility of the API-BLEND dataset for both training and benchmarking purposes.
arXiv Detail & Related papers (2024-02-23T18:30:49Z)
- API Pack: A Massive Multi-Programming Language Dataset for API Call Generation [30.466726273695144]
API Pack is a massive multi-programming language dataset containing more than 1 million instruction-API call pairs.
By fine-tuning CodeLlama-13B on 20,000 Python instances from API Pack, we enable it to outperform GPT-3.5 and GPT-4 in generating unseen API calls.
arXiv Detail & Related papers (2024-02-14T23:09:15Z)
- APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as a recommendation task, under which developers may still be unable to find what they need.
Motivated by research in neural machine translation, we instead model this problem as a generation task.
We propose a novel approach, APICom, based on prompt learning, which can generate APIs related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z)
- Learning to Learn from APIs: Black-Box Data-Free Meta-Learning [95.41441357931397]
Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.
Existing DFML work can only meta-learn from (i) white-box and (ii) small-scale pre-trained models.
We propose a Bi-level Data-free Meta Knowledge Distillation (BiDf-MKD) framework to transfer more general meta knowledge from a collection of black-box APIs to one single model.
arXiv Detail & Related papers (2023-05-28T18:00:12Z)
- DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions [100.52917027038369]
We operationalize the task of recommending datasets given a short natural language description.
To facilitate this task, we build the DataFinder dataset which consists of a larger automatically-constructed training set and a smaller expert-annotated evaluation set.
The resulting system, trained on the DataFinder dataset, finds more relevant search results than existing third-party dataset search engines.
arXiv Detail & Related papers (2023-05-26T05:22:36Z)
- Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
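As a rough illustration of the re-ranking strategy just described, the sketch below scores BM25 candidates by cosine similarity to the query embedding; the embed callable is a hypothetical stand-in for whichever commercial embedding API is being evaluated.

```python
# A minimal sketch of the budget-friendly strategy described above: retrieve
# candidates with BM25, then re-rank only that small candidate pool using
# vectors from a (hypothetical) embedding API, so only a handful of texts per
# query are sent to the paid endpoint.
from typing import Callable, List, Tuple

import numpy as np

def rerank_with_embeddings(
    query: str,
    bm25_candidates: List[str],          # top-k documents returned by BM25
    embed: Callable[[str], np.ndarray],  # stand-in for any embedding API
) -> List[Tuple[str, float]]:
    """Score each BM25 candidate by cosine similarity to the query embedding
    and return the candidates sorted from most to least similar."""
    q = embed(query)
    q = q / np.linalg.norm(q)
    scored = []
    for doc in bm25_candidates:
        d = embed(doc)
        d = d / np.linalg.norm(d)
        scored.append((doc, float(np.dot(q, d))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A hybrid variant along the lines mentioned above could interpolate this similarity with the original BM25 score rather than discarding it, at the cost of some extra bookkeeping.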
arXiv Detail & Related papers (2023-05-10T16:40:52Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel files or reading and writing text files line by line.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query, or use an RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)