APICom: Automatic API Completion via Prompt Learning and Adversarial
Training-based Data Augmentation
- URL: http://arxiv.org/abs/2309.07026v1
- Date: Wed, 13 Sep 2023 15:31:50 GMT
- Title: APICom: Automatic API Completion via Prompt Learning and Adversarial
Training-based Data Augmentation
- Authors: Yafeng Gu, Yiheng Shen, Xiang Chen, Shaoyu Yang, Yiling Huang,
Zhixiang Cao
- Abstract summary: API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as a recommendation task that returns multiple candidate APIs, among which developers may still not find what they need.
Motivated by the neural machine translation research domain, we can instead model this problem as a generation task.
We propose APICom, a novel approach based on prompt learning that generates APIs related to the query according to prompts (i.e., API prefix information).
- Score: 6.029137544885093
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Based on developer needs and usage scenarios, API (Application Programming
Interface) recommendation is the process of assisting developers in finding the
required API among numerous candidate APIs. Previous studies mainly modeled API
recommendation as the recommendation task, which can recommend multiple
candidate APIs for the given query, and developers may not yet be able to find
what they need. Motivated by the neural machine translation research domain, we
can model this problem as the generation task, which aims to directly generate
the required API for the developer query. After our preliminary investigation,
we find the performance of this intuitive approach is not promising. The reason
is that there exists an error when generating the prefixes of the API. However,
developers may know certain API prefix information during actual development in
most cases. Therefore, we model this problem as the automatic completion task
and propose a novel approach APICom based on prompt learning, which can
generate API related to the query according to the prompts (i.e., API prefix
information). Moreover, the effectiveness of APICom highly depends on the
quality of the training dataset. In this study, we further design a novel
gradient-based adversarial training method ATPart for data augmentation,
which can improve the normalized stability when generating adversarial
examples. To evaluate the effectiveness of APICom, we consider a corpus of 33k
developer queries and corresponding APIs. Compared with the state-of-the-art
baselines, our experimental results show that APICom can outperform all
baselines by at least 40.02%, 13.20%, and 16.31% in terms of the performance
measures EM@1, MRR, and MAP. Finally, our ablation studies confirm the
effectiveness of our component setting (such as our designed adversarial
training method, our used pre-trained model, and prompt learning) in APICom.
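The abstract reports results in terms of EM@1, MRR, and MAP. As a minimal sketch of how such ranking measures are commonly computed, the Python below uses the standard textbook definitions; the paper's exact formulations may differ, and the queries, API names, and function names here are hypothetical illustrations rather than data or code from the study.

```python
from typing import List, Sequence, Set, Tuple


def exact_match_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 1) -> float:
    """Return 1.0 if any of the top-k generated APIs is a ground-truth API, else 0.0."""
    return float(any(api in relevant for api in ranked[:k]))


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first ground-truth API in the ranking, or 0.0 if none appears."""
    for rank, api in enumerate(ranked, start=1):
        if api in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average the precision values taken at each rank where a ground-truth API appears."""
    hits = 0
    precision_sum = 0.0
    for rank, api in enumerate(ranked, start=1):
        if api in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


# Hypothetical evaluation data: (ranked candidates, ground-truth APIs) per query.
results: List[Tuple[List[str], Set[str]]] = [
    (["java.io.File.exists", "java.io.File.delete"], {"java.io.File.exists"}),
    (
        ["java.util.List.add", "java.util.Map.put", "java.util.Map.get"],
        {"java.util.Map.put", "java.util.Map.get"},
    ),
]

em1 = mean([exact_match_at_k(r, gold, k=1) for r, gold in results])
mrr = mean([reciprocal_rank(r, gold) for r, gold in results])
map_score = mean([average_precision(r, gold) for r, gold in results])

print(f"EM@1={em1:.3f} MRR={mrr:.3f} MAP={map_score:.3f}")
# prints: EM@1=0.500 MRR=0.750 MAP=0.792
```

In this toy example the second query misses at rank 1 but recovers both ground-truth APIs at ranks 2 and 3, which is why MRR and MAP are higher than EM@1.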
Related papers
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment [49.00213183302225]
We propose a framework to induce new APIs by grounding wikiHow instruction to situated agent policies.
Inspired by recent successes in large language models (LLMs) for embodied planning, we propose a few-shot prompting to steer GPT-4.
arXiv Detail & Related papers (2024-07-10T15:52:44Z)
- A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
- Contextual API Completion for Unseen Repositories Using LLMs [6.518508607788089]
We introduce a novel technique to mitigate hallucinations by leveraging global and local contextual information within a code repository for API completion tasks.
Our approach is tailored to refine code completion tasks, with a focus on optimizing local API completions.
Our tool, LANCE, surpasses Copilot by 143% and 142% for API token completion and conversational API completion, respectively.
arXiv Detail & Related papers (2024-05-07T18:22:28Z)
- APIGen: Generative API Method Recommendation [16.541442856821]
APIGen is a generative API recommendation approach through enhanced in-context learning (ICL).
APIGen searches for similar posts to the programming queries from the lexical, syntactical, and semantic perspectives.
With the reasoning process, APIGen makes recommended APIs better meet the programming requirement of queries.
arXiv Detail & Related papers (2024-01-29T02:35:42Z)
- Adaptive REST API Testing with Reinforcement Learning [54.68542517176757]
Current testing tools lack efficient exploration mechanisms, treating all operations and parameters equally.
Current tools struggle when response schemas are absent in the specification or exhibit variants.
We present an adaptive REST API testing technique that incorporates reinforcement learning to prioritize operations during exploration.
arXiv Detail & Related papers (2023-09-08T20:27:05Z)
- Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
arXiv Detail & Related papers (2023-05-10T16:40:52Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
- Holistic Combination of Structural and Textual Code Information for Context based API Recommendation [28.74546332681778]
We propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information).
APIRec-CST is a deep learning model that combines the API usage with the text information in source code based on an API Graph Network and a Code Token Network.
We show that our approach achieves a top-1, top-5, top-10 accuracy and MRR of 60.3%, 81.5%, 87.7% and 69.4%, and significantly outperforms an existing graph-based statistical approach.
arXiv Detail & Related papers (2020-10-15T04:40:42Z)