APIGen: Generative API Method Recommendation
- URL: http://arxiv.org/abs/2401.15843v1
- Date: Mon, 29 Jan 2024 02:35:42 GMT
- Title: APIGen: Generative API Method Recommendation
- Authors: Yujia Chen, Cuiyun Gao, Muyijie Zhu, Qing Liao, Yong Wang, Guoai Xu
- Abstract summary: APIGen is a generative API recommendation approach through enhanced in-context learning (ICL)
APIGen searches for similar posts to the programming queries from the lexical, syntactical, and semantic perspectives.
With the reasoning process, APIGen makes recommended APIs better meet the programming requirement of queries.
- Score: 16.541442856821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic API method recommendation is an essential task of code
intelligence, which aims to suggest suitable APIs for programming queries.
Existing approaches can be categorized into two primary groups: retrieval-based
and learning-based approaches. Although these approaches have achieved
remarkable success, they still come with notable limitations. The
retrieval-based approaches rely on the text representation capabilities of
embedding models, while the learning-based approaches require extensive
task-specific labeled data for training. To mitigate the limitations, we
propose APIGen, a generative API recommendation approach through enhanced
in-context learning (ICL). APIGen involves two main components: (1) Diverse
Examples Selection. APIGen searches for similar posts to the programming
queries from the lexical, syntactical, and semantic perspectives, providing
more informative examples for ICL. (2) Guided API Recommendation. APIGen
enables large language models (LLMs) to perform reasoning before generating API
recommendations, where the reasoning involves fine-grained matching between the
task intent behind the queries and the factual knowledge of the APIs. With the
reasoning process, APIGen makes recommended APIs better meet the programming
requirement of queries and also enhances the interpretability of results. We
compare APIGen with four existing approaches on two publicly available
benchmarks. Experiments show that APIGen outperforms the best baseline CLEAR by
105.8% in method-level API recommendation and 54.3% in class-level API
recommendation in terms of SuccessRate@1. Besides, APIGen achieves an average
49.87% increase compared to the zero-shot performance of popular LLMs such as
GPT-4 in method-level API recommendation regarding the SuccessRate@3 metric.
Related papers
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z) - FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z) - A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z) - Compositional API Recommendation for Library-Oriented Code Generation [23.355509276291198]
We propose CAPIR, which adopts a "divide-and-conquer" strategy to recommend APIs for coarse-grained requirements.
We present two challenging benchmarks, RAPID (Recommend APIs based on Documentation) and LOCG (Library-Oriented Code Generation)
Experimental results on these benchmarks, demonstrate the effectiveness of CAPIR in comparison to existing baselines.
arXiv Detail & Related papers (2024-02-29T18:27:27Z) - Leveraging Large Language Models to Improve REST API Testing [51.284096009803406]
RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification.
Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation.
arXiv Detail & Related papers (2023-12-01T19:53:23Z) - APICom: Automatic API Completion via Prompt Learning and Adversarial
Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as the recommendation task, and developers may not yet be able to find what they need.
Motivated by the neural machine translation research domain, we can model this problem as the generation task.
We propose a novel approach APICom based on prompt learning, which can generate API related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z) - Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
arXiv Detail & Related papers (2023-05-10T16:40:52Z) - On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z) - Embedding Code Contexts for Cryptographic API Suggestion:New
Methodologies and Comparisons [9.011910726620536]
We present a new neural network-based approach, Multi-HyLSTM for API recommendation.
We use program analysis to guide the API embedding and recommendation.
In an analysis of 245 test cases, compared with the commercial tool Codota, we achieve a top-1 recommendation accuracy of 88.98%.
arXiv Detail & Related papers (2021-03-15T22:27:57Z) - Holistic Combination of Structural and Textual Code Information for
Context based API Recommendation [28.74546332681778]
We propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information)
APIRec-CST is a deep learning model that combines the API usage with the text information in source code based on an API Graph Network and a Code Token Network.
We show that our approach achieves a top-5, top-10 accuracy and MRR of 60.3%, 81.5%, 87.7% and 69.4%, and significantly outperforms an existing graph-based statistical approach.
arXiv Detail & Related papers (2020-10-15T04:40:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.