APIContext2Com: Code Comment Generation by Incorporating Pre-Defined API
Documentation
- URL: http://arxiv.org/abs/2303.01645v1
- Date: Fri, 3 Mar 2023 00:38:01 GMT
- Title: APIContext2Com: Code Comment Generation by Incorporating Pre-Defined API
Documentation
- Authors: Ramin Shahbazi, Fatemeh Fard
- Abstract summary: We introduce a seq-2-seq encoder-decoder neural network model with different sets of multiple encoders to transform distinct inputs into target comments.
A ranking mechanism is also developed to exclude non-informative APIs. We evaluate our approach using the Java dataset from CodeSearchNet.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code comments are significantly helpful in comprehending software programs and help developers save a great deal of time in software maintenance.
Code comment generation aims to automatically predict comments in natural
language given a code snippet. Several works investigate the effect of
integrating external knowledge on the quality of generated comments. In this
study, we propose a solution, namely APIContext2Com, to improve the
effectiveness of generated comments by incorporating the pre-defined
Application Programming Interface (API) context. The API context includes the
definition and description of the pre-defined APIs that are used within the
code snippets. As the detailed API information expresses the functionality of a
code snippet, it can be helpful in better generating the code summary. We
introduce a seq-2-seq encoder-decoder neural network model with different sets
of multiple encoders to effectively transform distinct inputs into target
comments. A ranking mechanism is also developed to filter out non-informative, unrelated APIs. We evaluate our approach using
the Java dataset from CodeSearchNet. The findings reveal that the proposed
model improves the best baseline by 1.88 (8.24%), 2.16 (17.58%), 1.38 (18.3%), 0.73 (14.17%), 1.58 (14.98%), and 1.9 (6.92%) for BLEU-1, BLEU-2, BLEU-3, BLEU-4, METEOR, and ROUGE-L, respectively. Human evaluation and ablation studies
confirm the quality of the generated comments and the effect of architecture
and ranking APIs.
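
As a concrete illustration of the API-context idea described above, the sketch below pairs the API calls extracted from a Java snippet with their documentation and keeps only the highest-ranked entries. This is a minimal Python sketch under stated assumptions: the `API_DOCS` table, `extract_api_calls`, and the word-count `informativeness` score are hypothetical stand-ins, not the paper's actual extraction and ranking mechanism.

```python
import re

# Hypothetical API documentation lookup: maps a qualified method name
# to its (signature, description) pair, mimicking the "API context"
# (definition + description) attached to each code snippet.
API_DOCS = {
    "BufferedReader.readLine": (
        "public String readLine()",
        "Reads a line of text from the input stream.",
    ),
    "String.trim": (
        "public String trim()",
        "Returns a copy of the string with leading and trailing whitespace removed.",
    ),
}

def extract_api_calls(java_snippet: str) -> list[str]:
    """Very rough extraction of method-call names from a Java snippet."""
    return re.findall(r"\.(\w+)\s*\(", java_snippet)

def informativeness(description: str) -> float:
    """Toy ranking score: longer, more specific descriptions rank higher."""
    return len(description.split())

def build_api_context(java_snippet: str, top_k: int = 5) -> list[str]:
    calls = extract_api_calls(java_snippet)
    # Join each extracted call with its documentation entry, if one exists.
    matched = [
        (name, sig, desc)
        for qualified, (sig, desc) in API_DOCS.items()
        for name in calls
        if qualified.endswith("." + name)
    ]
    # Rank by informativeness and keep only the top-k entries,
    # discarding the least informative APIs.
    matched.sort(key=lambda t: informativeness(t[2]), reverse=True)
    return [f"{sig} : {desc}" for _, sig, desc in matched[:top_k]]

snippet = "String line = reader.readLine(); return line.trim();"
print(build_api_context(snippet))
```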
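The multi-encoder architecture itself can be sketched as separate encoders, one per input type, whose final states are fused to initialise a single decoder. This is not the authors' implementation; the layer types and hyperparameters below are illustrative assumptions in PyTorch.

```python
import torch
import torch.nn as nn

class MultiEncoderSeq2Seq(nn.Module):
    """Sketch of a seq2seq model with one encoder per input type
    (code tokens, API context), whose summaries are fused and fed
    to a single comment decoder."""

    def __init__(self, vocab_size: int, emb: int = 128, hid: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.code_enc = nn.GRU(emb, hid, batch_first=True)
        self.api_enc = nn.GRU(emb, hid, batch_first=True)
        self.fuse = nn.Linear(2 * hid, hid)  # merge the two encoder states
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, code_ids, api_ids, comment_ids):
        _, h_code = self.code_enc(self.embed(code_ids))
        _, h_api = self.api_enc(self.embed(api_ids))
        # Concatenate the final hidden states of both encoders and
        # project them down to initialise the decoder.
        h0 = torch.tanh(self.fuse(torch.cat([h_code, h_api], dim=-1)))
        dec_out, _ = self.decoder(self.embed(comment_ids), h0)
        return self.out(dec_out)  # per-step vocabulary logits

model = MultiEncoderSeq2Seq(vocab_size=10_000)
code = torch.randint(0, 10_000, (2, 40))  # batch of code-token ids
api = torch.randint(0, 10_000, (2, 60))   # batch of API-context ids
tgt = torch.randint(0, 10_000, (2, 20))   # teacher-forced comment ids
logits = model(code, api, tgt)            # shape (2, 20, 10000)
```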
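For the reported metrics, sentence-level BLEU-n and ROUGE-L can be computed as shown below; this uses NLTK's `sentence_bleu` and a straightforward LCS-based ROUGE-L F1, not the paper's exact evaluation scripts.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def rouge_l_f1(reference: list[str], candidate: list[str]) -> float:
    """ROUGE-L F1 from the longest common subsequence of two token lists."""
    m, n = len(reference), len(candidate)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if reference[i] == candidate[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / n, lcs / m
    return 2 * precision * recall / (precision + recall)

ref = "reads a line from the input stream".split()
hyp = "reads one line from the stream".split()
smooth = SmoothingFunction().method1
for k in range(1, 5):
    weights = tuple(1.0 / k for _ in range(k))  # uniform n-gram weights
    score = sentence_bleu([ref], hyp, weights=weights, smoothing_function=smooth)
    print(f"BLEU-{k}: {score:.3f}")
print(f"ROUGE-L: {rouge_l_f1(ref, hyp):.3f}")
```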
Related papers
- A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation.
We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z)
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy.
Results indicate that MANGO significantly improves the code pass rate based on the strong baselines.
The logical comment decoding strategy is notably more robust than Chain-of-Thought prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
- Interactive Code Generation via Test-Driven User-Intent Formalization [60.90035204567797]
Large language models (LLMs) produce code from informal natural language (NL) intent.
It is hard to define a notion of correctness since natural language can be ambiguous and lacks a formal semantics.
We describe a language-agnostic abstract algorithm and a concrete implementation TiCoder.
arXiv Detail & Related papers (2022-08-11T17:41:08Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
- Embedding API Dependency Graph for Neural Code Generation [14.246659920310003]
We propose to model the dependencies among API methods as an API dependency graph (ADG) and incorporate the graph embedding into a sequence-to-sequence model.
In this way, the decoder can utilize both global structural dependencies and textual program description to predict the target code.
Our proposed approach, called ADG-Seq2Seq, yields significant improvements over existing state-of-the-art methods.
arXiv Detail & Related papers (2021-03-29T06:26:38Z)
- API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations [0.0]
We propose API2Com, a model that leverages the Application Programming Interface Documentations (API Docs) as a knowledge resource for comment generation.
We apply the model on a large Java dataset of over 130,000 methods and evaluate it using both Transformer and RNN-based architectures.
arXiv Detail & Related papers (2021-03-19T07:29:40Z)
- Holistic Combination of Structural and Textual Code Information for Context based API Recommendation [28.74546332681778]
We propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information)
APIRec-CST is a deep learning model that combines the API usage with the text information in source code based on an API Graph Network and a Code Token Network.
We show that our approach achieves a top-1, top-5, and top-10 accuracy and MRR of 60.3%, 81.5%, 87.7%, and 69.4%, and significantly outperforms an existing graph-based statistical approach.
arXiv Detail & Related papers (2020-10-15T04:40:42Z)
- Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics.
We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)