API2Com: On the Improvement of Automatically Generated Code Comments
Using API Documentations
- URL: http://arxiv.org/abs/2103.10668v1
- Date: Fri, 19 Mar 2021 07:29:40 GMT
- Authors: Ramin Shahbazi, Rishab Sharma, Fatemeh H. Fard
- Abstract summary: We propose API2Com, a model that leverages the Application Programming Interface Documentations (API Docs) as a knowledge resource for comment generation.
We apply the model to a large Java dataset of over 130,000 methods and evaluate it using both Transformer and RNN-based architectures.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code comments aid program comprehension and are considered
important artifacts that help developers in software maintenance. However,
comments are often missing or outdated, especially in complex software
projects. As a result, several automatic comment generation models have been
developed as a solution. Recent models explore the integration of external
knowledge resources, such as Unified Modeling Language class diagrams, to
improve the generated comments. In this paper, we propose API2Com, a model
that leverages Application Programming Interface Documentation (API Docs) as
a knowledge resource for comment generation. API Docs describe the methods in
more detail and can therefore provide richer context for the generated
comments. The API Docs are used along with the code snippets and Abstract
Syntax Trees in our model. We apply the model to a large Java dataset of over
130,000 methods and evaluate it using both Transformer and RNN-based
architectures. Interestingly, when API Docs are used, the performance increase
is negligible. We therefore run different experiments to reason about this
result. For methods that contain only one API, adding API Docs improves the
results by 4% BLEU score on average (BLEU is an automatic evaluation metric
used in machine translation). However, as the number of APIs used in a method
increases, the model's performance in generating comments decreases because of
the long documentation included in the input. Our results confirm that API
Docs can be useful in generating better comments, but new techniques are
required to identify the most informative ones in a method rather than using
all of the documentation simultaneously.
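BLEU, the evaluation metric mentioned in the abstract, scores a generated comment by its n-gram overlap with a reference comment. The following is a minimal single-reference sketch (a simplified variant with add-one smoothing, not the exact implementation used in the paper):

```python
from collections import Counter
import math

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions times a brevity penalty, single reference."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        # Add-one smoothing so a zero n-gram overlap does not zero the score.
        log_prec += math.log((overlap + 1) / (total + 1))
    # Brevity penalty discourages overly short candidates.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return brevity * math.exp(log_prec / max_n)
```

An identical candidate and reference score 1.0, while a truncated candidate is penalized by both the missing n-grams and the brevity penalty.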
Related papers
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z) - FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z) - A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z) - Leveraging Deep Learning for Abstractive Code Summarization of
Unofficial Documentation [1.1816942730023887]
This paper proposes an automatic approach using the BART algorithm to generate summaries for APIs discussed in StackOverflow.
We built an oracle of human-written summaries against which we evaluate our approach using the ROUGE and BLEU metrics.
Our findings demonstrate that using deep learning algorithms can improve the quality of summaries, outperforming the previous work by an average of 57% in Precision.
arXiv Detail & Related papers (2023-10-23T15:10:37Z) - APICom: Automatic API Completion via Prompt Learning and Adversarial
Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as a recommendation task, with which developers may still fail to find the API they need.
Motivated by research in neural machine translation, we instead model this problem as a generation task.
We propose a novel approach APICom based on prompt learning, which can generate API related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z) - Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - APIContext2Com: Code Comment Generation by Incorporating Pre-Defined API
Documentation [0.0]
We introduce a seq2seq encoder-decoder neural network model with multiple encoders that transform distinct inputs into target comments.
A ranking mechanism is also developed to exclude non-informative APIs. We evaluate our approach using the Java dataset from CodeSearchNet.
arXiv Detail & Related papers (2023-03-03T00:38:01Z) - Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program.
In the parsing stage, Codex identifies the parts of the task input that cannot be answered by the original programming language.
In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z) - DocCoder: Generating Code by Retrieving and Reading Docs [87.88474546826913]
We introduce DocCoder, an approach that explicitly leverages code manuals and documentation.
Our approach is general, can be applied to any programming language, and is agnostic to the underlying neural model.
arXiv Detail & Related papers (2022-07-13T06:47:51Z) - On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query, or use an RNN-based encoder-decoder to generate API sequences.
arXiv Detail & Related papers (2022-04-05T20:33:24Z) - Holistic Combination of Structural and Textual Code Information for
Context based API Recommendation [28.74546332681778]
We propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information).
APIRec-CST is a deep learning model that combines the API usage with the text information in source code based on an API Graph Network and a Code Token Network.
We show that our approach achieves top-1, top-5, and top-10 accuracy and MRR of 60.3%, 81.5%, 87.7%, and 69.4%, respectively, and significantly outperforms an existing graph-based statistical approach.
arXiv Detail & Related papers (2020-10-15T04:40:42Z)
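Both API2Com's conclusion and APIContext2Com's ranking mechanism point toward filtering API documentation by informativeness before feeding it to the model. Neither abstract specifies how; a purely hypothetical sketch might score each API description by its lexical overlap with the method body and keep only the top entries (the function name and scoring rule here are illustrative assumptions, not from either paper):

```python
def rank_api_docs(method_tokens, api_docs, keep=1):
    """Hypothetical filter: rank each API's documentation string by
    word overlap with the method body; api_docs maps API name -> doc."""
    method_vocab = {t.lower() for t in method_tokens}
    def overlap(doc):
        words = set(doc.lower().split())
        # Fraction of the doc's words that also appear in the method.
        return len(words & method_vocab) / max(len(words), 1)
    return sorted(api_docs.items(), key=lambda kv: overlap(kv[1]), reverse=True)[:keep]
```

For a method whose tokens include "read", "file", and "buffer", a doc describing file reading would rank above an unrelated one such as a math utility, so only the relevant documentation reaches the comment generator.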
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.