Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack
Overflow
- URL: http://arxiv.org/abs/2111.13962v1
- Date: Sat, 27 Nov 2021 18:49:51 GMT
- Authors: AmirHossein Naghshzan, Latifa Guerrouj, Olga Baysal
- Abstract summary: This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow.
Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method.
We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation.
- Score: 1.8047694351309207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated source code summarization is a task that generates summarized
information about the purpose, usage, and/or implementation of methods and
classes to support understanding of these code entities. Multiple approaches
and techniques have been proposed for supervised and unsupervised learning in
code summarization; however, they have mostly focused on generating a summary
for a piece of code. In addition, very few works have leveraged unofficial
documentation. This paper proposes an automatic and novel approach for
summarizing Android API methods discussed in Stack Overflow that we consider as
unofficial documentation in this research. Our approach takes the API method's
name as an input and generates a natural language summary based on Stack
Overflow discussions of that API method. We have conducted a survey that
involves 16 Android developers to evaluate the quality of our automatically
generated summaries and compare them with the official Android documentation.
Our results demonstrate that while developers find the official documentation
more useful in general, the generated summaries are also competitive, in
particular for offering implementation details, and can be used as a
complementary source for guiding developers in software development and
maintenance tasks.
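The abstract describes the pipeline only at a high level: an API method name goes in, and a natural language summary drawn from Stack Overflow discussions comes out. The sketch below illustrates that input/output shape with a simple word-frequency sentence ranker. It is not the authors' method; the scoring scheme, the stopword list, and the names `summarize_api`, `split_sentences`, and `tokenize` are all assumptions made for illustration, and the posts are passed in as plain strings rather than fetched from Stack Overflow.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "to", "of", "in", "is", "it", "and", "or",
             "with", "for", "on", "this", "that", "you", "must", "be"}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic, non-stopword tokens."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def summarize_api(method_name, posts, max_sentences=2):
    """Return an extractive summary of the posts that mention method_name.

    Hypothetical sketch: sentences are scored by the average corpus
    frequency of their words, and the top ones are returned in their
    original order.
    """
    # Keep only posts that mention the API method by name.
    relevant = [p for p in posts if method_name.lower() in p.lower()]
    sentences = [s for p in relevant for s in split_sentences(p)]

    # Word frequencies over all candidate sentences.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Pick the highest-scoring sentences, then restore document order.
    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:max_sentences]
    return " ".join(sentences[i] for i in sorted(top))


if __name__ == "__main__":
    example_posts = [
        "Call startActivity with an Intent to open a new screen. "
        "The Intent must name the target Activity.",
        "RecyclerView needs a LayoutManager.",
    ]
    print(summarize_api("startActivity", example_posts, max_sentences=1))
```

No labelled training data is needed for this kind of ranking, which is what makes it unsupervised. A real system would also need to retrieve and clean the posts, which the sketch leaves out.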
Related papers
- Revolutionizing API Documentation through Summarization [0.0]
API documentation can be lengthy and challenging to navigate, prompting developers to seek unofficial sources such as Stack Overflow.
We employ BERTopic and extractive summarization to automatically generate concise and informative API summaries.
These summaries encompass key insights like general usage, common developer issues, and potential solutions, sourced from the wealth of knowledge on Stack Overflow.
arXiv Detail & Related papers (2024-01-21T01:18:08Z) - Leveraging Deep Learning for Abstractive Code Summarization of
Unofficial Documentation [1.1816942730023887]
This paper proposes an automatic approach using the BART algorithm to generate summaries for APIs discussed in StackOverflow.
We built an oracle of human-generated summaries to evaluate our approach against it using ROUGE and BLEU metrics.
Our findings demonstrate that using deep learning algorithms can improve summaries' quality and outperform the previous work by an average of 57% for Precision.
arXiv Detail & Related papers (2023-10-23T15:10:37Z) - Enhancing API Documentation through BERTopic Modeling and Summarization [0.0]
This paper focuses on the complexities of interpreting Application Programming Interface (API) documentation.
Official API documentation serves as a primary source of information for developers, but it is often extensive and lacks user-friendliness.
Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation.
arXiv Detail & Related papers (2023-08-17T15:57:12Z) - Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code.
We conduct a human study to identify the criteria for high-quality explanatory docstring for code.
We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z) - Towards Code Summarization of APIs Based on Unofficial Documentation
Using NLP Techniques [0.0]
In some cases, official documentation is not an efficient way to get the needed information.
We propose an automatic approach to generate summaries for APIs and methods by leveraging unofficial documentation using NLP techniques.
arXiv Detail & Related papers (2022-08-12T15:07:30Z) - ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval.
We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z) - Exploiting Method Names to Improve Code Summarization: A Deliberation
Multi-Task Learning Approach [5.577102440028882]
We design a novel multi-task learning (MTL) approach for code summarization.
We first introduce the tasks of generation and informativeness prediction of method names.
A novel two-pass deliberation mechanism is then incorporated into our MTL architecture to generate more consistent intermediate states.
arXiv Detail & Related papers (2021-03-21T17:52:21Z) - Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via
Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision.
Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z) - Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z) - A Methodology for Creating AI FactSheets [67.65802440158753]
This paper describes a methodology for creating the form of AI documentation we call FactSheets.
Within each step of the methodology, we describe the issues to consider and the questions to explore.
This methodology will accelerate the broader adoption of transparent AI documentation.
arXiv Detail & Related papers (2020-06-24T15:08:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.