Related papers: Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow

Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow

URL: http://arxiv.org/abs/2111.13962v1
Date: Sat, 27 Nov 2021 18:49:51 GMT
Title: Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow
Authors: AmirHossein Naghshzan, Latifa Guerrouj, Olga Baysal
Abstract summary: This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow. Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method. We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation.
Score: 1.8047694351309207
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automated source code summarization is a task that generates summarized information about the purpose, usage, and--or implementation of methods and classes to support understanding of these code entities. Multiple approaches and techniques have been proposed for supervised and unsupervised learning in code summarization, however, they were mostly focused on generating a summary for a piece of code. In addition, very few works have leveraged unofficial documentation. This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow that we consider as unofficial documentation in this research. Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method. We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation. Our results demonstrate that while developers find the official documentation more useful in general, the generated summaries are also competitive, in particular for offering implementation details, and can be used as a complementary source for guiding developers in software development and maintenance tasks.

Related papers

ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over niave RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
Revolutionizing API Documentation through Summarization [0.0]
API documentation can be lengthy and challenging to navigate, prompting developers to seek unofficial sources such as Stack Overflow. We employ BERTopic and extractive summarization to automatically generate concise and informative API summaries. These summaries encompass key insights like general usage, common developer issues, and potential solutions, sourced from the wealth of knowledge on Stack Overflow.
arXiv Detail & Related papers (2024-01-21T01:18:08Z)
Leveraging Deep Learning for Abstractive Code Summarization of Unofficial Documentation [1.1816942730023887]
This paper proposes an automatic approach using the BART algorithm to generate summaries for APIs discussed in StackOverflow. We built an oracle of human-generated summaries to evaluate our approach against it using ROUGE and BLEU metrics. Our findings demonstrate that using deep learning algorithms can improve summaries' quality and outperform the previous work by an average of %57 for Precision.
arXiv Detail & Related papers (2023-10-23T15:10:37Z)
Enhancing API Documentation through BERTopic Modeling and Summarization [0.0]
This paper focuses on the complexities of interpreting Application Programming Interface (API) documentation. Official API documentation serves as a primary source of information for developers, but it can often be extensive and lacks user-friendliness. Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation.
arXiv Detail & Related papers (2023-08-17T15:57:12Z)
Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries. We propose a novel framework that emulates the process of programmers writing private code. We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code. We conduct a human study to identify the criteria for high-quality explanatory docstring for code. We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z)
Towards Code Summarization of APIs Based on Unofficial Documentation Using NLP Techniques [0.0]
In some cases, official documentation is not an efficient way to get the needed information. We propose an automatic approach to generate summaries for APIs and methods by leveraging unofficial documentation using NLP techniques.
arXiv Detail & Related papers (2022-08-12T15:07:30Z)
ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach [5.577102440028882]
We design a novel multi-task learning (MTL) approach for code summarization. We first introduce the tasks of generation and informativeness prediction of method names. A novel two-pass deliberation mechanism is then incorporated into our MTL architecture to generate more consistent intermediate states.
arXiv Detail & Related papers (2021-03-21T17:52:21Z)
Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision. Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process. Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved. We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
A Methodology for Creating AI FactSheets [67.65802440158753]
This paper describes a methodology for creating the form of AI documentation we call FactSheets. Within each step of the methodology, we describe the issues to consider and the questions to explore. This methodology will accelerate the broader adoption of transparent AI documentation.
arXiv Detail & Related papers (2020-06-24T15:08:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.