Revolutionizing API Documentation through Summarization
- URL: http://arxiv.org/abs/2401.11361v1
- Date: Sun, 21 Jan 2024 01:18:08 GMT
- Title: Revolutionizing API Documentation through Summarization
- Authors: AmirHossein Naghshzan, Sylvie Ratte
- Abstract summary: API documentation can be lengthy and challenging to navigate, prompting developers to seek unofficial sources such as Stack Overflow.
We employ BERTopic and extractive summarization to automatically generate concise and informative API summaries.
These summaries encompass key insights like general usage, common developer issues, and potential solutions, sourced from the wealth of knowledge on Stack Overflow.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study tackles the challenges associated with interpreting Application
Programming Interface (API) documentation, an integral aspect of software
development. Official API documentation, while essential, can be lengthy and
challenging to navigate, prompting developers to seek unofficial sources such
as Stack Overflow. Leveraging the vast user-generated content on Stack
Overflow, including code snippets and discussions, we employ BERTopic and
extractive summarization to automatically generate concise and informative API
summaries. These summaries encompass key insights like general usage, common
developer issues, and potential solutions, sourced from the wealth of knowledge
on Stack Overflow. Software developers evaluate these summaries for
performance, coherence, and interoperability, providing valuable feedback on
the practicality of our approach.
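The extractive step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it omits BERTopic entirely and swaps in a simple word-frequency scorer as a stand-in for whatever sentence ranking the paper uses. The names `split_sentences`, `summarize`, and `STOPWORDS` are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "in", "it",
             "for", "on", "you", "that", "this", "with", "if", "can"}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def _content_words(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return up to max_sentences sentences copied verbatim from text.

    Each sentence is scored by the average corpus frequency of its
    content words; the top-scoring sentences are returned in their
    original order, which is what makes the summary extractive.
    """
    sentences = split_sentences(text)
    freq = Counter(_content_words(text))

    def score(sentence):
        tokens = _content_words(sentence)
        if not tokens:
            return 0.0
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in keep]
```

In a fuller pipeline, one would presumably group Stack Overflow posts by topic first and then run a summarizer like this on each group; that grouping step is left out here.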
Related papers
- An Empirical Investigation on the Challenges in Scientific Workflow Systems Development [2.704899832646869]
This study examines interactions between developers and researchers on Stack Overflow (SO) and GitHub.
By analyzing issues, we identified 13 topics (e.g., Errors and Bug Fixing, Documentation, Dependencies) and discovered that data structures and operations is the most difficult topic.
We also found common topics between SO and GitHub, such as data structures and operations, task management, and workflow scheduling.
arXiv Detail & Related papers (2024-11-16T21:14:11Z)
- How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE).
We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand whole repositories.
To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- DevBench: A Comprehensive Benchmark for Software Development [72.24266814625685]
DevBench is a benchmark that evaluates large language models (LLMs) across various stages of the software development lifecycle.
Empirical studies show that current LLMs, including GPT-4-Turbo, fail to solve the challenges presented within DevBench.
Our findings offer actionable insights for the future development of LLMs toward real-world programming applications.
arXiv Detail & Related papers (2024-03-13T15:13:44Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Leveraging Deep Learning for Abstractive Code Summarization of Unofficial Documentation [1.1816942730023887]
This paper proposes an automatic approach using the BART algorithm to generate summaries for APIs discussed in StackOverflow.
We built an oracle of human-generated summaries to evaluate our approach against it using ROUGE and BLEU metrics.
Our findings demonstrate that using deep learning algorithms can improve summaries' quality and outperform the previous work by an average of 57% for Precision.
arXiv Detail & Related papers (2023-10-23T15:10:37Z)
- Enhancing API Documentation through BERTopic Modeling and Summarization [0.0]
This paper focuses on the complexities of interpreting Application Programming Interface (API) documentation.
Official API documentation serves as a primary source of information for developers, but it can often be extensive and lacks user-friendliness.
Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation.
arXiv Detail & Related papers (2023-08-17T15:57:12Z)
- Towards Code Summarization of APIs Based on Unofficial Documentation Using NLP Techniques [0.0]
In some cases, official documentation is not an efficient way to get the needed information.
We propose an automatic approach to generate summaries for APIs and methods by leveraging unofficial documentation using NLP techniques.
arXiv Detail & Related papers (2022-08-12T15:07:30Z)
- Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow [1.8047694351309207]
This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow.
Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method.
We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation.
arXiv Detail & Related papers (2021-11-27T18:49:51Z)
- Dive into Deep Learning [119.30375933463156]
The book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code.
Our goal is to offer a resource that could (i) be freely available for everyone; (ii) offer sufficient technical depth to provide a starting point on the path to becoming an applied machine learning scientist; (iii) include runnable code, showing readers how to solve problems in practice; (iv) allow for rapid updates, both by us and also by the community at large.
arXiv Detail & Related papers (2021-06-21T18:19:46Z)
- Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision.
Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
- Semantic Graphs for Generating Deep Questions [98.5161888878238]
We propose a novel framework which first constructs a semantic-level graph for the input document and then encodes the semantic graph by introducing an attention-based GGNN (Att-GGNN).
On the HotpotQA deep-question centric dataset, our model greatly improves performance over questions requiring reasoning over multiple facts, leading to state-of-the-art performance.
arXiv Detail & Related papers (2020-04-27T10:52:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.