Related papers: Towards Code Summarization of APIs Based on Unofficial Documentation Using NLP Techniques

Towards Code Summarization of APIs Based on Unofficial Documentation Using NLP Techniques

URL: http://arxiv.org/abs/2208.06318v3
Date: Mon, 6 Nov 2023 21:03:08 GMT
Title: Towards Code Summarization of APIs Based on Unofficial Documentation Using NLP Techniques
Authors: AmirHossein Naghshzan
Abstract summary: In some cases, official documentation is not an efficient way to get the needed information. We propose an automatic approach to generate summaries for APIs and methods by leveraging unofficial documentation using NLP techniques.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Each programming language comes with official documentation to guide developers with APIs, methods, and classes. However, in some cases, official documentation is not an efficient way to get the needed information. As a result, developers may consult other sources (e.g., Stack Overflow, GitHub) to learn more about an API, its implementation, usage, and other information that official documentation may not provide. In this research, we propose an automatic approach to generate summaries for APIs and methods by leveraging unofficial documentation using NLP techniques. Our findings demonstrate that the generated summaries are competitive, and can be used as a complementary source for guiding developers in software development and maintenance tasks.

Related papers

ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over niave RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development. Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability. Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models [8.372941103284774]
SpeCrawler is a comprehensive system that generates OpenAPI Specifications from diverse API documentation. The paper explores SpeCrawler's methodology, supported by empirical evidence and case studies.
arXiv Detail & Related papers (2024-02-18T15:33:24Z)
Revolutionizing API Documentation through Summarization [0.0]
API documentation can be lengthy and challenging to navigate, prompting developers to seek unofficial sources such as Stack Overflow. We employ BERTopic and extractive summarization to automatically generate concise and informative API summaries. These summaries encompass key insights like general usage, common developer issues, and potential solutions, sourced from the wealth of knowledge on Stack Overflow.
arXiv Detail & Related papers (2024-01-21T01:18:08Z)
Leveraging Deep Learning for Abstractive Code Summarization of Unofficial Documentation [1.1816942730023887]
This paper proposes an automatic approach using the BART algorithm to generate summaries for APIs discussed in StackOverflow. We built an oracle of human-generated summaries to evaluate our approach against it using ROUGE and BLEU metrics. Our findings demonstrate that using deep learning algorithms can improve summaries' quality and outperform the previous work by an average of %57 for Precision.
arXiv Detail & Related papers (2023-10-23T15:10:37Z)
Enhancing API Documentation through BERTopic Modeling and Summarization [0.0]
This paper focuses on the complexities of interpreting Application Programming Interface (API) documentation. Official API documentation serves as a primary source of information for developers, but it can often be extensive and lacks user-friendliness. Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation.
arXiv Detail & Related papers (2023-08-17T15:57:12Z)
Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries. We propose a novel framework that emulates the process of programmers writing private code. We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program. In the parsing stage, Codex is able to identify the part of the task input that cannot be answerable by the original programming language. In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z)
DocCoder: Generating Code by Retrieving and Reading Docs [87.88474546826913]
We introduce DocCoder, an approach that explicitly leverages code manuals and documentation. Our approach is general, can be applied to any programming language, and is agnostic to the underlying neural model.
arXiv Detail & Related papers (2022-07-13T06:47:51Z)
ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow [1.8047694351309207]
This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow. Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method. We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation.
arXiv Detail & Related papers (2021-11-27T18:49:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.