Code2API: A Tool for Generating Reusable APIs from Stack Overflow Code Snippets
- URL: http://arxiv.org/abs/2504.14331v1
- Date: Sat, 19 Apr 2025 15:49:03 GMT
- Title: Code2API: A Tool for Generating Reusable APIs from Stack Overflow Code Snippets
- Authors: Yubo Mai, Zhipeng Gao, Xing Hu, Lingfeng Bao, Jingyuan Chen, Jianling Sun,
- Abstract summary: Code2API is a Google Chrome extension that uses Large Language Models (LLMs) to automatically perform APIzation of code snippets on Stack Overflow.<n>The evaluation results show that Code2API significantly outperforms the rule-based approach by a large margin.
- Score: 14.130403020877848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, developers often turn to Stack Overflow for solutions to daily problems, however, these code snippets are partial code that cannot be tested and verified properly. One way to test these code snippets is to transform them into APIs (Application Program Interface) that developers can be directly invoked and executed. However, it is often costly and error-prone for developers to manually perform this transformation (referred to as AIPzation task) due to different actions to be taken (e.g., summarizing proper method names, inferring input parameters list and return statements). To help developers quickly reuse code snippets in Stack Overflow, in this paper, we propose Code2API, a Google Chrome extension that uses Large Language Models (LLMs) to automatically perform APIzation of code snippets on Stack Overflow. \toolname guides LLMs through well-designed prompts to generate reusable APIs, using Chain-of-Thought reasoning and few-shot in-context learning to help LLMs understand and solve the APIzation task in a developer-like manner. The evaluation results show that Code2API significantly outperforms the rule-based approach by a large margin.
Related papers
- Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models [49.214291813478695]
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like overflows and use buffer-free errors.<n>Traditional fuzzing struggles with the complexity and API diversity of DL libraries.<n>We propose DFUZZ, an LLM-driven fuzzing approach for DL libraries.
arXiv Detail & Related papers (2025-01-08T07:07:22Z) - ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution.<n>We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over niave RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z) - A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation.
We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z) - VersiCode: Towards Version-controllable Code Generation [58.82709231906735]
Large Language Models (LLMs) have made tremendous strides in code generation, but existing research fails to account for the dynamic nature of software development.
We propose two novel tasks aimed at bridging this gap: version-specific code completion (VSCC) and version-aware code migration (VACM)
We conduct an extensive evaluation on VersiCode, which reveals that version-controllable code generation is indeed a significant challenge.
arXiv Detail & Related papers (2024-06-11T16:15:06Z) - Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning [14.351476383642016]
We propose a novel approach, named Code2API, to automatically perform APIzation for Stack Overflow code snippets.
Code2API does not require additional model training or any manual crafting rules.
It can be easily deployed on personal computers without relying on other external tools.
arXiv Detail & Related papers (2024-05-06T14:22:17Z) - Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy.
Results indicate that MANGO significantly improves the code pass rate based on the strong baselines.
The robustness of the logical comment decoding strategy is notably higher than the Chain-of-thoughts prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z) - Function-constrained Program Synthesis [12.55507214959886]
Large language models (LLMs) can generate code in real-time by drawing on all code available in a development environment.
Current systems lack effective recovery methods, forcing users to iteratively re-prompt the model with modified prompts until a sufficient solution is reached.
Our method constrains code-generation to an explicit function set and enabling recovery from failed attempts through automatically generated sub-functions.
arXiv Detail & Related papers (2023-11-27T02:55:34Z) - Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability
of Large Language Model Code Generation [8.575560293086289]
Large language models (LLMs) have shown extraordinary ability in understanding natural language and generating programming code.
The misuse of APIs in the generated code could lead to severe problem, such as resource leaks, program crashes.
arXiv Detail & Related papers (2023-08-20T18:36:28Z) - Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program.
In the parsing stage, Codex is able to identify the part of the task input that cannot be answerable by the original programming language.
In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.