Code Librarian: A Software Package Recommendation System
- URL: http://arxiv.org/abs/2210.05406v1
- Date: Tue, 11 Oct 2022 12:30:05 GMT
- Title: Code Librarian: A Software Package Recommendation System
- Authors: Lili Tao, Alexandru-Petre Cazan, Senad Ibraimoski, Sean Moran
- Abstract summary: We present a recommendation engine called Librarian for open source libraries.
A candidate library package is recommended for a given context if: 1) it has been frequently used with the imported libraries in the program; 2) it has similar functionality to the imported libraries in the program; 3) it has similar functionality to the developer's implementation, and 4) it can be used efficiently in the context of the provided code.
- Score: 65.05559087332347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of packaged libraries can significantly shorten the software
development cycle by improving the quality and readability of code. In this
paper, we present a recommendation engine called Librarian for open source
libraries. A candidate library package is recommended for a given context if:
1) it has been frequently used with the imported libraries in the program; 2)
it has similar functionality to the imported libraries in the program; 3) it
has similar functionality to the developer's implementation, and 4) it can be
used efficiently in the context of the provided code. We apply the
state-of-the-art CodeBERT-based model for analysing the context of the source
code to deliver relevant library recommendations to users.
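A minimal sketch of criterion 1 only (recommend libraries that are frequently imported alongside those already in the program), using plain co-occurrence counts. This is not the paper's CodeBERT-based model. The function names `build_cooccurrence` and `recommend` and the toy project list are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(projects):
    """Count how often each ordered pair of libraries is imported together."""
    counts = Counter()
    for imports in projects:
        # De-duplicate imports within a project before pairing them.
        for a, b in combinations(sorted(set(imports)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(imported, counts, top_k=3):
    """Rank libraries not yet imported by co-usage with the imported ones."""
    imported = set(imported)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in imported and b not in imported:
            scores[b] += n
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [lib for lib, _ in ranked[:top_k]]


# Hypothetical toy corpus: each inner list is one project's imports.
TOY_PROJECTS = [
    ["numpy", "pandas", "matplotlib"],
    ["numpy", "pandas", "scikit-learn"],
    ["numpy", "scipy"],
    ["requests", "beautifulsoup4"],
]

if __name__ == "__main__":
    counts = build_cooccurrence(TOY_PROJECTS)
    print(recommend(["numpy"], counts))
```

A fuller system along the lines of the abstract would combine a signal like this with the functionality-similarity criteria (2 to 4), which need a learned code representation rather than raw counts.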
Related papers
- Is ChatGPT a Good Software Librarian? An Exploratory Study on the Use of ChatGPT for Software Library Recommendations [3.1911318265930944]
ChatGPT uses third-party libraries nearly 10% more often than human developers.
14.2% of the recommended libraries had restrictive copyleft licenses, which were not explicitly communicated by ChatGPT.
We recommend that developers implement rigorous dependency management practices and double-check library licenses before integrating LLM-generated code into their projects.
arXiv Detail & Related papers (2024-08-09T15:36:59Z)
- LILO: Learning Interpretable Libraries by Compressing and Documenting Code [71.55208585024198]
We introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code.
LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch.
We find that AutoDoc boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions.
arXiv Detail & Related papers (2023-10-30T17:55:02Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
- AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis [6.342380566583581]
We propose an automated approach to produce an accurate and up-to-date set of third-party libraries in the form of a dataset called AndroLibZoo.
Our dataset, which we make available to the community, contains to date 34 813 libraries and is meant to evolve.
arXiv Detail & Related papers (2023-07-24T08:36:38Z)
- An Empirical Study of Library Usage and Dependency in Deep Learning Frameworks [12.624032509149869]
PyTorch, Caffe, and Scikit-learn are the most frequent combination in 18% and 14% of the projects.
Developers use two or three DL libraries in the same project and tend to use multiple different DL libraries within the same function and the same file.
arXiv Detail & Related papers (2022-11-28T19:31:56Z)
- Lib-SibGMU -- A University Library Circulation Dataset for Recommender Systems Development [58.720142291102135]
We open-source Lib-SibGMU, a university library circulation dataset.
For a recommender architecture that consists of a vectorizer that turns the history of the books borrowed into a vector, we show that using the fastText model as a vectorizer delivers competitive results.
arXiv Detail & Related papers (2022-08-25T22:10:18Z)
- DocCoder: Generating Code by Retrieving and Reading Docs [87.88474546826913]
We introduce DocCoder, an approach that explicitly leverages code manuals and documentation.
Our approach is general, can be applied to any programming language, and is agnostic to the underlying neural model.
arXiv Detail & Related papers (2022-07-13T06:47:51Z)
- Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code.
It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z)
- Req2Lib: A Semantic Neural Model for Software Library Recommendation [8.713783358744166]
We propose a novel neural approach called Req2Lib which recommends libraries given descriptions of the project requirement.
We use a Sequence-to-Sequence model to learn the library linked-usage information and semantic information of requirement descriptions in natural language.
Our preliminary evaluation demonstrates that Req2Lib can recommend libraries accurately.
arXiv Detail & Related papers (2020-05-24T14:37:07Z)