Do Not Take It for Granted: Comparing Open-Source Libraries for Software
Development Effort Estimation
- URL: http://arxiv.org/abs/2207.01705v1
- Date: Mon, 4 Jul 2022 20:06:40 GMT
- Title: Do Not Take It for Granted: Comparing Open-Source Libraries for Software
Development Effort Estimation
- Authors: Rebecca Moussa and Federica Sarro
- Abstract summary: This paper aims at raising awareness of the differences incurred when using different Machine Learning (ML) libraries for software development effort estimation (SEE).
We investigate 4 deterministic machine learners as provided by 3 of the most popular ML open-source libraries written in different languages (namely, Scikit-Learn, Caret and Weka).
The results of our study reveal that the predictions provided by the 3 libraries differ in 95% of the cases on average across a total of 105 cases studied.
- Score: 9.224578642189023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past two decades, several Machine Learning (ML) libraries have become
freely available. Many studies have used such libraries to carry out empirical
investigations on predictive Software Engineering (SE) tasks. However, the
differences stemming from using one library over another have been overlooked,
implicitly assuming that using any of these libraries would provide the user
with the same or very similar results. This paper aims at raising awareness of
the differences incurred when using different ML libraries for software
development effort estimation (SEE), one of the most widely studied SE prediction
tasks. To this end, we investigate 4 deterministic machine learners as provided
by 3 of the most popular ML open-source libraries written in different
languages (namely, Scikit-Learn, Caret and Weka). We carry out a thorough
empirical study comparing the performance of the machine learners on 5 SEE
datasets in the two most common SEE scenarios (i.e., out-of-the-box-ml and
tuned-ml) as well as an in-depth analysis of the documentation and code of
their APIs. The results of our study reveal that the predictions provided by
the 3 libraries differ in 95% of the cases on average across a total of 105
cases studied. These differences are significantly large in most cases and
yield misestimations of up to approx. 3,000 hours per project. Moreover, our
API analysis reveals that these libraries provide the user with different
levels of control over the parameters one can manipulate, and a lack of clarity
and consistency, overall, which might mislead users. Our findings highlight
that the ML library is an important design choice for SEE studies, which can
lead to a difference in performance. However, such a difference is
under-documented. We conclude by highlighting open challenges with suggestions
for the developers of libraries as well as for the researchers and
practitioners using them.
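The comparison described in the abstract (how often two libraries disagree on an estimate, and by how many hours) can be illustrated with a minimal sketch. The function name `compare_predictions`, the tolerance `tol`, and the effort values below are all hypothetical and chosen for illustration only; they are not the paper's method, metrics, or data.

```python
from typing import Sequence, Tuple


def compare_predictions(
    preds_a: Sequence[float],
    preds_b: Sequence[float],
    tol: float = 1e-9,
) -> Tuple[float, float]:
    """Return (share of projects whose estimates differ, largest absolute gap)."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both libraries must predict the same projects")
    if not preds_a:
        raise ValueError("need at least one prediction")
    # Absolute gap, in hours, between the two libraries for each project.
    gaps = [abs(a - b) for a, b in zip(preds_a, preds_b)]
    # A pair counts as "different" when its gap exceeds the tolerance.
    share_differing = sum(gap > tol for gap in gaps) / len(gaps)
    return share_differing, max(gaps)


# Made-up effort estimates (person-hours) for four projects; not data from the paper.
library_a = [1200.0, 850.0, 4300.0, 610.0]
library_b = [1200.0, 910.0, 3950.0, 640.0]

share, largest_gap = compare_predictions(library_a, library_b)
print(f"{share:.0%} of estimates differ; largest gap = {largest_gap:.0f} hours")
# prints: 75% of estimates differ; largest gap = 350 hours
```

In practice, `library_a` and `library_b` would hold the estimates produced by the same learner in two different libraries on the same projects. The tolerance is a design choice: a value of zero would also count floating-point noise as a disagreement.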
Related papers
- Library Learning Doesn't: The Curious Case of the Single-Use "Library" [20.25809428140996]
We study two library learning systems for mathematics which both reported increased accuracy: LEGO-Prover and TroVE.
We find that function reuse is extremely infrequent on miniF2F and MATH.
Our followup experiments suggest that, rather than reuse, self-correction and self-consistency are the primary drivers of the observed performance gains.
arXiv Detail & Related papers (2024-10-26T21:05:08Z)
- An Empirical Study of API Misuses of Data-Centric Libraries [9.667988837321943]
This paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization.
We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub.
arXiv Detail & Related papers (2024-08-28T15:15:52Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- LLMBox: A Comprehensive Library for Large Language Models [109.15654830320553]
This paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of large language models (LLMs).
The library has three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets, and models, and (3) practical considerations, especially user-friendliness and efficiency.
arXiv Detail & Related papers (2024-07-08T02:39:33Z)
- Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models [95.96734086126469]
Large language models (LLMs) can serve as the assistant to help users accomplish their jobs, and also support the development of advanced applications.
For the wide application of LLMs, the inference efficiency is an essential concern, which has been widely studied in existing work.
We perform a detailed coarse-to-fine analysis of the inference performance of various code libraries.
arXiv Detail & Related papers (2024-04-17T15:57:50Z)
- Lightweight Syntactic API Usage Analysis with UCov [0.0]
We present a novel conceptual framework designed to assist library maintainers in understanding the interactions allowed by their APIs.
These customizable models enable library maintainers to improve their design ahead of release, reducing friction during evolution.
We implement these models for Java libraries in a new tool UCov and demonstrate its capabilities on three libraries exhibiting diverse styles of interaction.
arXiv Detail & Related papers (2024-02-19T10:33:41Z)
- Evaluating In-Context Learning of Libraries for Code Generation [35.57902679044737]
Large Language Models (LLMs) exhibit a high degree of code generation and comprehension capability.
Recent work has shown that large proprietary LLMs can learn novel library usage in-context from demonstrations.
arXiv Detail & Related papers (2023-11-16T07:37:25Z)
- Large Language Model-Aware In-Context Learning for Code Generation [75.68709482932903]
Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation.
We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
arXiv Detail & Related papers (2023-10-15T06:12:58Z)
- SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z)
- An Empirical Study of Library Usage and Dependency in Deep Learning Frameworks [12.624032509149869]
PyTorch, Caffe, and Scikit-learn are the most frequent combination in 18% and 14% of the projects.
Developers use two or three DL libraries in the same project and tend to use multiple DL libraries within both the same function and the same file.
arXiv Detail & Related papers (2022-11-28T19:31:56Z)
- LibFewShot: A Comprehensive Library for Few-shot Learning [78.58842209282724]
Few-shot learning, especially few-shot image classification, has received increasing attention and witnessed significant advances in recent years.
Some recent studies implicitly show that many generic techniques or tricks, such as data augmentation, pre-training, knowledge distillation, and self-supervision, may greatly boost the performance of a few-shot learning method.
We propose a comprehensive library for few-shot learning (LibFewShot) by re-implementing seventeen state-of-the-art few-shot learning methods in a unified framework with the same single codebase in PyTorch.
arXiv Detail & Related papers (2021-09-10T14:12:37Z)