Software issues report for bug fixing process: An empirical study of
machine-learning libraries
- URL: http://arxiv.org/abs/2312.06005v1
- Date: Sun, 10 Dec 2023 21:33:19 GMT
- Title: Software issues report for bug fixing process: An empirical study of
machine-learning libraries
- Authors: Adekunle Ajibode, Dong Yunwei, Yang Hongji
- Abstract summary: We investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries.
The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance.
This study concludes that efficient issue-tracking processes, effective communication, and collaboration are vital for effective resolution of issues and bug fixing processes in machine-learning libraries.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Issue resolution and bug-fixing processes are essential in the
development of machine-learning libraries, just as in general software
development, to ensure well-optimized functions. Understanding the issue resolution and bug-fixing
process of machine-learning libraries can help developers identify areas for
improvement and optimize their strategies for issue resolution and bug-fixing.
However, detailed studies on this topic are lacking. Therefore, we investigated
the effectiveness of issue resolution for bug-fixing processes in six
machine-learning libraries: TensorFlow, Keras, Theano, PyTorch, Caffe, and
scikit-learn. We addressed seven research questions (RQs) using 16,921 issues
extracted from the libraries' GitHub repositories via the GitHub REST API. We
employed several quantitative methods of data analysis, including correlation,
OLS regression, percentage and frequency counts, and heatmaps, to analyze the RQs. We
found the following through our empirical investigation: (1) The most common
categories of issues that arise in machine-learning libraries are bugs,
documentation, optimization, crashes, enhancement, new feature requests,
build/CI, support, and performance. (2) Effective strategies for addressing
these problems include fixing critical bugs, optimizing performance, and
improving documentation. (3) These categorized issues are related to testing
and runtime and are common among all six machine-learning libraries. (4)
Monitoring the total number of comments on issues can provide insights into the
duration of the issues. (5) It is crucial to strike a balance between
prioritizing critical issues and addressing other issues in a timely manner.
Therefore, this study concludes that efficient issue-tracking processes,
effective communication, and collaboration are vital for effective resolution
of issues and bug fixing processes in machine-learning libraries.
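The study's data collection and analysis pipeline, mining issues through the GitHub REST API and relating comment activity to resolution time (finding 4), can be sketched in a few lines of Python. The snippet below is a minimal illustration only, not the authors' actual pipeline: the repository choice (Keras), the page cap, and the use of `requests` and `statistics.correlation` are assumptions made for this example, while the REST endpoint `GET /repos/{owner}/{repo}/issues` and its `state`, `per_page`, and `page` parameters are part of GitHub's public API.

```python
# Illustrative sketch (not the paper's pipeline): pull closed issues for one of
# the studied libraries via the GitHub REST API and correlate comment counts
# with time-to-close. Requires `requests` and Python 3.10+ (statistics.correlation).
import os
import statistics
from datetime import datetime

import requests

ISSUES_URL = "https://api.github.com/repos/{owner}/{repo}/issues"
TIME_FMT = "%Y-%m-%dT%H:%M:%SZ"


def fetch_closed_issues(owner, repo, max_pages=5):
    """Yield closed issues (pull requests excluded), up to max_pages pages."""
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get("GITHUB_TOKEN")  # optional; raises the rate limit
    if token:
        headers["Authorization"] = f"Bearer {token}"
    for page in range(1, max_pages + 1):
        resp = requests.get(
            ISSUES_URL.format(owner=owner, repo=repo),
            headers=headers,
            params={"state": "closed", "per_page": 100, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        for issue in batch:
            if "pull_request" not in issue:  # the issues endpoint also returns PRs
                yield issue


def resolution_days(issue):
    """Days between an issue's creation and its closing."""
    created = datetime.strptime(issue["created_at"], TIME_FMT)
    closed = datetime.strptime(issue["closed_at"], TIME_FMT)
    return (closed - created).total_seconds() / 86400


if __name__ == "__main__":
    issues = list(fetch_closed_issues("keras-team", "keras"))
    comments = [i["comments"] for i in issues]
    durations = [resolution_days(i) for i in issues]
    # Pearson correlation between comment count and time-to-close, echoing the
    # paper's observation that comment activity tracks issue duration.
    print("issues analysed:", len(issues))
    print("comments-vs-duration correlation:", statistics.correlation(comments, durations))
```

A full replication would go further than this sketch: paging through every issue of all six libraries, respecting API rate limits, categorizing issues from their labels, and fitting the OLS regressions and heatmaps described in the abstract.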
Related papers
- Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
The Lingma SWE-GPT series learns from and simulates real-world code submission activities.
Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z) - Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models [95.96734086126469]
Large language models (LLMs) can serve as the assistant to help users accomplish their jobs, and also support the development of advanced applications.
For the wide application of LLMs, the inference efficiency is an essential concern, which has been widely studied in existing work.
We perform a detailed coarse-to-fine analysis of the inference performance of various code libraries.
arXiv Detail & Related papers (2024-04-17T15:57:50Z) - An Empirical Study of Challenges in Machine Learning Asset Management [15.07444988262748]
Despite existing research, a significant knowledge gap remains in operational challenges like model versioning, data traceability, and collaboration.
Our study aims to address this gap by analyzing 15,065 posts from developer forums and platforms.
We uncover 133 topics related to asset management challenges, grouped into 16 macro-topics, with software dependency, model deployment, and model training being the most discussed.
arXiv Detail & Related papers (2024-02-25T05:05:52Z) - Leveraging Print Debugging to Improve Code Generation in Large Language
Models [63.63160583432348]
Large language models (LLMs) have made significant progress in code generation tasks.
However, their performance on programming problems involving complex data structures and algorithms remains suboptimal.
We propose an in-context learning approach that guides LLMs to debug by using a "print debug" method.
arXiv Detail & Related papers (2024-01-10T18:37:59Z) - An Empirical Study on Bugs Inside PyTorch: A Replication Study [10.848682558737494]
We characterize bugs in the PyTorch library, a very popular deep learning framework.
Our results highlight that PyTorch bugs resemble bugs in traditional software projects more than bugs tied to deep-learning-specific characteristics.
arXiv Detail & Related papers (2023-07-25T19:23:55Z) - Automatic Static Bug Detection for Machine Learning Libraries: Are We
There Yet? [14.917820383894124]
We analyze five popular and widely used static bug detectors, i.e., Flawfinder, RATS, Cppcheck, Facebook Infer, and Clang, on a curated dataset of software bugs.
Overall, our study shows that the static bug detectors find a negligible share of all bugs, accounting for 6/410 bugs (about 1.5%); Flawfinder and RATS are the most effective static checkers for finding software bugs in machine-learning libraries.
arXiv Detail & Related papers (2023-07-09T01:38:52Z) - LibAUC: A Deep Learning Library for X-Risk Optimization [43.32145407575245]
This paper introduces the award-winning deep learning (DL) library called LibAUC.
LibAUC implements state-of-the-art algorithms towards optimizing a family of risk functions named X-risks.
arXiv Detail & Related papers (2023-06-05T17:43:46Z) - SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z) - What Makes Good Contrastive Learning on Small-Scale Wearable-based
Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z) - What to Prioritize? Natural Language Processing for the Development of a
Modern Bug Tracking Solution in Hardware Development [0.0]
We present an approach to predict the time to fix, the risk and the complexity of a bug report using different supervised machine learning algorithms.
The evaluation shows that a combination of text embeddings generated with the Universal Sentence Encoder outperforms all other methods.
arXiv Detail & Related papers (2021-09-28T15:55:10Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback on student code for a new programming question from just a few examples provided by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)