An Empirical Study on Bugs Inside PyTorch: A Replication Study
- URL: http://arxiv.org/abs/2307.13777v2
- Date: Tue, 1 Aug 2023 16:52:12 GMT
- Title: An Empirical Study on Bugs Inside PyTorch: A Replication Study
- Authors: Sharon Chee Yin Ho and Vahid Majdinasab and Mohayeminul Islam and
Diego Elias Costa and Emad Shihab and Foutse Khomh and Sarah Nadi and
Muhammad Raza
- Abstract summary: We characterize bugs in the PyTorch library, a very popular deep learning framework.
Our results highlight that PyTorch bugs resemble bugs in traditional software projects more than they reflect deep learning characteristics.
- Score: 10.848682558737494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software systems are increasingly relying on deep learning components, due to
their remarkable capability of identifying complex data patterns and powering
intelligent behaviour. A core enabler of this change in software development is
the availability of easy-to-use deep learning libraries. Libraries like PyTorch
and TensorFlow empower a large variety of intelligent systems, offering a
multitude of algorithms and configuration options, applicable to numerous
domains of systems. However, bugs in those popular deep learning libraries also
may have dire consequences for the quality of systems they enable; thus, it is
important to understand how bugs are identified and fixed in those libraries.
Inspired by a study of Jia et al., which investigates the bug identification
and fixing process at TensorFlow, we characterize bugs in the PyTorch library,
a very popular deep learning framework. We investigate the causes and symptoms
of bugs identified during PyTorch's development, assess their locality
within the project, and extract patterns of bug fixes. Our results highlight
that PyTorch bugs resemble bugs in traditional software projects more than
they reflect deep learning characteristics. Finally, we also compare our results
with the study on TensorFlow, highlighting similarities and differences across
the bug identification and fixing process.
Related papers
- Towards Understanding the Challenges of Bug Localization in Deep
Learning Systems [2.9312156642007294]
We conduct a large-scale empirical study to better understand the challenges of localizing bugs in deep-learning systems.
First, we determine the bug localization performance of four existing techniques using 2,365 bugs from deep-learning systems and 2,913 from traditional software.
Second, we evaluate how different bug types in deep learning systems impact bug localization.
arXiv Detail & Related papers (2024-02-01T21:17:42Z)
- Towards Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study [13.17302533571231]
This paper examines the critical issue of reproducing deep learning bugs.
We identify edit actions and useful information that could improve the reproducibility of deep learning bugs.
We successfully reproduced 148 out of 165 bugs attempted.
arXiv Detail & Related papers (2024-01-05T21:30:13Z)
- Software issues report for bug fixing process: An empirical study of
machine-learning libraries [0.0]
We investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries.
The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance.
This study concludes that efficient issue-tracking processes, effective communication, and collaboration are vital for effective resolution of issues and bug fixing processes in machine-learning libraries.
arXiv Detail & Related papers (2023-12-10T21:33:19Z)
- Less is More? An Empirical Study on Configuration Issues in Python PyPI
Ecosystem [38.44692482370243]
Python is widely used in the open-source community, largely owing to the extensive support from diverse third-party libraries.
Third-party libraries can potentially lead to conflicts in dependencies, prompting researchers to develop dependency conflict detectors.
Endeavors have been made to automatically infer dependencies.
arXiv Detail & Related papers (2023-10-19T09:07:51Z)
- Automatic Static Bug Detection for Machine Learning Libraries: Are We
There Yet? [14.917820383894124]
We analyze five popular and widely used static bug detectors, i.e., Flawfinder, RATS, Cppcheck, Facebook Infer, and Clang, on a curated dataset of software bugs.
Overall, our study shows that static bug detectors find a negligible share of all bugs, accounting for 6/410 bugs (0.01%). Flawfinder and RATS are the most effective static checkers for finding software bugs in machine learning libraries.
arXiv Detail & Related papers (2023-07-09T01:38:52Z)
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
- SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z)
- BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
- DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
- LibFewShot: A Comprehensive Library for Few-shot Learning [78.58842209282724]
Few-shot learning, especially few-shot image classification, has received increasing attention and witnessed significant advances in recent years.
Some recent studies implicitly show that many generic techniques or tricks, such as data augmentation, pre-training, knowledge distillation, and self-supervision, may greatly boost the performance of a few-shot learning method.
We propose a comprehensive library for few-shot learning (LibFewShot) by re-implementing seventeen state-of-the-art few-shot learning methods in a unified framework with the same single codebase in PyTorch.
arXiv Detail & Related papers (2021-09-10T14:12:37Z)
- Captum: A unified and generic model interpretability library for PyTorch [49.72749684393332]
We introduce a novel, unified, open-source model interpretability library for PyTorch.
The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms.
It can be used for both classification and non-classification models.
arXiv Detail & Related papers (2020-09-16T18:57:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.