Related papers: An Empirical Study on Bugs Inside PyTorch: A Replication Study

URL: http://arxiv.org/abs/2307.13777v2
Date: Tue, 1 Aug 2023 16:52:12 GMT
Title: An Empirical Study on Bugs Inside PyTorch: A Replication Study
Authors: Sharon Chee Yin Ho and Vahid Majdinasab and Mohayeminul Islam and Diego Elias Costa and Emad Shihab and Foutse Khomh and Sarah Nadi and Muhammad Raza
Abstract summary: We characterize bugs in the PyTorch library, a very popular deep learning framework. Our results highlight that PyTorch bugs resemble bugs in traditional software projects more than they reflect deep learning characteristics.
Score: 10.848682558737494
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use deep learning libraries. Libraries like PyTorch and TensorFlow empower a large variety of intelligent systems, offering a multitude of algorithms and configuration options, applicable to numerous domains of systems. However, bugs in those popular deep learning libraries may also have dire consequences for the quality of the systems they enable; thus, it is important to understand how bugs are identified and fixed in those libraries. Inspired by a study of Jia et al., which investigates the bug identification and fixing process in TensorFlow, we characterize bugs in the PyTorch library, a very popular deep learning framework. We investigate the causes and symptoms of bugs identified during PyTorch's development, assess their locality within the project, and extract patterns of bug fixes. Our results highlight that PyTorch bugs resemble bugs in traditional software projects more than they reflect deep learning characteristics. Finally, we also compare our results with the study on TensorFlow, highlighting similarities and differences across the bug identification and fixing process.

Related papers

Analyzing the Usage of Donation Platforms for PyPI Libraries [91.97201077607862]
This study analyzes the adoption of donation platforms in the PyPI ecosystem. GitHub Sponsors is the dominant platform, though many PyPI-listed links are outdated.
arXiv Detail & Related papers (2025-03-11T10:27:31Z)
Leveraging Data Characteristics for Bug Localization in Deep Learning Programs [21.563130049562357]
We propose Theia, which detects and localizes structural bugs in Deep Learning (DL) programs. Our results show that Theia successfully localizes 57/75 structural bugs in 40 buggy programs, whereas NeuraLint, a state-of-the-art approach capable of localizing structural bugs before training, localizes 17/75 bugs.
arXiv Detail & Related papers (2024-12-08T01:52:06Z)
Towards Understanding the Challenges of Bug Localization in Deep Learning Systems [2.9312156642007294]
We conduct a large-scale empirical study to better understand the challenges of localizing bugs in deep-learning systems. First, we determine the bug localization performance of four existing techniques using 2,365 bugs from deep-learning systems and 2,913 from traditional software. Second, we evaluate how different bug types in deep learning systems impact bug localization.
arXiv Detail & Related papers (2024-02-01T21:17:42Z)
Towards Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study [13.17302533571231]
This paper examines the critical issue of reproducing deep learning bugs. We identify edit actions and useful information that could improve the reproducibility of these bugs. We successfully reproduced 148 out of 165 bugs attempted.
arXiv Detail & Related papers (2024-01-05T21:30:13Z)
Software issues report for bug fixing process: An empirical study of machine-learning libraries [0.0]
We investigated the effectiveness of issue resolution for bug-fixing processes in six machine-learning libraries. The most common categories of issues that arise in machine-learning libraries are bugs, documentation, optimization, crashes, enhancement, new feature requests, build/CI, support, and performance. This study concludes that efficient issue-tracking processes, effective communication, and collaboration are vital for effective resolution of issues and bug fixing processes in machine-learning libraries.
arXiv Detail & Related papers (2023-12-10T21:33:19Z)
Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem [38.44692482370243]
Python is widely used in the open-source community, largely owing to the extensive support from diverse third-party libraries. Third-party libraries can potentially lead to conflicts in dependencies, prompting researchers to develop dependency conflict detectors. Efforts have also been made to automatically infer dependencies.
arXiv Detail & Related papers (2023-10-19T09:07:51Z)
Automatic Static Bug Detection for Machine Learning Libraries: Are We There Yet? [14.917820383894124]
We analyze five popular and widely used static bug detectors, i.e., Flawfinder, RATS, Cppcheck, Facebook Infer, and Clang, on a curated dataset of software bugs. Overall, our study shows that static bug detectors find a negligible share of all bugs, accounting for 6/410 bugs (about 1.5%). Flawfinder and RATS are the most effective static checkers for finding software bugs in machine learning libraries.
arXiv Detail & Related papers (2023-07-09T01:38:52Z)
PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps). It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks. It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches. We release SequeL as an open-source library, enabling researchers and developers to easily experiment with and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z)
BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization. We provide a general benchmark with a diversity of real and synthetic Java bugs. We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network. To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
LibFewShot: A Comprehensive Library for Few-shot Learning [78.58842209282724]
Few-shot learning, especially few-shot image classification, has received increasing attention and witnessed significant advances in recent years. Some recent studies implicitly show that many generic techniques or tricks, such as data augmentation, pre-training, knowledge distillation, and self-supervision, may greatly boost the performance of a few-shot learning method. We propose a comprehensive library for few-shot learning (LibFewShot) by re-implementing seventeen state-of-the-art few-shot learning methods in a unified framework with the same single codebase in PyTorch.
arXiv Detail & Related papers (2021-09-10T14:12:37Z)
Captum: A unified and generic model interpretability library for PyTorch [49.72749684393332]
We introduce a novel, unified, open-source model interpretability library for PyTorch. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms. It can be used for both classification and non-classification models.
arXiv Detail & Related papers (2020-09-16T18:57:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.