Related papers: Demystifying Dependency Bugs in Deep Learning Stack

Demystifying Dependency Bugs in Deep Learning Stack

URL: http://arxiv.org/abs/2207.10347v2
Date: Fri, 1 Sep 2023 16:54:38 GMT
Title: Demystifying Dependency Bugs in Deep Learning Stack
Authors: Kaifeng Huang, Bihuan Chen, Susheng Wu, Junmin Cao, Lei Ma, Xin Peng
Abstract summary: This paper characterizes symptoms, root causes and fix patterns of dependency bugs (DBs) across the whole Deep Learning stack. Our findings shed light on practical implications on dependency management.
Score: 7.488059560714949
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning (DL) applications, built upon a heterogeneous and complex DL stack (e.g., Nvidia GPU, Linux, CUDA driver, Python runtime, and TensorFlow), are subject to software and hardware dependencies across the DL stack. One challenge in dependency management across the entire engineering lifecycle is posed by the asynchronous and radical evolution and the complex version constraints among dependencies. Developers may introduce dependency bugs (DBs) in selecting, using and maintaining dependencies. However, the characteristics of DBs in DL stack is still under-investigated, hindering practical solutions to dependency management in DL stack. To bridge this gap, this paper presents the first comprehensive study to characterize symptoms, root causes and fix patterns of DBs across the whole DL stack with 446 DBs collected from StackOverflow posts and GitHub issues. For each DB, we first investigate the symptom as well as the lifecycle stage and dependency where the symptom is exposed. Then, we analyze the root cause as well as the lifecycle stage and dependency where the root cause is introduced. Finally, we explore the fix pattern and the knowledge sources that are used to fix it. Our findings from this study shed light on practical implications on dependency management.

Related papers

Insights into Dependency Maintenance Trends in the Maven Ecosystem [0.14999444543328289]
We present a quantitative analysis of the Neo4j dataset using the Goblin framework. Our analysis reveals that releases with fewer dependencies have a higher number of missed releases. Our study shows that the dependencies in the latest releases have positive freshness scores, indicating better software management efficacy.
arXiv Detail & Related papers (2025-03-28T22:20:24Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution. We show that ExploraCoder significantly improves performance for models lacking prior API knowledge, achieving an absolute increase of 11.24% over niave RAG approaches and 14.07% over pretraining methods in pass@10.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
An Overview and Catalogue of Dependency Challenges in Open Source Software Package Registries [52.23798016734889]
This article provides a catalogue of dependency-related challenges that come with relying on OSS packages or libraries. The catalogue is based on the scientific literature on empirical research that has been conducted to understand, quantify and overcome these challenges.
arXiv Detail & Related papers (2024-09-27T16:20:20Z)
A Preliminary Study on Self-Contained Libraries in the NPM Ecosystem [2.221643499902673]
The widespread of libraries within modern software ecosystems creates complex networks of dependencies. One mitigation strategy involves reducing dependencies; libraries with zero dependencies become to self-contained. This paper explores the characteristics of self-contained libraries within the NPM ecosystem.
arXiv Detail & Related papers (2024-06-17T09:33:49Z)
How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE) We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories. To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
An empirical study of bloated dependencies in CommonJS packages [6.115666382910127]
We conduct an empirical study to investigate the bloated dependencies that are entirely unused within server-side applications. We propose a trace-based dynamic analysis that monitors file access, to determine which dependencies are not accessed during runtime. Our findings suggest that native support for dependency debloating in package managers could significantly alleviate the burden of maintaining dependencies.
arXiv Detail & Related papers (2024-05-28T08:04:01Z)
See to Believe: Using Visualization To Motivate Updating Third-party Dependencies [1.7914660044009358]
Security vulnerabilities introduced by applications using third-party dependencies are on the increase. Developers are wary of library updates, even to fix vulnerabilities, citing that being unaware, or that the migration effort to update outweighs the decision. In this paper, we hypothesize that the dependency graph visualization (DGV) approach will motivate developers to update.
arXiv Detail & Related papers (2024-05-15T03:57:27Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
Less is More? An Empirical Study on Configuration Issues in Python PyPI Ecosystem [38.44692482370243]
Python is widely used in the open-source community, largely owing to the extensive support from diverse third-party libraries. Third-party libraries can potentially lead to conflicts in dependencies, prompting researchers to develop dependency conflict detectors. endeavors have been made to automatically infer dependencies.
arXiv Detail & Related papers (2023-10-19T09:07:51Z)
PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps) It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
A Data Source Dependency Analysis Framework for Large Scale Data Science Projects [0.0]
Data source dependency hell refers to the central role played by data and its unique quirks that often lead to unexpected failures of machine learning models. We present an automated dependency mapping framework that allows MLOps engineers to monitor the whole dependency map of their models in a fast paced engineering environment.
arXiv Detail & Related papers (2022-12-15T16:34:39Z)
Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching. AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage. Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z)
KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT) All tasks in KILT are grounded in the same snapshot of Wikipedia. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.