Characterizing Performance Bugs in Deep Learning Systems
- URL: http://arxiv.org/abs/2112.01771v1
- Date: Fri, 3 Dec 2021 08:08:52 GMT
- Title: Characterizing Performance Bugs in Deep Learning Systems
- Authors: Junming Cao, Bihuan Chen, Chao Sun, Longjie Hu, Xin Peng
- Abstract summary: We present the first comprehensive study to characterize symptoms, root causes, and exposing stages of performance bugs in deep learning systems.
Our findings shed light on the implications for developing high-performance DL systems, and for detecting and localizing PBs in DL systems.
We also build the first benchmark of 56 PBs in DL systems, and assess the capability of existing approaches in tackling them.
- Score: 7.245989243616551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning (DL) has been increasingly applied to a variety of domains. The
programming paradigm shift from traditional systems to DL systems poses unique
challenges in engineering DL systems. Performance is one of the challenges, and
performance bugs (PBs) in DL systems can cause severe consequences such as
excessive resource consumption and financial loss. While bugs in DL systems
have been extensively investigated, PBs in DL systems have hardly been
explored. To bridge this gap, we present the first comprehensive study to
characterize symptoms, root causes, and introducing and exposing stages of PBs
in DL systems developed in TensorFlow and Keras, with a total of 238 PBs
collected from 225 StackOverflow posts. Our findings shed light on the
implications for developing high-performance DL systems, and for detecting and
localizing PBs in DL systems. We also build the first benchmark of 56 PBs in DL
systems, and assess the capability of existing approaches in tackling them.
Moreover, we develop a static checker DeepPerf to detect three types of PBs,
and identify 488 new PBs in 130 GitHub projects. 62 and 18 of them have been
confirmed and fixed by developers, respectively.
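The paper does not reproduce DeepPerf's implementation, but the kind of static check it performs can be illustrated. The sketch below is hypothetical and greatly simplified (it is not DeepPerf itself); it flags one well-known TensorFlow performance anti-pattern, creating graph ops such as `tf.Variable` inside a Python loop, which bloats the computation graph and wastes memory:

```python
import ast

# Hypothetical, simplified static checker: flag TensorFlow op-creation
# calls (e.g. tf.Variable, tf.constant) inside Python loops -- a classic
# source of graph bloat and slowdowns in graph-mode TF code.
OP_CREATORS = {"constant", "Variable", "placeholder", "matmul"}

def find_ops_in_loops(source: str):
    """Return (line_number, call_name) pairs for tf.* op creation in loops."""
    tree = ast.parse(source)
    findings = []
    for loop in ast.walk(tree):
        if isinstance(loop, (ast.For, ast.While)):
            for node in ast.walk(loop):
                if (isinstance(node, ast.Call)
                        and isinstance(node.func, ast.Attribute)
                        and isinstance(node.func.value, ast.Name)
                        and node.func.value.id == "tf"
                        and node.func.attr in OP_CREATORS):
                    findings.append((node.lineno, node.func.attr))
    return findings

snippet = """
import tensorflow as tf
for i in range(1000):
    w = tf.Variable(0.0)   # op created inside the loop: flagged
"""
print(find_ops_in_loops(snippet))  # → [(4, 'Variable')]
```

A real checker would also need alias tracking (e.g. `from tensorflow import Variable`) and an up-to-date catalog of op-creating APIs; this sketch only shows the AST-walking idea.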
Related papers
- Fault Localization in Deep Learning-based Software: A System-level Approach [12.546853096298175]
We introduce FL4Deep, a system-level fault localization approach considering the entire Deep Learning development pipeline.
In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related faults.
arXiv Detail & Related papers (2024-11-12T20:32:36Z)
- Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
- Tackling Long-Tailed Category Distribution Under Domain Shifts [50.21255304847395]
Existing approaches cannot handle the scenario where both issues exist.
We designed three novel core functional blocks including Distribution Calibrated Classification Loss, Visual-Semantic Mapping and Semantic-Similarity Guided Augmentation.
Two new datasets were proposed for this problem, named AWA2-LTS and ImageNet-LTS.
arXiv Detail & Related papers (2022-07-20T19:07:46Z)
- DeepFD: Automated Fault Diagnosis and Localization for Deep Learning Programs [15.081278640511998]
DeepFD is a learning-based fault diagnosis and localization framework.
It maps the fault localization task to a learning problem.
It correctly diagnoses 52% of faulty DL programs, compared with around half that rate (27%) achieved by the best state-of-the-art work.
arXiv Detail & Related papers (2022-05-04T08:15:56Z)
- Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study [4.415977307120617]
We conduct a data-driven analysis of challenges -- and resultant bugs -- involved in writing reliable yet performant imperative DL code.
We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code.
arXiv Detail & Related papers (2022-01-24T21:12:38Z)
- Deep Learning-based Implicit CSI Feedback in Massive MIMO [68.81204537021821]
We propose a DL-based implicit feedback architecture to inherit the low-overhead characteristic, which uses neural networks (NNs) to replace the precoding matrix indicator (PMI) encoding and decoding modules.
For a single resource block (RB), the proposed architecture can save 25.0% and 40.0% of overhead compared with Type I codebook under two antenna configurations.
arXiv Detail & Related papers (2021-05-21T02:43:02Z)
- EXPLAINABOARD: An Explainable Leaderboard for NLP [69.59340280972167]
ExplainaBoard is a new conceptualization and implementation of NLP evaluation.
It allows researchers to (i) diagnose strengths and weaknesses of a single system and (ii) interpret relationships between multiple systems.
arXiv Detail & Related papers (2021-04-13T17:45:50Z)
- An Empirical Study on Deployment Faults of Deep Learning Based Mobile Applications [7.58063287182615]
Mobile Deep Learning (DL) apps integrate DL models trained using large-scale data with DL programs.
This paper presents the first comprehensive study on the deployment faults of mobile DL apps.
We construct a fine-grained taxonomy consisting of 23 categories of fault symptoms and distill common fix strategies for different fault types.
arXiv Detail & Related papers (2021-01-13T08:19:50Z)
- A Survey of Deep Active Learning [54.376820959917005]
Active learning (AL) attempts to maximize a model's performance gain while labeling the fewest possible samples.
Deep learning (DL) is greedy for data, requiring a large data supply to optimize its massive numbers of parameters.
Deep active learning (DAL), combining the two, has emerged.
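As background for the pool-based setting such surveys cover, a minimal uncertainty-sampling loop can be sketched as follows. The toy model, oracle, and all names here are illustrative assumptions, not taken from the paper:

```python
# Minimal pool-based active learning loop with uncertainty sampling.
# Illustrative sketch of the AL setting only; the toy "model" and the
# oracle below are stand-ins, not from the survey.

def mean_model(labeled):
    # Toy binary "classifier": a logistic bump around the labeled mean.
    thr = sum(x for x, _ in labeled) / len(labeled)
    return lambda x: 1.0 / (1.0 + 2.718281828 ** -(x - thr))

def uncertainty(model, x):
    p = model(x)                     # predicted probability of class 1
    return 1.0 - abs(p - 0.5) * 2.0  # peaks when p is near 0.5

def active_learning(pool, oracle, model_factory, budget, seed_size=2):
    labeled = [(x, oracle(x)) for x in pool[:seed_size]]
    unlabeled = list(pool[seed_size:])
    for _ in range(budget):
        model = model_factory(labeled)
        # Query the single most uncertain unlabeled sample.
        x = max(unlabeled, key=lambda u: uncertainty(model, u))
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # pay the oracle for one label
    return labeled

pool = [float(i) for i in range(10)]
oracle = lambda x: int(x >= 5.0)     # ground-truth labeler
result = active_learning(pool, oracle, mean_model, budget=3)
```

The loop spends its label budget on the samples the current model is least sure about, which is the core trade-off DAL methods refine for deep models.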
arXiv Detail & Related papers (2020-08-30T04:28:31Z)
- Model-based Exploration of the Frontier of Behaviours for Deep Learning System Testing [4.632232395989182]
Deep Learning (DL) systems produce an output for any arbitrary numeric vector provided as input, regardless of whether it is within or outside the validity domain of the system under test.
In this paper, we introduce the notion of frontier of behaviours, i.e., the inputs at which the DL system starts to misbehave.
We developed DeepJanus, a search-based tool that generates frontier inputs for DL systems.
arXiv Detail & Related papers (2020-07-06T14:42:11Z)
- Data Mining with Big Data in Intrusion Detection Systems: A Systematic Literature Review [68.15472610671748]
Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation.
The rapid rate and volume of data creation has begun to pose significant challenges for data management and security.
The design and deployment of intrusion detection systems (IDS) in the big data setting has, therefore, become a topic of importance.
arXiv Detail & Related papers (2020-05-23T20:57:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.