Fault Localization in Deep Learning-based Software: A System-level Approach
- URL: http://arxiv.org/abs/2411.08172v1
- Date: Tue, 12 Nov 2024 20:32:36 GMT
- Title: Fault Localization in Deep Learning-based Software: A System-level Approach
- Authors: Mohammad Mehdi Morovati, Amin Nikanjam, Foutse Khomh
- Abstract summary: We introduce FL4Deep, a system-level fault localization approach considering the entire Deep Learning development pipeline.
In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related faults.
- Abstract: Over the past decade, Deep Learning (DL) has become an integral part of our daily lives. This surge in DL usage has heightened the need for developing reliable DL software systems. Given that fault localization is a critical task in reliability assessment, researchers have proposed several fault localization techniques for DL-based software, primarily focusing on faults within the DL model. While the DL model is central to DL components, other elements also significantly impact the performance of DL components. As a result, fault localization methods that concentrate solely on the DL model overlook a large portion of the system. To address this, we introduce FL4Deep, a system-level fault localization approach that considers the entire DL development pipeline to effectively localize faults across DL-based systems. In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related fault types: issues related to data (84%), mismatched libraries between training and deployment (100%), and loss function (69%). Additionally, FL4Deep demonstrated superior precision and recall in fault localization for five fault categories: the three fault types just mentioned, plus insufficient training iterations and activation function faults.
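One of the system-level fault classes the abstract highlights, mismatched libraries between training and deployment, can be illustrated with a minimal check. The sketch below is hypothetical (the function and environment dictionaries are our own illustration, not part of FL4Deep): it compares package versions recorded at training time against those present in the deployment environment and reports any divergence.

```python
def find_library_mismatches(training_env, deployment_env):
    """Return packages whose versions differ between training and deployment.

    Each environment is a mapping of package name -> version string.
    A package missing from the deployment environment is reported with
    a deployment version of None.
    """
    mismatches = {}
    for pkg, train_ver in training_env.items():
        deploy_ver = deployment_env.get(pkg)
        if deploy_ver != train_ver:
            mismatches[pkg] = (train_ver, deploy_ver)
    return mismatches


# Hypothetical version snapshots for the two environments.
training_env = {"tensorflow": "2.12.0", "numpy": "1.23.5"}
deployment_env = {"tensorflow": "2.15.0", "numpy": "1.23.5"}

print(find_library_mismatches(training_env, deployment_env))
# {'tensorflow': ('2.12.0', '2.15.0')}
```

In practice, a system-level approach would capture such snapshots automatically (e.g., from a lock file or the runtime environment) rather than from hand-written dictionaries as above.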
Related papers
- Enhancing Fault Localization Through Ordered Code Analysis with LLM Agents and Self-Reflection [8.22737389683156]
Large Language Models (LLMs) offer promising improvements in fault localization by enhancing code comprehension and reasoning.
We introduce LLM4FL, a novel LLM-agent-based fault localization approach that integrates SBFL rankings with a divide-and-conquer strategy.
Our results demonstrate that LLM4FL outperforms AutoFL by 19.27% in Top-1 accuracy and surpasses state-of-the-art supervised techniques such as DeepFL and Grace.
arXiv Detail & Related papers (2024-09-20T16:47:34Z)
- AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models [95.09157454599605]
Large Language Models (LLMs) are becoming increasingly powerful, but they still exhibit significant but subtle weaknesses.
Traditional benchmarking approaches cannot thoroughly pinpoint specific model deficiencies.
We introduce a unified framework, AutoDetect, to automatically expose weaknesses in LLMs across various tasks.
arXiv Detail & Related papers (2024-06-24T15:16:45Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- DOMINO: Domain-aware Loss for Deep Learning Calibration [49.485186880996125]
This paper proposes a novel domain-aware loss function to calibrate deep learning models.
The proposed loss function applies a class-wise penalty based on the similarity between classes within a given target domain.
arXiv Detail & Related papers (2023-02-10T09:47:46Z)
- DeepFD: Automated Fault Diagnosis and Localization for Deep Learning Programs [15.081278640511998]
DeepFD is a learning-based fault diagnosis and localization framework.
It maps the fault localization task to a learning problem.
It correctly diagnoses 52% of faulty DL programs, roughly double the 27% achieved by the best prior state-of-the-art works.
arXiv Detail & Related papers (2022-05-04T08:15:56Z)
- Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study [4.415977307120617]
We conduct a data-driven analysis of challenges -- and resultant bugs -- involved in writing reliable yet performant imperative DL code.
We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code.
arXiv Detail & Related papers (2022-01-24T21:12:38Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Characterizing Performance Bugs in Deep Learning Systems [7.245989243616551]
We present the first comprehensive study to characterize symptoms, root causes, and exposing stages of performance bugs in deep learning systems.
Our findings shed light on implications for developing high-performance DL systems and for detecting and localizing PBs in DL systems.
We also build the first benchmark of 56 PBs in DL systems, and assess the capability of existing approaches in tackling them.
arXiv Detail & Related papers (2021-12-03T08:08:52Z)
- Deep Learning and Traffic Classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications [72.02908263225919]
We share our experience on a commercial-grade DL traffic classification engine.
We identify known applications from encrypted traffic, as well as unknown zero-day applications.
We propose a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art.
arXiv Detail & Related papers (2021-04-07T15:21:22Z)
- An Empirical Study on Deployment Faults of Deep Learning Based Mobile Applications [7.58063287182615]
Mobile Deep Learning (DL) apps integrate DL models trained using large-scale data with DL programs.
This paper presents the first comprehensive study on the deployment faults of mobile DL apps.
We construct a fine-grained taxonomy of 23 categories of fault symptoms and distill common fix strategies for different fault types.
arXiv Detail & Related papers (2021-01-13T08:19:50Z)
- A Survey of Deep Active Learning [54.376820959917005]
Active learning (AL) attempts to maximize a model's performance gain while labeling the fewest samples.
Deep learning (DL) is greedy for data, requiring large amounts of it to optimize massive numbers of parameters.
Deep active learning (DAL) has emerged to combine the two.
arXiv Detail & Related papers (2020-08-30T04:28:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.