Large-scale Crash Localization using Multi-Task Learning
- URL: http://arxiv.org/abs/2109.14326v1
- Date: Wed, 29 Sep 2021 10:26:57 GMT
- Title: Large-scale Crash Localization using Multi-Task Learning
- Authors: Manish Shetty, Chetan Bansal, Suman Nath, Sean Bowles, Henry Wang,
Ozgur Arman, Siamak Ahari
- Abstract summary: We develop a novel multi-task sequence labeling approach for identifying blamed frames in stack traces.
We evaluate our model with over a million real-world crashes from four popular Microsoft applications.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crash localization, an important step in debugging crashes, is challenging
when dealing with an extremely large number of diverse applications and
platforms and underlying root causes. Large-scale error reporting systems,
e.g., Windows Error Reporting (WER), commonly rely on manually developed rules
and heuristics to localize blamed frames causing the crashes. As new
applications and features are routinely introduced and existing applications
are run under new environments, developing new rules and maintaining existing
ones become extremely challenging. We propose a data-driven solution to address
the problem. We start with the first large-scale empirical study of 362K
crashes and their blamed methods reported to WER by tens of thousands of
applications running in the field. The analysis provides valuable insights on
where and how the crashes happen and what methods to blame for the crashes.
These insights enable us to develop DeepAnalyze, a novel multi-task sequence
labeling approach for identifying blamed frames in stack traces. We evaluate
our model with over a million real-world crashes from four popular Microsoft
applications and show that DeepAnalyze, trained with crashes from one set of
applications, not only accurately localizes crashes of the same applications,
but also bootstraps crash localization for other applications with zero to very
little additional training data.
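The abstract frames blamed-frame identification as sequence labeling over the frames of a stack trace. As a rough illustration of that formulation only (not of the DeepAnalyze model, whose multi-task neural architecture this sketch omits), the toy Python below assigns one binary label per frame, using a simple rule of thumb (blame the topmost non-system frame) in place of a learned model; the module names and the system-module list are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    module: str
    method: str

# Illustrative stand-in for a learned model: treat a few runtime/system
# modules as unlikely culprits. These names are examples, not a real list.
SYSTEM_MODULES = {"ntdll", "kernelbase", "clr"}

def label_blamed(stack: List[Frame]) -> List[int]:
    """Return one 0/1 label per frame; 1 marks the blamed frame.

    Sequence-labeling view: the stack trace is the input sequence and the
    output is a label sequence of the same length. Here a rule-based
    baseline blames the topmost non-system frame (index 0 = top of stack).
    """
    labels = [0] * len(stack)
    for i, frame in enumerate(stack):
        if frame.module.lower() not in SYSTEM_MODULES:
            labels[i] = 1
            break
    return labels

stack = [
    Frame("ntdll", "RtlRaiseException"),
    Frame("clr", "JIT_Throw"),
    Frame("myapp", "ParseConfig"),
    Frame("myapp", "Main"),
]
print(label_blamed(stack))  # [0, 0, 1, 0]
```

In the paper's learned setting, the per-frame scorer would be a trained sequence model sharing representations across auxiliary tasks, rather than a fixed module denylist.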
Related papers
- Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization
We propose an explainable crashing fault localization approach by combining static analysis and LLM techniques.
Our primary insight is that understanding the semantics of exception-throwing statements in the framework code can help find and understand the buggy methods in the app code.
Based on this idea, first, we design the exception-thrown summary (ETS) that describes the key elements related to each framework-specific exception.
Then we perform data tracking of these key elements to identify and rank buggy candidates for the given crash.
arXiv Detail & Related papers (2024-08-22T02:18:35Z)
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We further formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Crash Report Accumulation During Continuous Fuzzing
We propose a crash accumulation method and implement it as part of the CASR toolset.
We evaluate our approach on crash reports collected from fuzzing results.
arXiv Detail & Related papers (2024-05-28T13:36:31Z)
- The Impact Of Bug Localization Based on Crash Report Mining: A Developers' Perspective
We report our experience of using an approach for grouping crash reports and finding buggy code on a weekly basis for 18 months.
The approach investigated in this study correctly suggested the buggy file most of the time -- the approach's precision was around 80%.
arXiv Detail & Related papers (2024-03-16T01:23:01Z)
- Resolving Crash Bugs via Large Language Models: An Empirical Study
Crash bugs cause unexpected program behaviors or even termination, requiring high-priority resolution.
ChatGPT, a recent large language model (LLM), has garnered significant attention due to its exceptional performance across various domains.
This work performs the first investigation into ChatGPT's capability to resolve real-world crash bugs, focusing on its effectiveness in both localizing and repairing code-related and environment-related crash bugs.
arXiv Detail & Related papers (2023-12-16T13:41:04Z)
- CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace
This paper proposes an approach named CrashTranslator to automatically reproduce mobile application crashes directly from the stack trace.
We evaluate CrashTranslator on 75 crash reports involving 58 popular Android apps, and it successfully reproduces 61.3% of the crashes.
arXiv Detail & Related papers (2023-10-11T02:00:18Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy on the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which implements only a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Do Different Tracking Tasks Require Different Appearance Models?
We present UniTrack, a unified tracking solution to address five different tasks within the same framework.
UniTrack consists of a single and task-agnostic appearance model, which can be learned in a supervised or self-supervised fashion.
We show how most tracking tasks can be solved within this framework, and that the same appearance model can be used to obtain performance that is competitive against specialised methods for all the five tasks considered.
arXiv Detail & Related papers (2021-07-05T17:40:17Z)
- A Background-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.