Large-scale Crash Localization using Multi-Task Learning
- URL: http://arxiv.org/abs/2109.14326v1
- Date: Wed, 29 Sep 2021 10:26:57 GMT
- Title: Large-scale Crash Localization using Multi-Task Learning
- Authors: Manish Shetty, Chetan Bansal, Suman Nath, Sean Bowles, Henry Wang,
Ozgur Arman, Siamak Ahari
- Abstract summary: We develop a novel multi-task sequence labeling approach for identifying blamed frames in stack traces.
We evaluate our model with over a million real-world crashes from four popular Microsoft applications.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crash localization, an important step in debugging crashes, is challenging
when dealing with an extremely large number of diverse applications and
platforms and underlying root causes. Large-scale error reporting systems,
e.g., Windows Error Reporting (WER), commonly rely on manually developed rules
and heuristics to localize blamed frames causing the crashes. As new
applications and features are routinely introduced and existing applications
are run under new environments, developing new rules and maintaining existing
ones become extremely challenging. We propose a data-driven solution to address
the problem. We start with the first large-scale empirical study of 362K
crashes and their blamed methods reported to WER by tens of thousands of
applications running in the field. The analysis provides valuable insights on
where and how the crashes happen and what methods to blame for the crashes.
These insights enable us to develop DeepAnalyze, a novel multi-task sequence
labeling approach for identifying blamed frames in stack traces. We evaluate
our model with over a million real-world crashes from four popular Microsoft
applications and show that DeepAnalyze, trained with crashes from one set of
applications, not only accurately localizes crashes of the same applications,
but also bootstraps crash localization for other applications with zero to very
little additional training data.
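The abstract frames blamed-frame identification as sequence labeling over the frames of a stack trace. As a rough illustration of that formulation only (not of the DeepAnalyze model, whose multi-task neural architecture this sketch omits), the toy Python below assigns one binary label per frame, using a simple rule of thumb (blame the topmost non-system frame) in place of a learned model; the module names and the system-module list are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    module: str
    method: str

# Illustrative stand-in for a learned model: treat a few runtime/system
# modules as unlikely culprits. These names are examples, not a real list.
SYSTEM_MODULES = {"ntdll", "kernelbase", "clr"}

def label_blamed(stack: List[Frame]) -> List[int]:
    """Return one 0/1 label per frame; 1 marks the blamed frame.

    Sequence-labeling view: the stack trace is the input sequence and the
    output is a label sequence of the same length. Here a rule-based
    baseline blames the topmost non-system frame (index 0 = top of stack).
    """
    labels = [0] * len(stack)
    for i, frame in enumerate(stack):
        if frame.module.lower() not in SYSTEM_MODULES:
            labels[i] = 1
            break
    return labels

stack = [
    Frame("ntdll", "RtlRaiseException"),
    Frame("clr", "JIT_Throw"),
    Frame("myapp", "ParseConfig"),
    Frame("myapp", "Main"),
]
print(label_blamed(stack))  # [0, 0, 1, 0]
```

In the paper's learned setting, the per-frame scorer would be a trained sequence model sharing representations across auxiliary tasks, rather than a fixed module denylist.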
Related papers
- Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization
We propose an explainable crashing fault localization approach by combining static analysis and LLM techniques.
Our primary insight is that understanding the semantics of exception-throwing statements in the framework code can help find and understand the buggy methods in the app code.
Based on this idea, first, we design the exception-thrown summary (ETS) that describes the key elements related to each framework-specific exception.
Then we perform data tracking of these key elements to identify and rank buggy candidates for the given crash.
arXiv Detail & Related papers (2024-08-22T02:18:35Z)
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We further formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Crash Report Accumulation During Continuous Fuzzing
We propose a crash accumulation method and implement it as part of the CASR toolset.
We evaluate our approach on crash reports collected from fuzzing results.
arXiv Detail & Related papers (2024-05-28T13:36:31Z)
- The Impact Of Bug Localization Based on Crash Report Mining: A Developers' Perspective
We report our experience of using an approach for grouping crash reports and finding buggy code on a weekly basis for 18 months.
The approach investigated in this study correctly suggested the buggy file most of the time -- the approach's precision was around 80%.
arXiv Detail & Related papers (2024-03-16T01:23:01Z)
- Resolving Crash Bugs via Large Language Models: An Empirical Study
Crash bugs cause unexpected program behaviors or even termination, requiring high-priority resolution.
ChatGPT, a recent large language model (LLM), has garnered significant attention due to its exceptional performance across various domains.
This work performs the first investigation into ChatGPT's capability to resolve real-world crash bugs, focusing on its effectiveness in both localizing and repairing code-related and environment-related crash bugs.
arXiv Detail & Related papers (2023-12-16T13:41:04Z)
- CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace
This paper proposes an approach named CrashTranslator to automatically reproduce mobile application crashes directly from the stack trace.
We evaluate CrashTranslator on 75 crash reports involving 58 popular Android apps, and it successfully reproduces 61.3% of the crashes.
arXiv Detail & Related papers (2023-10-11T02:00:18Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy on the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which implements only a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Do Different Tracking Tasks Require Different Appearance Models?
We present UniTrack, a unified tracking solution to address five different tasks within the same framework.
UniTrack consists of a single and task-agnostic appearance model, which can be learned in a supervised or self-supervised fashion.
We show how most tracking tasks can be solved within this framework, and that the same appearance model can be used to obtain performance that is competitive against specialised methods for all the five tasks considered.
arXiv Detail & Related papers (2021-07-05T17:40:17Z)
- A Background-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.