Related papers: Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization

Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization

URL: http://arxiv.org/abs/2408.12070v1
Date: Thu, 22 Aug 2024 02:18:35 GMT
Title: Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization
Authors: Jiwei Yan, Jinhao Huang, Chunrong Fang, Jun Yan, Jian Zhang,
Abstract summary: We propose an explainable crashing fault localization approach by combining static analysis and LLM techniques. Our primary insight is that understanding the semantics of exception-throwing statements in the framework code can help find and apprehend the buggy methods in the app code. Based on this idea, first, we design the exception-thrown summary (ETS) that describes the key elements related to each framework-specific exception. Then we make data-tracking of its key elements to identify and sort buggy candidates for the given crash.
Score: 12.103194723136406
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Nowadays, many applications do not exist independently but rely on various frameworks or libraries. The frequent evolution and the complex implementation of framework APIs induce many unexpected post-release crashes. Starting from the crash stack traces, existing approaches either perform direct call graph (CG) tracing or construct datasets with similar crash-fixing records to locate buggy methods. However, these approaches are limited by the completeness of CG or dependent on historical fixing records. Moreover, they fail to explain the buggy candidates by revealing their relationship with the crashing point. To fill the gap, we propose an explainable crashing fault localization approach by combining static analysis and LLM techniques. Our primary insight is that understanding the semantics of exception-throwing statements in the framework code can help find and apprehend the buggy methods in the app code. Based on this idea, first, we design the exception-thrown summary (ETS) that describes the key elements related to each framework-specific exception and extract ETSs by performing static analysis. Then we make data-tracking of its key elements to identify and sort buggy candidates for the given crash. After that, we introduce LLMs to improve the explainability of the localization results. To construct effective LLM prompts, we design the candidate information summary (CIS) that describes multiple types of explanation-related contexts and then extract CISs via static analysis. We apply our approach to one typical scenario, i.e., locating Android framework-specific crashing faults, and implement a tool CrashTracker. For fault localization, it exhibited an overall MRR value of 0.91 in precision. For fault explanation, compared to the naive one produced by static analysis only, the LLM-powered explanation achieved a 67.04% improvement in users' satisfaction score.

Related papers

May the Feedback Be with You! Unlocking the Power of Feedback-Driven Deep Learning Framework Fuzzing via LLMs [13.976286931563006]
A simple yet effective way to find bugs in Deep Learning (DL) frameworks is fuzz testing (Fuzzing)<n>We propose FUEL to break the seal of Feedback-driven fuzzing for DL frameworks.<n> FUEL has detected 104 bugs for PyTorch and summaries, with 93 confirmed as new bugs, 47 already fixed, and 5 assigned with CVE IDs.
arXiv Detail & Related papers (2025-06-21T08:51:53Z)
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs [17.497629884237647]
BugLens is a post-refinement framework that significantly improves static analysis precision. It raises precision from 0.10 (raw) and 0.50 (semi-automated refinement) to 0.72, substantially reducing false positives. Our results suggest that a structured LLM-based workflow can meaningfully enhance the effectiveness of static analysis tools.
arXiv Detail & Related papers (2025-04-16T02:17:06Z)
Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces [3.3158239079459655]
We present a novel approach to localize faults based only on the stack trace information and no additional runtime information. By fine-tuning on 64,369 crashes resulting from 4.1 million mutations of the code base, we can correctly predict the root cause location of a crash with an accuracy of 66.9%.
arXiv Detail & Related papers (2025-01-29T21:40:32Z)
Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair [2.9176578730256733]
We propose DEVLoRe to use issue content (description and message) and stack error traces to localize buggy methods.<n>By incorporating different artifacts, DEVLoRe successfully locates 49.3% and 47.6% of single and non-single buggy methods.<n>This outperforms current state-of-the-art APR methods.
arXiv Detail & Related papers (2024-12-05T06:21:31Z)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports. We further formulate the crash event feature learning as a novel text reasoning problem and further fine-tune various large language models (LLMs) to predict detailed accident outcomes. Our experiments results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
Are you still on track!? Catching LLM Task Drift with Activations [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users. We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set. We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
Crash Report Accumulation During Continuous Fuzzing [0.0]
We propose a crash accumulation method and implement it as part of the CASR toolset. We evaluate our approach on crash reports collected from fuzzing results.
arXiv Detail & Related papers (2024-05-28T13:36:31Z)
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction [75.25114727856861]
Large language models (LLMs) tend to suffer from deterioration at the latter stage ofSupervised fine-tuning process. We introduce a simple disperse-then-merge framework to address the issue. Our framework outperforms various sophisticated methods such as data curation and training regularization on a series of standard knowledge and reasoning benchmarks.
arXiv Detail & Related papers (2024-05-22T08:18:19Z)
Fake Alignment: Are LLMs Really Aligned Well? [91.26543768665778]
This study investigates the substantial discrepancy in performance between multiple-choice questions and open-ended questions. Inspired by research on jailbreak attack patterns, we argue this is caused by mismatched generalization.
arXiv Detail & Related papers (2023-11-10T08:01:23Z)
Better patching using LLM prompting, via Self-Consistency [5.892272127970584]
Self-consistency (S-C) is an exciting, substantially better technique for generating explanations for problems. This paper describes an application of the S-C approach to program repair, using the commit log on the fix as the explanation. We achieve state-of-the art results, beating previous approaches to prompting-based program repair on the MODIT dataset.
arXiv Detail & Related papers (2023-05-31T18:28:46Z)
Large-scale Crash Localization using Multi-Task Learning [3.4383679424643456]
We develop a novel multi-task sequence labeling approach for identifying blamed frames in stack traces. We evaluate our model with over a million real-world crashes from four popular Microsoft applications.
arXiv Detail & Related papers (2021-09-29T10:26:57Z)
S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning. It is based on a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
A Fault Localization and Debugging Support Framework driven by Bug Tracking Data [0.11915976684257382]
This thesis aims to provide a fault localization framework by combining data from various sources. To achieve this, a bug classification schema is introduced, benchmarks are created, and a novel fault localization method based on historical data is proposed.
arXiv Detail & Related papers (2021-03-03T13:23:13Z)
D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis [55.15995704119158]
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools. We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
TIDE: A General Toolbox for Identifying Object Detection Errors [28.83233218686898]
We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms. Our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system.
arXiv Detail & Related papers (2020-08-18T18:30:53Z)
Tracking Road Users using Constraint Programming [79.32806233778511]
We present a constraint programming (CP) approach for the data association phase found in the tracking-by-detection paradigm of the multiple object tracking (MOT) problem. Our proposed method was tested on a motorized vehicles tracking dataset and produces results that outperform the top methods of the UA-DETRAC benchmark.
arXiv Detail & Related papers (2020-03-10T00:04:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.