Related papers: CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace

URL: http://arxiv.org/abs/2310.07128v1
Date: Wed, 11 Oct 2023 02:00:18 GMT
Title: CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace
Authors: Yuchao Huang, Junjie Wang, Zhe Liu, Yawen Wang, Song Wang, Chunyang Chen, Yuanzhe Hu, Qing Wang
Abstract summary: This paper proposes an approach named CrashTranslator to automatically reproduce mobile application crashes directly from the stack trace. We evaluate CrashTranslator on 75 crash reports involving 58 popular Android apps, and it successfully reproduces 61.3% of the crashes.
Score: 30.48737611250448
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Crash reports are vital for software maintenance because they inform developers of the problems encountered in a mobile application. Before fixing a crash, developers need to reproduce it, which is an extremely time-consuming and tedious task. Existing studies automate crash reproduction using reproducing steps described in natural language. Yet we find that a non-negligible portion of crash reports contain only the stack trace captured when the crash occurred. Such stack-trace-only crashes merely reveal the last GUI page at the time of the crash and lack step-by-step guidance. Developers tend to spend more effort understanding the problem and reproducing the crash, and existing techniques cannot handle these reports, which increases the need for automatic support. This paper proposes an approach named CrashTranslator to automatically reproduce mobile application crashes directly from the stack trace. It accomplishes this by leveraging a pre-trained Large Language Model to predict the exploration steps for triggering the crash, and by designing a reinforcement-learning-based technique to mitigate inaccurate predictions and guide the search holistically. We evaluate CrashTranslator on 75 crash reports involving 58 popular Android apps, and it successfully reproduces 61.3% of the crashes, outperforming the state-of-the-art baselines by 109% to 206%. In addition, the average reproducing time is 68.7 seconds, outperforming the baselines by 302% to 1611%. We also evaluate the usefulness of CrashTranslator with promising results.

Related papers

Fault Localization via Fine-tuning Large Language Models with Mutation Generated Stack Traces [3.3158239079459655]
We present a novel approach to localize faults based only on the stack trace information and no additional runtime information. By fine-tuning on 64,369 crashes resulting from 4.1 million mutations of the code base, we can correctly predict the root cause location of a crash with an accuracy of 66.9%.
arXiv Detail & Related papers (2025-01-29T21:40:32Z)
Why do Machine Learning Notebooks Crash? [1.8292110434077904]
We collect 64,031 ML notebooks containing 92,542 crashes from GitHub and Kaggle. We analyze a sample of 746 crashes across various aspects, including exception types and root causes. Our analysis reveals that 87% of crashes are caused by API misuse, data confusion, notebook-specific issues, environment problems, and implementation errors.
arXiv Detail & Related papers (2024-11-25T09:33:08Z)
Better Debugging: Combining Static Analysis and LLMs for Explainable Crashing Fault Localization [12.103194723136406]
We propose an explainable crashing fault localization approach that combines static analysis and LLM techniques. Our primary insight is that understanding the semantics of exception-throwing statements in the framework code can help find and understand the buggy methods in the app code. Based on this idea, we first design the exception-thrown summary (ETS), which describes the key elements related to each framework-specific exception. We then perform data tracking on its key elements to identify and rank buggy candidates for the given crash.
arXiv Detail & Related papers (2024-08-22T02:18:35Z)
AutoBencher: Towards Declarative Benchmark Construction [74.54640925146289]
We use AutoBencher to create datasets for math, multilinguality, knowledge, and safety. The scalability of AutoBencher allows it to test fine-grained categories of knowledge, creating datasets that elicit 22% more model errors (i.e., difficulty) than existing benchmarks.
arXiv Detail & Related papers (2024-07-11T10:03:47Z)
VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce VDebugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step. VDebugger identifies and corrects program errors by leveraging detailed execution feedback, improving interpretability and accuracy. Evaluations on six datasets demonstrate VDebugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports. We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
Crash Report Accumulation During Continuous Fuzzing [0.0]
We propose a crash accumulation method and implement it as part of the CASR toolset. We evaluate our approach on crash reports collected from fuzzing results.
arXiv Detail & Related papers (2024-05-28T13:36:31Z)
The Impact Of Bug Localization Based on Crash Report Mining: A Developers' Perspective [7.952391285456257]
We report our experience of using an approach for grouping crash reports and finding buggy code on a weekly basis for 18 months. The approach investigated in this study correctly suggested the buggy file most of the time -- the approach's precision was around 80%.
arXiv Detail & Related papers (2024-03-16T01:23:01Z)
Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation. We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving. The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z)
Large-scale Crash Localization using Multi-Task Learning [3.4383679424643456]
We develop a novel multi-task sequence labeling approach for identifying blamed frames in stack traces. We evaluate our model with over a million real-world crashes from four popular Microsoft applications.
arXiv Detail & Related papers (2021-09-29T10:26:57Z)
Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object Detection [55.12894776039135]
State-of-the-art 3D object detectors, based on deep learning, have shown promising accuracy but are prone to over-fit to domain idiosyncrasies. We propose a novel learning approach that drastically reduces this gap by fine-tuning the detector on pseudo-labels in the target domain. We show, on five autonomous driving datasets, that fine-tuning the detector on these pseudo-labels substantially reduces the domain gap to new driving environments.
arXiv Detail & Related papers (2021-03-26T01:18:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.