CrashJS: A NodeJS Benchmark for Automated Crash Reproduction
- URL: http://arxiv.org/abs/2405.05541v1
- Date: Thu, 9 May 2024 04:57:10 GMT
- Title: CrashJS: A NodeJS Benchmark for Automated Crash Reproduction
- Authors: Philip Oliver, Jens Dietrich, Craig Anslow, Michael Homer
- Abstract summary: Software bugs often lead to software crashes, which cost US companies upwards of $2.08 trillion annually.
Automated Crash Reproduction aims to generate unit tests that successfully reproduce a crash.
CrashJS is a benchmark dataset of 453 Node.js crashes from several sources.
- Score: 4.3560886861249255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software bugs often lead to software crashes, which cost US companies upwards of $2.08 trillion annually. Automated Crash Reproduction (ACR) aims to generate unit tests that successfully reproduce a crash. The goal of ACR is to aid developers with debugging, providing them with another tool to locate where a bug is in a program. The main approach ACR currently takes is to replicate a stack trace from an error thrown within a program. Currently, ACR has been developed for C, Java, and Python, but there are no tools targeting JavaScript programs. To aid the development of JavaScript ACR tools, we propose CrashJS: a benchmark dataset of 453 Node.js crashes from several sources. CrashJS includes a mix of real-world and synthesised tests, multiple projects, and different levels of complexity for both crashes and target programs.
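To make the stack-trace replication idea concrete, here is a minimal, hypothetical Node.js example (not drawn from the CrashJS dataset; file and function names are illustrative): a function that crashes with a TypeError, followed by the kind of reproducing unit test an ACR tool aims to generate, triggering the same exception from the same frame.

```javascript
// Hypothetical crash: calling formatName({}) throws, e.g.
//   TypeError: Cannot read properties of undefined (reading 'toUpperCase')
//       at formatName (app.js:5:21)
function formatName(user) {
  return user.name.toUpperCase();
}

// ACR-style reproducing test: it triggers the same exception type from the
// same top stack frame, replicating the original crash's stack trace.
const { test } = require('node:test');
const assert = require('node:assert');

test('reproduces TypeError in formatName', () => {
  assert.throws(() => formatName({}), TypeError);
});
```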
Related papers
- Mutation-Based Deep Learning Framework Testing Method in JavaScript Environment [16.67312523556796]
We propose DLJSFuzzer, a mutation-based testing method for deep learning frameworks in the JavaScript environment.
DLJSFuzzer successfully detects 21 unique crashes and 126 unique NaN & Inconsistency bugs.
Compared to all baselines, DLJSFuzzer improves model generation efficiency by over 47% and bug detection efficiency by over 91%.
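The summary above does not describe DLJSFuzzer's mutation operators (which target deep learning models); purely as a generic, assumed sketch of a mutation-based fuzzing loop that flags crashes and NaN results, it might look like this:

```javascript
// Generic mutation-based fuzzing loop (schematic sketch only; DLJSFuzzer's
// actual operators mutate DL models and are not described in this summary).
function fuzz(seeds, mutate, runTarget, iterations = 1000) {
  const findings = [];
  for (let i = 0; i < iterations; i++) {
    const input = mutate(seeds[i % seeds.length]); // derive a new test case
    try {
      const result = runTarget(input);
      if (Number.isNaN(result)) findings.push({ input, kind: 'NaN' });
    } catch (err) {
      findings.push({ input, kind: 'crash', err }); // unique crashes are deduplicated later
    }
  }
  return findings;
}
```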
arXiv Detail & Related papers (2024-09-23T12:37:56Z) - KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z) - Concolic Testing of JavaScript using Sparkplug [6.902028735328818]
In-situ concolic testing for JavaScript is effective but slow and complex.
Our method enhances tracing with the V8 Sparkplug baseline compiler and the remill library for assembly-to-LLVM-IR conversion.
arXiv Detail & Related papers (2024-05-10T22:11:53Z) - CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace [30.48737611250448]
This paper proposes an approach named CrashTranslator to automatically reproduce mobile application crashes directly from the stack trace.
We evaluate CrashTranslator on 75 crash reports involving 58 popular Android apps, and it successfully reproduces 61.3% of the crashes.
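CrashTranslator targets Android and its pipeline is not detailed in this summary; as a loose analogue in this document's Node.js setting, the sketch below (all names assumed) parses a V8-style stack trace into structured frames, the kind of input a trace-driven reproduction tool starts from.

```javascript
// Sketch: turn a raw V8-style stack trace ("at fn (file:line:col)") into frames.
function parseStackTrace(stack) {
  const frameRe = /^\s*at\s+(?:(.+?)\s+\()?(.+?):(\d+):(\d+)\)?\s*$/;
  return stack
    .split('\n')
    .map((line) => frameRe.exec(line))
    .filter(Boolean)
    .map(([, fn, file, line, column]) => ({
      fn: fn || '<anonymous>',
      file,
      line: Number(line),
      column: Number(column),
    }));
}

// The first line of `new Error().stack` is the message; the rest are frames.
console.log(parseStackTrace(new Error('boom').stack));
```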
arXiv Detail & Related papers (2023-10-11T02:00:18Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose RAP-Gen, a novel Retrieval-Augmented Patch Generation framework.
RAP-Gen explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
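The summary does not specify RAP-Gen's retriever; as a minimal sketch of the retrieval step under a simplifying assumption (plain token-overlap similarity over stored bug-fix pairs, not RAP-Gen's actual method), it could look like this:

```javascript
// Assumed token-overlap retrieval over previous bug-fix pairs; the retrieved
// pairs would then be passed to the patch generator as fix patterns.
const tokenize = (code) => new Set(code.split(/\W+/).filter(Boolean));

function jaccard(a, b) {
  const inter = [...a].filter((t) => b.has(t)).length;
  return inter / (a.size + b.size - inter || 1);
}

function retrieveFixPatterns(buggyCode, fixCorpus, k = 3) {
  const query = tokenize(buggyCode);
  return fixCorpus // [{ buggy: '...', fixed: '...' }, ...]
    .map((pair) => ({ pair, score: jaccard(query, tokenize(pair.buggy)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.pair);
}
```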
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - Automatic Root Cause Analysis via Large Language Models for Cloud Incidents [51.94361026233668]
We introduce RCACopilot, an on-call system empowered by a large language model for automating root cause analysis of cloud incidents.
RCACopilot matches incoming incidents to corresponding incident handlers based on their alert types, aggregates the critical runtime diagnostic information, predicts the incident's root cause category, and provides an explanatory narrative.
We evaluate RCACopilot using a real-world dataset consisting of a year's worth of incidents from Microsoft.
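A schematic sketch of the described pipeline, with all names and data shapes assumed rather than taken from RCACopilot: route an incident to a handler by alert type, aggregate diagnostics, then hand off to a predictor for the root-cause category and narrative.

```javascript
// handlers: alertType -> async (incident) => diagnostic records (assumed shape)
const handlers = new Map();

async function handleIncident(incident, predictRootCause) {
  const handler = handlers.get(incident.alertType);
  if (!handler) throw new Error(`no handler for alert type ${incident.alertType}`);
  const diagnostics = await handler(incident); // aggregate critical runtime info
  return predictRootCause(incident, diagnostics); // category + explanatory narrative
}
```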
arXiv Detail & Related papers (2023-05-25T06:44:50Z) - RunBugRun -- An Executable Dataset for Automated Program Repair [15.670905650869704]
We present a fully executable dataset of 450,000 small buggy/fixed program pairs originally submitted to programming competition websites.
We provide infrastructure to compile, safely execute and test programs as well as fine-grained bug-type labels.
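As an illustration of the "safely execute and test" part, here is a minimal Node.js harness sketch; it assumes the program is already compiled to an executable and reduces "safety" to a timeout and output cap, whereas real infrastructure would need proper sandboxing.

```javascript
const { execFile } = require('node:child_process');

// Run one test case against a compiled program and classify the outcome.
function runTestCase(cmd, args, stdin, expectedStdout, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const child = execFile(
      cmd,
      args,
      { timeout: timeoutMs, maxBuffer: 1024 * 1024 },
      (err, stdout) => {
        if (err && err.killed) return resolve({ verdict: 'timeout' });
        if (err) return resolve({ verdict: 'runtime_error', err });
        const ok = stdout.trim() === expectedStdout.trim();
        resolve({ verdict: ok ? 'pass' : 'wrong_output' });
      }
    );
    child.stdin.end(stdin); // feed the test input
  });
}
```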
arXiv Detail & Related papers (2023-04-03T16:02:00Z) - Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
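In use, such a ranker simply reorders sampled programs by predicted correctness so that pass@1 is measured on the top choice; a sketch (with `predictCorrectness` standing in for the learned model, an assumption):

```javascript
// Rank candidate programs best-first by a learned correctness score,
// without executing any of them.
function rankCandidates(candidates, predictCorrectness) {
  return candidates
    .map((program) => ({ program, score: predictCorrectness(program) }))
    .sort((a, b) => b.score - a.score)
    .map((c) => c.program);
}

// pass@1 then depends only on the first-ranked program:
// const passAt1 = runTests(rankCandidates(samples, ranker)[0]);
```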
arXiv Detail & Related papers (2022-06-04T22:01:05Z) - SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact has been a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z) - S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It uses a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
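A schematic TensorFlow.js version of such a Siamese model is sketched below; the layer sizes and the way the two encodings are combined are assumptions, since the summary only mentions a biLSTM encoder and a fully-connected classifier.

```javascript
const tf = require('@tensorflow/tfjs');

// Siamese similarity model: a shared biLSTM encoder applied to both stack
// traces, followed by a small fully-connected classifier (sizes assumed).
function buildSimilarityModel(vocabSize, maxFrames) {
  const embed = tf.layers.embedding({ inputDim: vocabSize, outputDim: 50 });
  const biLstm = tf.layers.bidirectional({ layer: tf.layers.lstm({ units: 100 }) });
  const encode = (x) => biLstm.apply(embed.apply(x)); // shared weights

  const a = tf.input({ shape: [maxFrames] }); // tokenised stack trace A
  const b = tf.input({ shape: [maxFrames] }); // tokenised stack trace B
  const merged = tf.layers.concatenate().apply([encode(a), encode(b)]);
  const hidden = tf.layers.dense({ units: 64, activation: 'relu' }).apply(merged);
  const similarity = tf.layers.dense({ units: 1, activation: 'sigmoid' }).apply(hidden);

  return tf.model({ inputs: [a, b], outputs: similarity });
}
```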
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.