Validity-Preserving Delta Debugging via Generator Trace Reduction
- URL: http://arxiv.org/abs/2402.04623v3
- Date: Wed, 04 Dec 2024 15:09:31 GMT
- Title: Validity-Preserving Delta Debugging via Generator Trace Reduction
- Authors: Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, Tao Xie,
- Abstract summary: GReduce searches for other executions on the generator that yield reduced, valid test inputs.
GReduce substantially outperforms state-of-the-art syntax-based reducers including Perses and T-PDD.
- Score: 14.24086822861706
- License:
- Abstract: Reducing test inputs that trigger bugs is crucial for efficient debugging. Delta debugging is the most popular approach for this purpose. When test inputs need to conform to certain specifications, existing delta debugging practice encounters a validity problem: it blindly applies reduction rules, producing a large number of invalid test inputs that do not satisfy the required specifications. This overall diminishing effectiveness and efficiency becomes even more pronounced when the specifications extend beyond syntactical structures. Our key insight is that we should leverage input generators, which are aware of these specifications, to generate valid reduced inputs, rather than straightforwardly performing reduction on test inputs. In this paper, we propose a generator-based delta debugging method, namely GReduce, which derives validity-preserving reducers. Specifically, given a generator and its execution, demonstrating how the bug-inducing test input is generated, GReduce searches for other executions on the generator that yield reduced, valid test inputs. The evaluation results on five benchmarks (i.e., graphs, DL models, JavaScript programs, SymPy, and algebraic data types) show that GReduce substantially outperforms state-of-the-art syntax-based reducers including Perses and T-PDD, and also outperforms QuickCheck, SmartCheck, as well as the state-of-the-art choice-sequence-based reducer Hypothesis, demonstrating the effectiveness, efficiency, and versatility of GReduce.
Related papers
- Automated Proof Generation for Rust Code via Self-Evolution [69.25795662658356]
We introduce SAFE, a novel framework that overcomes the lack of human-written proof to enable automated proof generation of Rust code.
We demonstrate superior efficiency and precision compared to GPT-4o.
This advancement leads to a significant improvement in performance, achieving a 70.50% accuracy rate in a benchmark crafted by human experts.
arXiv Detail & Related papers (2024-10-21T08:15:45Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch token they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - FRGNN: Mitigating the Impact of Distribution Shift on Graph Neural
Networks via Test-Time Feature Reconstruction [13.21683198528012]
A distribution shift can adversely affect the test performance of Graph Neural Networks (GNNs)
We propose FR-GNN, a general framework for GNNs to conduct feature reconstruction.
Notably, the reconstructed node features can be directly utilized for testing the well-trained model.
arXiv Detail & Related papers (2023-08-18T02:34:37Z) - EEL: Efficiently Encoding Lattices for Reranking [44.77383151122229]
We use Transformers to efficiently encode lattices of generated outputs.
We combine this approach with a new class of token-factored rerankers (TFRs)
Our results show both substantial speedup compared to naive reranking and often better performance on downstream metrics than comparable approaches.
arXiv Detail & Related papers (2023-06-01T17:45:32Z) - Applying and Extending the Delta Debugging Algorithm for Elevator
Dispatching Algorithms (Experience Paper) [7.289672463326423]
In an elevator dispatching algorithm, it is of high benefit to provide the minimal test input to the software developers.
In this paper, we enhance this technique by first monitoring the environment at which the CPS operates as well as its physical states.
In a second step, we use such identified stable states to help the delta debug algorithm isolate the failure-inducing test inputs more efficiently.
arXiv Detail & Related papers (2023-05-28T19:27:24Z) - Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self- Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z) - A Fair Loss Function for Network Pruning [70.35230425589592]
We introduce the performance weighted loss function, a simple modified cross-entropy loss function that can be used to limit the introduction of biases during pruning.
Experiments using the CelebA, Fitzpatrick17k and CIFAR-10 datasets demonstrate that the proposed method is a simple and effective tool.
arXiv Detail & Related papers (2022-11-18T15:17:28Z) - TTAPS: Test-Time Adaption by Aligning Prototypes using Self-Supervision [70.05605071885914]
We propose a novel modification of the self-supervised training algorithm SwAV that adds the ability to adapt to single test samples.
We show the success of our method on the common benchmark dataset CIFAR10-C.
arXiv Detail & Related papers (2022-05-18T05:43:06Z) - Distribution-Aware Testing of Neural Networks Using Generative Models [5.618419134365903]
The reliability of software that has a Deep Neural Network (DNN) as a component is urgently important.
We show that three recent testing techniques generate significant number of invalid test inputs.
We propose a technique to incorporate the valid input space of the DNN model under test in the test generation process.
arXiv Detail & Related papers (2021-02-26T17:18:21Z) - PC-GAIN: Pseudo-label Conditional Generative Adversarial Imputation
Networks for Incomplete Data [19.952411963344556]
PC-GAIN is a novel unsupervised missing data imputation method named PC-GAIN.
We first propose a pre-training procedure to learn potential category information contained in a subset of low-missing-rate data.
Then an auxiliary classifier is determined using the synthetic pseudo-labels.
arXiv Detail & Related papers (2020-11-16T08:08:26Z) - AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.