Related papers: Validity-Preserving Delta Debugging via Generator Trace Reduction

URL: http://arxiv.org/abs/2402.04623v3
Date: Wed, 04 Dec 2024 15:09:31 GMT
Title: Validity-Preserving Delta Debugging via Generator Trace Reduction
Authors: Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, Tao Xie
Abstract summary: GReduce searches for other executions on the generator that yield reduced, valid test inputs. GReduce substantially outperforms state-of-the-art syntax-based reducers including Perses and T-PDD.
Score: 14.24086822861706
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reducing test inputs that trigger bugs is crucial for efficient debugging. Delta debugging is the most popular approach for this purpose. When test inputs need to conform to certain specifications, existing delta debugging practice encounters a validity problem: it blindly applies reduction rules, producing a large number of invalid test inputs that do not satisfy the required specifications. This overall diminishing effectiveness and efficiency becomes even more pronounced when the specifications extend beyond syntactical structures. Our key insight is that we should leverage input generators, which are aware of these specifications, to generate valid reduced inputs, rather than straightforwardly performing reduction on test inputs. In this paper, we propose a generator-based delta debugging method, namely GReduce, which derives validity-preserving reducers. Specifically, given a generator and its execution, demonstrating how the bug-inducing test input is generated, GReduce searches for other executions on the generator that yield reduced, valid test inputs. The evaluation results on five benchmarks (i.e., graphs, DL models, JavaScript programs, SymPy, and algebraic data types) show that GReduce substantially outperforms state-of-the-art syntax-based reducers including Perses and T-PDD, and also outperforms QuickCheck, SmartCheck, as well as the state-of-the-art choice-sequence-based reducer Hypothesis, demonstrating the effectiveness, efficiency, and versatility of GReduce.
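The core idea in the abstract (reduce the generator's execution rather than the test input itself) can be illustrated with a minimal Python sketch. This is not GReduce's algorithm: it is a simple greedy reducer over a recorded list of generator choices, closer in spirit to choice-sequence reduction, and every name in it (Replay, gen_sorted_list, reduce_trace, is_bug) is hypothetical. It assumes a toy generator whose validity specification is "the list is non-decreasing", so every candidate it produces is valid by construction.

```python
from typing import Callable, List, Tuple


class Replay:
    """Feeds recorded choices back to a generator; pads with 0 when exhausted."""

    def __init__(self, choices: List[int]) -> None:
        self.choices = choices
        self.pos = 0
        self.used: List[int] = []  # choices the generator actually consumed

    def draw(self, upper: int) -> int:
        """Return a choice in [0, upper], taken from the trace when available."""
        raw = self.choices[self.pos] if self.pos < len(self.choices) else 0
        self.pos += 1
        value = min(raw, upper)
        self.used.append(value)
        return value


def gen_sorted_list(src: Replay) -> List[int]:
    """Hypothetical generator: always yields a non-decreasing list."""
    out: List[int] = []
    current = 0
    while src.draw(1) == 1:      # 1 = append another element, 0 = stop
        current += src.draw(9)   # a non-negative step keeps the list sorted
        out.append(current)
    return out


def reduce_trace(
    gen: Callable[[Replay], List[int]],
    trace: List[int],
    is_bug: Callable[[List[int]], bool],
) -> Tuple[List[int], List[int]]:
    """Greedily delete or decrement trace entries while the regenerated input still fails."""

    def run(choices: List[int]) -> Tuple[List[int], List[int]]:
        src = Replay(choices)
        value = gen(src)
        return value, src.used

    best_value, best = run(trace)
    assert is_bug(best_value), "the starting trace must reproduce the bug"
    improved = True
    while improved:
        improved = False
        # Candidate traces: drop one entry, or lower one entry by 1.
        candidates = [best[:i] + best[i + 1:] for i in range(len(best))]
        candidates += [
            best[:i] + [best[i] - 1] + best[i + 1:]
            for i in range(len(best))
            if best[i] > 0
        ]
        for cand in candidates:
            value, used = run(cand)
            # Accept only strictly smaller traces (shorter, then lexicographic),
            # which guarantees termination.
            if is_bug(value) and (len(used), used) < (len(best), best):
                best_value, best, improved = value, used, True
                break
    return best_value, best


def is_bug(xs: List[int]) -> bool:
    """Toy failure condition: at least two elements and a last element >= 5."""
    return len(xs) >= 2 and xs[-1] >= 5


original_trace = [1, 3, 1, 4, 1, 2, 0]   # regenerates the list [3, 7, 9]
reduced_value, reduced_trace = reduce_trace(gen_sorted_list, original_trace, is_bug)
print(reduced_value, reduced_trace)
```

Because each candidate is regenerated by the generator, the reducer never has to test an unsorted list, which is the validity problem the abstract attributes to reducers that edit the input directly. The greedy search can stop at a local minimum; the paper's actual search over generator executions is not reproduced here.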

Related papers

XMutant: XAI-based Fuzzing for Deep Learning Systems [6.878645239814823]
XMutant is a technique that leverages explainable artificial intelligence (XAI) techniques to generate challenging test inputs. Our studies showed that XMutant enables more effective and efficient test generation by focusing on the most impactful parts of the input.
arXiv Detail & Related papers (2025-03-10T12:05:49Z)
Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, improvement is plateauing due to the exhaustion of readily available high-quality data. We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
Automated Proof Generation for Rust Code via Self-Evolution [69.25795662658356]
We introduce SAFE, a novel framework that overcomes the lack of human-written proof to enable automated proof generation of Rust code. We demonstrate superior efficiency and precision compared to GPT-4o. This advancement leads to a significant improvement in performance, achieving a 70.50% accuracy rate in a benchmark crafted by human experts.
arXiv Detail & Related papers (2024-10-21T08:15:45Z)
Enriching Automatic Test Case Generation by Extracting Relevant Test Inputs from Bug Reports [8.85274953789614]
name is a technique for exploring bug reports to identify input values that can be fed to automatic test generation tools. For Defects4J projects, our study has shown that name successfully extracted 68.68% of relevant inputs when using regular expression in its approach.
arXiv Detail & Related papers (2023-12-22T18:19:33Z)
Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT). We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
FRGNN: Mitigating the Impact of Distribution Shift on Graph Neural Networks via Test-Time Feature Reconstruction [13.21683198528012]
A distribution shift can adversely affect the test performance of Graph Neural Networks (GNNs). We propose FR-GNN, a general framework for GNNs to conduct feature reconstruction. Notably, the reconstructed node features can be directly utilized for testing the well-trained model.
arXiv Detail & Related papers (2023-08-18T02:34:37Z)
Applying and Extending the Delta Debugging Algorithm for Elevator Dispatching Algorithms (Experience Paper) [7.289672463326423]
In an elevator dispatching algorithm, it is of high benefit to provide the minimal test input to the software developers. In this paper, we enhance this technique by first monitoring the environment at which the CPS operates as well as its physical states. In a second step, we use such identified stable states to help the delta debug algorithm isolate the failure-inducing test inputs more efficiently.
arXiv Detail & Related papers (2023-05-28T19:27:24Z)
Align-DETR: Improving DETR with Simple IoU-aware BCE loss [32.13866392998818]
We propose a metric, recall of best-regressed samples, to quantitatively evaluate the misalignment problem. The proposed loss, IA-BCE, guides the training of DETR to build a strong correlation between classification score and localization precision. To overcome the dramatic decrease in sample quality induced by the sparsity of queries, we introduce a prime sample weighting mechanism.
arXiv Detail & Related papers (2023-04-15T10:24:51Z)
Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation. We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression [69.36555801766762]
We propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of possible decompositions. We experimentally show that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss.
arXiv Detail & Related papers (2022-11-30T05:31:45Z)
A Fair Loss Function for Network Pruning [70.35230425589592]
We introduce the performance weighted loss function, a simple modified cross-entropy loss function that can be used to limit the introduction of biases during pruning. Experiments using the CelebA, Fitzpatrick17k and CIFAR-10 datasets demonstrate that the proposed method is a simple and effective tool.
arXiv Detail & Related papers (2022-11-18T15:17:28Z)
TTAPS: Test-Time Adaption by Aligning Prototypes using Self-Supervision [70.05605071885914]
We propose a novel modification of the self-supervised training algorithm SwAV that adds the ability to adapt to single test samples. We show the success of our method on the common benchmark dataset CIFAR10-C.
arXiv Detail & Related papers (2022-05-18T05:43:06Z)
Latency Adjustable Transformer Encoder for Language Understanding [0.8287206589886879]
This paper proposes an efficient Transformer architecture that adjusts the inference computational cost adaptively with a desired inference latency speedup. The proposed method detects less important hidden sequence elements (word-vectors) and eliminates them in each encoder layer using a proposed Attention Context Contribution (ACC) metric. The proposed method mathematically and experimentally improves the inference latency of BERT_base and GPT-2 by up to 4.8 and 3.72 times with less than 0.75% accuracy drop and passable perplexity on average.
arXiv Detail & Related papers (2022-01-10T13:04:39Z)
Print Error Detection using Convolutional Neural Networks [0.0]
We propose a way to generate a print error sample artificially. Our final trained network gives a remarkable accuracy of 99.83% in testing.
arXiv Detail & Related papers (2021-04-11T16:30:17Z)
Distribution-Aware Testing of Neural Networks Using Generative Models [5.618419134365903]
The reliability of software that has a Deep Neural Network (DNN) as a component is urgently important. We show that three recent testing techniques generate a significant number of invalid test inputs. We propose a technique to incorporate the valid input space of the DNN model under test in the test generation process.
arXiv Detail & Related papers (2021-02-26T17:18:21Z)
PC-GAIN: Pseudo-label Conditional Generative Adversarial Imputation Networks for Incomplete Data [19.952411963344556]
PC-GAIN is a novel unsupervised missing data imputation method. We first propose a pre-training procedure to learn potential category information contained in a subset of low-missing-rate data. Then an auxiliary classifier is determined using the synthetic pseudo-labels.
arXiv Detail & Related papers (2020-11-16T08:08:26Z)
Sampling-Decomposable Generative Adversarial Recommender [84.05894139540048]
We propose a Sampling-Decomposable Generative Adversarial Recommender (SD-GAR). In the framework, the divergence between some generator and the optimum is compensated by self-normalized importance sampling. We extensively evaluate the proposed algorithm with five real-world recommendation datasets.
arXiv Detail & Related papers (2020-11-02T13:19:10Z)
PRover: Proof Generation for Interpretable Reasoning over Rules [81.40404921232192]
We propose a transformer-based model that answers binary questions over rule-bases and generates the corresponding proofs. Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm. We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation.
arXiv Detail & Related papers (2020-10-06T15:47:53Z)
AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation. Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.