SURE: A Visualized Failure Indexing Approach using Program Memory
Spectrum
- URL: http://arxiv.org/abs/2310.12415v2
- Date: Thu, 2 Nov 2023 08:17:56 GMT
- Title: SURE: A Visualized Failure Indexing Approach using Program Memory
Spectrum
- Authors: Yi Song, Xihao Zhang, Xiaoyuan Xie, Songqiang Chen, Quanming Liu,
Ruizhi Gao
- Abstract summary: We propose SURE, a viSUalized failuRe indExing approach using the program memory spectrum.
We first collect the run-time memory information at preset breakpoints during the execution of failed test cases.
Any pair of PMS images that serve as proxies for two failures is fed to a trained Siamese convolutional neural network.
- Score: 2.4151044161696587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Failure indexing is a longstanding crux in software testing and debugging,
the goal of which is to automatically divide failures (e.g., failed test cases)
into distinct groups according to the culprit root causes, as such multiple
faults in a faulty program can be handled independently and simultaneously.
This community has long been plagued by two challenges: 1) The effectiveness of
division is still far from promising. Existing techniques only employ a limited
source of run-time data (e.g., code coverage) to be failure proximity, which
typically delivers unsatisfactory results. 2) The outcome can be hardly
comprehensible. A developer who receives the failure indexing result does not
know why all failures should be divided the way they are. This leads to
difficulties for developers to be convinced by the result, which in turn
affects the adoption of the results. To tackle these challenges, in this paper,
we propose SURE, a viSUalized failuRe indExing approach using the program
memory spectrum. We first collect the run-time memory information at preset
breakpoints during the execution of failed test cases, and transform it into
human-friendly images (called program memory spectrum, PMS). Then, any pair of
PMS images that serve as proxies for two failures is fed to a trained Siamese
convolutional neural network, to predict the likelihood of them being triggered
by the same fault. Results demonstrate the effectiveness of SURE: It achieves
101.20% and 41.38% improvements in faults number estimation, as well as 105.20%
and 35.53% improvements in clustering, compared with the state-of-the-art
technique in this field, in simulated and real-world environments,
respectively. Moreover, we carry out a human study to quantitatively evaluate
the comprehensibility of PMS, revealing that this novel type of representation
can help developers better comprehend failure indexing results.
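The pipeline described in the abstract (memory snapshots at breakpoints, conversion to images, pairwise same-fault prediction, grouping) can be roughly sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `to_pixels`, `pair_score`, `index_failures`, the 0.8 threshold, and the sample data are all hypothetical. In particular, `pair_score` is a simple pixel-difference stand-in for the trained Siamese convolutional neural network, and the union-find grouping is an assumed way of turning pair scores into failure groups.

```python
from itertools import combinations


def to_pixels(snapshot, lo, hi):
    """Scale one memory snapshot (variable values at a breakpoint) to 0-255 pixels."""
    span = (hi - lo) or 1
    return [round(255 * (min(max(v, lo), hi) - lo) / span) for v in snapshot]


def pair_score(img_a, img_b):
    """Stand-in for the trained Siamese CNN: a same-fault score in [0, 1].

    Assumption: similarity is 1 minus the normalized mean absolute pixel difference.
    """
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    ]
    return 1 - sum(diffs) / (255 * len(diffs))


def index_failures(images, threshold=0.8):
    """Group failures whose pairwise score reaches the threshold (union-find)."""
    parent = list(range(len(images)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(images)), 2):
        if pair_score(images[i], images[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(images)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical data: each failure has two breakpoint snapshots of three variables.
raw_failures = [
    [[0, 5, 10], [1, 5, 9]],
    [[0, 5, 10], [1, 6, 9]],
    [[10, 0, 0], [9, 1, 0]],
]
images = [[to_pixels(snap, 0, 10) for snap in failure] for failure in raw_failures]
groups = index_failures(images)
print(groups)            # failure indices grouped by presumed fault
print(len(groups))       # estimated number of faults
```

In this toy setup the number of groups plays the role of the estimated number of faults, and the group membership plays the role of the clustering result; a learned model would replace `pair_score`.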
Related papers
- BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts [5.402030962296633]
Early Exit (EE) techniques have emerged as a means to reduce inference latency in Deep Neural Networks (DNNs).
We propose a new decision criterion, BEEM, in which exit classifiers are treated as experts and their confidence scores are aggregated.
We show that our method enhances the performance of state-of-the-art EE methods, achieving improvements in speed-up by a factor of 1.5x to 2.1x.
arXiv Detail & Related papers (2025-02-02T10:35:19Z)
- Can Search-Based Testing with Pareto Optimization Effectively Cover Failure-Revealing Test Inputs? [2.038863628148453]
We argue that search-based software testing (SBST) is inadequate for covering failure-inducing areas within a search domain.
We measure the coverage of failure-revealing test inputs in the input space using a metric that we refer to as the Coverage Inverted Distance quality indicator.
arXiv Detail & Related papers (2024-10-15T16:44:40Z)
- Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress [31.952925824381325]
We propose a runtime monitoring framework that splits the detection of failures into two complementary categories.
We use Vision Language Models (VLMs) to detect when the policy confidently and consistently takes actions that do not solve the task.
By unifying temporal consistency detection and VLM runtime monitoring, Sentinel detects 18% more failures than using either of the two detectors alone.
arXiv Detail & Related papers (2024-10-06T22:13:30Z)
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Planning for Sample Efficient Imitation Learning [52.44953015011569]
Current imitation algorithms struggle to achieve high performance and high in-environment sample efficiency simultaneously.
We propose EfficientImitate, a planning-based imitation learning method that can achieve high in-environment sample efficiency and performance simultaneously.
Experimental results show that EI achieves state-of-the-art results in performance and sample efficiency.
arXiv Detail & Related papers (2022-10-18T05:19:26Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, that only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints [3.2374399328078285]
Graphical structures estimated by causal learning algorithms from time series data can provide misleading causal information if the causal timescale of the generating process fails to match the measurement timescale of the data.
Existing algorithms provide limited resources to respond to this challenge, and so researchers must either use models that they know are likely misleading, or else forego causal learning entirely.
Existing methods face up to four distinct shortfalls, as they might 1) require that the difference between causal and measurement timescales is known; 2) only handle a very small number of random variables when the timescale difference is unknown; 3) only apply to pairs of variables; or 4) be unable to
arXiv Detail & Related papers (2022-05-18T22:38:57Z)
- Intervention Efficient Algorithm for Two-Stage Causal MDPs [15.838256272508357]
We study Markov Decision Processes (MDP) wherein states correspond to causal graphs that generate rewards.
In this setup, the learner's goal is to identify atomic interventions that lead to high rewards by intervening on variables at each state.
Generalizing the recent causal-bandit framework, the current work develops (simple) regret minimization guarantees for two-stage causal MDPs.
arXiv Detail & Related papers (2021-11-01T12:22:37Z)
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- Global Optimization of Objective Functions Represented by ReLU Networks [77.55969359556032]
Neural networks can learn complex, non-convex functions, and it is challenging to guarantee their correct behavior in safety-critical contexts.
Many approaches exist to find failures in networks (e.g., adversarial examples), but these cannot guarantee the absence of failures.
We propose an approach that integrates the optimization process into the verification procedure, achieving better performance than the naive approach.
arXiv Detail & Related papers (2020-10-07T08:19:48Z)
- DARTS-: Robustly Stepping out of Performance Collapse Without Indicators [74.21019737169675]
Differentiable architecture search suffers from long-standing performance instability.
Indicators such as Hessian eigenvalues are proposed as a signal to stop searching before the performance collapses.
In this paper, we undertake a more subtle and direct approach to resolve the collapse.
arXiv Detail & Related papers (2020-09-02T12:54:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.