Demonstrating Software Reliability using Possibly Correlated Tests:
Insights from a Conservative Bayesian Approach
- URL: http://arxiv.org/abs/2208.07935v3
- Date: Wed, 11 Oct 2023 13:18:41 GMT
- Title: Demonstrating Software Reliability using Possibly Correlated Tests:
Insights from a Conservative Bayesian Approach
- Authors: Kizito Salako, Xingyu Zhao
- Abstract summary: We formalise informal notions of "doubting" that the executions are independent.
We develop techniques that reveal the extent to which independence assumptions can undermine conservatism in assessments.
- Score: 2.152298082788376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents Bayesian techniques for conservative claims about
software reliability, particularly when evidence suggests the software's
executions are not statistically independent. We formalise informal notions of
"doubting" that the executions are independent, and incorporate such doubts
into reliability assessments. We develop techniques that reveal the extent to
which independence assumptions can undermine conservatism in assessments, and
identify conditions under which this impact is not significant. These
techniques - novel extensions of conservative Bayesian inference (CBI)
approaches - give conservative confidence bounds on the software's failure
probability per execution. With illustrations in two application areas -
nuclear power-plant safety and autonomous vehicle (AV) safety - our analyses
reveal: 1) the confidence an assessor should already possess before subjecting a
system to operational testing; without this confidence, such testing is futile -
favourable operational testing evidence will eventually decrease one's confidence
in the system being sufficiently reliable; 2) the independence assumption
sometimes supports conservative claims; 3) in some scenarios, observing a system
operate without failure gives less confidence in the system than if some failures
had been observed; 4) building confidence in a system is very sensitive to
failures - each additional failure means significantly more operational testing
is required to support a reliability claim.
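To make points 1) and 4) concrete, the sketch below is a standard Beta-Binomial
calculation, not the paper's CBI extensions: it assumes the very independence
between executions that the paper calls into doubt, and the Beta(1, 1) prior, the
1e-4 target bound, and the 90% confidence level are illustrative assumptions
rather than values taken from the paper. It computes the posterior confidence
that the failure probability per execution lies below the target, and the number
of executions needed to reach a given confidence level after k observed failures.

# A minimal sketch, NOT the paper's CBI method: assumes independent executions
# and a standard Beta-Binomial model; prior and target values are illustrative.
from scipy.stats import beta

def posterior_confidence(n, k, p_target, a=1.0, b=1.0):
    # P(p <= p_target | k failures in n independent executions), Beta(a, b) prior.
    return beta.cdf(p_target, a + k, b + n - k)

def executions_needed(confidence, p_target, k=0, a=1.0, b=1.0):
    # Smallest n reaching the requested confidence; the posterior confidence is
    # monotone in n for fixed k, so exponential search plus bisection suffices.
    lo, hi = k, k + 1
    while posterior_confidence(hi, k, p_target, a, b) < confidence:
        lo, hi = hi, 2 * hi
    while lo < hi:
        mid = (lo + hi) // 2
        if posterior_confidence(mid, k, p_target, a, b) < confidence:
            lo = mid + 1
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    p_target = 1e-4  # claimed bound on the failure probability per execution
    for k in (0, 1, 2):  # point 4): each extra failure demands many more tests
        n = executions_needed(0.90, p_target, k=k)
        print(f"{k} failure(s): {n} executions for 90% confidence that p <= {p_target}")

Under these assumptions the script reports roughly 23,000, 39,000, and 53,000
executions for 0, 1, and 2 failures respectively, illustrating how sharply the
testing burden grows with each additional failure.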
Related papers
- Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation.
We propose methods tailored to the unique properties of perception and decision-making.
We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z) - On the Robustness of Adversarial Training Against Uncertainty Attacks [9.180552487186485]
In learning problems, the noise inherent in the task at hand makes it impossible to infer without some degree of uncertainty.
In this work, we reveal both empirically and theoretically that defending against adversarial examples, i.e., carefully perturbed samples that cause misclassification, guarantees a more secure, trustworthy uncertainty estimate.
To support our claims, we evaluate multiple adversarial-robust models from the publicly available benchmark RobustBench on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2024-10-29T11:12:44Z) - Trustworthiness for an Ultra-Wideband Localization Service [2.4979362117484714]
This paper proposes a holistic trustworthiness assessment framework for ultra-wideband self-localization.
Our goal is to provide guidance for evaluating a system's trustworthiness based on objective evidence.
Our approach guarantees that the resulting trustworthiness indicators correspond to chosen real-world threats.
arXiv Detail & Related papers (2024-08-10T11:57:10Z) - Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We identify a general, widespread but largely overlooked phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z) - Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z) - Did You Mean...? Confidence-based Trade-offs in Semantic Parsing [52.28988386710333]
We show how a calibrated model can help balance common trade-offs in task-oriented parsing.
We then examine how confidence scores can help optimize the trade-off between usability and safety.
arXiv Detail & Related papers (2023-03-29T17:07:26Z) - Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Confidence Composition for Monitors of Verification Assumptions [3.500426151907193]
We propose a three-step framework for monitoring the confidence in verification assumptions.
In two case studies, we demonstrate that the composed monitors improve over their constituents and successfully predict safety violations.
arXiv Detail & Related papers (2021-11-03T18:14:35Z) - Reliability Testing for Natural Language Processing Systems [14.393308846231083]
We argue for the need for reliability testing and contextualize it among existing work on improving accountability.
We show how adversarial attacks can be reframed for this goal, via a framework for developing reliability tests.
arXiv Detail & Related papers (2021-05-06T11:24:58Z) - Assessing Safety-Critical Systems from Operational Testing: A Study on Autonomous Vehicles [3.629865579485447]
Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem.
We use Autonomous Vehicles (AVs) as a current example to revisit the problem of demonstrating high reliability.
arXiv Detail & Related papers (2020-08-19T19:50:56Z)