Related papers: Towards a Framework for Deep Learning Certification in Safety-Critical Applications Using Inherently Safe Design and Run-Time Error Detection

URL: http://arxiv.org/abs/2403.14678v1
Date: Tue, 12 Mar 2024 11:38:45 GMT
Title: Towards a Framework for Deep Learning Certification in Safety-Critical Applications Using Inherently Safe Design and Run-Time Error Detection
Authors: Romeo Valentin
Abstract summary: We consider real-world problems arising in aviation and other safety-critical areas, and investigate their requirements for a certified model. We establish a new framework towards deep learning certification based on (i) inherently safe design, and (ii) run-time error detection.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Although an ever-growing number of applications employ deep learning based systems for prediction, decision-making, or state estimation, almost no certification processes have been established that would allow such systems to be deployed in safety-critical applications. In this work we consider real-world problems arising in aviation and other safety-critical areas, and investigate their requirements for a certified model. To this end, we investigate methodologies from the machine learning research community aimed towards verifying robustness and reliability of deep learning systems, and evaluate these methodologies with regard to their applicability to real-world problems. Then, we establish a new framework towards deep learning certification based on (i) inherently safe design, and (ii) run-time error detection. Using a concrete use case from aviation, we show how deep learning models can recover disentangled variables through the use of weakly-supervised representation learning. We argue that such a system design is inherently less prone to common model failures, and can be verified to encode underlying mechanisms governing the data. Then, we investigate four techniques related to the run-time safety of a model, namely (i) uncertainty quantification, (ii) out-of-distribution detection, (iii) feature collapse, and (iv) adversarial attacks. We evaluate each for their applicability and formulate a set of desiderata that a certified model should fulfill. Finally, we propose a novel model structure that exhibits all desired properties discussed in this work, and is able to make regression and uncertainty predictions, as well as detect out-of-distribution inputs, while requiring no regression labels to train. We conclude with a discussion of the current state and expected future progress of deep learning certification, and its industrial and social implications.

Related papers

Deep Learning Models for Robust Facial Liveness Detection [56.08694048252482]
This study introduces a robust solution through novel deep learning models addressing the deficiencies in contemporary anti-spoofing techniques. By innovatively integrating texture analysis and reflective properties associated with genuine human traits, our models distinguish authentic presence from replicas with remarkable precision.
arXiv Detail & Related papers (2025-08-12T17:19:20Z)
Towards Reliable Forgetting: A Survey on Machine Unlearning Verification [26.88376128769619]
This paper presents the first structured survey of machine unlearning verification methods. We propose a taxonomy that organizes current techniques into two principal categories -- behavioral verification and parametric verification. We examine their underlying assumptions, strengths, and limitations, and identify potential vulnerabilities in practical deployment.
arXiv Detail & Related papers (2025-06-18T03:33:59Z)
A Retention-Centric Framework for Continual Learning with Guaranteed Model Developmental Safety [75.8161094916476]
In real-world applications, learning-enabled systems often undergo iterative model development to address challenging or emerging tasks. Acquiring new capabilities or improving existing ones may inadvertently cause the model to lose good capabilities of the old model, a phenomenon known as catastrophic forgetting. We propose a retention-centric framework with data-dependent constraints, and study how to continually develop a pretrained CLIP model for acquiring new or improving existing capabilities of image classification.
arXiv Detail & Related papers (2024-10-04T22:34:58Z)
Data-Driven Distributionally Robust Safety Verification Using Barrier Certificates and Conditional Mean Embeddings [0.24578723416255752]
In pursuit of scalable formal verification algorithms that do not shift the problem to unrealistic assumptions, we employ the concept of barrier certificates. We show how to solve the resulting program efficiently using sum-of-squares optimization and a Gaussian process envelope.
arXiv Detail & Related papers (2024-03-15T17:32:02Z)
Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification. We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations. Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
A Theoretical and Practical Framework for Evaluating Uncertainty Calibration in Object Detection [1.8843687952462744]
This work presents a novel theoretical and practical framework to evaluate object detection systems in the context of uncertainty calibration. The robustness of the proposed uncertainty calibration metrics is shown through a series of representative experiments.
arXiv Detail & Related papers (2023-09-01T14:02:44Z)
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification [0.491574468325115]
We present a large-scale empirical study that, for the first time, enables benchmarking of confidence scoring functions. The revelation of a simple softmax response baseline as the overall best-performing method underlines the drastic shortcomings of current evaluation.
arXiv Detail & Related papers (2022-11-28T12:25:27Z)
Robust Deep Learning for Autonomous Driving [0.0]
We introduce a new criterion to reliably estimate model confidence: the true class probability (TCP). Since the true class is by essence unknown at test time, we propose to learn the TCP criterion from data with an auxiliary model, introducing a specific learning scheme adapted to this context. We tackle the challenge of jointly detecting misclassified and out-of-distribution samples by introducing a new uncertainty measure based on evidential models and defined on the simplex.
arXiv Detail & Related papers (2022-11-14T22:07:11Z)
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers. We then present the pointwise feasibility conditions of the resulting safety controller. We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various use cases in manufacturing. Most research has focused on maximising predictive accuracy without addressing the uncertainty associated with it. In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning. These measures should account for the wide variety of models used in practice. The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis. On standard benchmarks, our method obtains results comparable to those of formal verifiers. Our approach allows safety properties of decision-making models to be evaluated efficiently in practical applications.
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has posed unanswered questions about what they learn from data. A Generative Adversarial Network (GAN) and multi-objectives are used to furnish a plausible attack on the audited model. Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.