Dynamic Model Agnostic Reliability Evaluation of Machine-Learning
Methods Integrated in Instrumentation & Control Systems
- URL: http://arxiv.org/abs/2308.05120v1
- Date: Tue, 8 Aug 2023 18:25:42 GMT
- Title: Dynamic Model Agnostic Reliability Evaluation of Machine-Learning
Methods Integrated in Instrumentation & Control Systems
- Authors: Edward Chen, Han Bao, Nam Dinh
- Abstract summary: The trustworthiness of data-driven, neural network-based machine learning algorithms is not adequately assessed.
In recent reports by the National Institute of Standards and Technology, trustworthiness in ML is a critical barrier to adoption.
We demonstrate a real-time model-agnostic method to evaluate the relative reliability of ML predictions by incorporating out-of-distribution detection on the training dataset.
- Score: 1.8978726202765634
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, the field of data-driven neural network-based machine
learning (ML) algorithms has grown significantly and spurred research in its
applicability to instrumentation and control systems. While they are promising
in operational contexts, the trustworthiness of such algorithms is not
adequately assessed. Failures of ML-integrated systems are poorly understood;
the lack of comprehensive risk modeling can degrade the trustworthiness of
these systems. In recent reports by the National Institute of Standards and
Technology, trustworthiness in ML is a critical barrier to adoption and will
play a vital role in intelligent systems' safe and accountable operation. Thus,
in this work, we demonstrate a real-time model-agnostic method to evaluate the
relative reliability of ML predictions by incorporating out-of-distribution
detection on the training dataset. It is well documented that ML algorithms
excel at interpolation (or near-interpolation) tasks but significantly degrade
at extrapolation. This occurs when new samples are "far" from training samples.
The method, referred to as the Laplacian distributed decay for reliability
(LADDR), determines the difference between the operational and training
datasets, which is used to calculate a prediction's relative reliability. LADDR
is demonstrated on a feedforward neural network-based model used to predict
safety-significant factors during different loss-of-flow transients. LADDR is
intended as a "data supervisor" and determines the appropriateness of
well-trained ML models in the context of operational conditions. Ultimately,
LADDR illustrates how training data can be used as evidence to support the
trustworthiness of ML predictions when utilized for conventional interpolation
tasks.
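The abstract does not spell out LADDR's formulas, but the core idea - score a prediction's relative reliability by how far the operational input lies from the training data, with a Laplacian-style decay - can be sketched as below. The nearest-neighbour Euclidean distance and the median-based decay scale b are illustrative assumptions, not the paper's definitions.

```python
# Minimal sketch of a LADDR-style relative-reliability score (not the authors'
# reference implementation). Assumption: Euclidean nearest-neighbour distance
# measures how "far" an operational sample is from the training set, and a
# Laplacian (exponential) decay maps that distance to a score in (0, 1].
import numpy as np

def nearest_training_distance(x_op, X_train):
    """Euclidean distance from one operational sample to its closest training sample."""
    return float(np.min(np.linalg.norm(X_train - x_op, axis=1)))

def relative_reliability(x_op, X_train, decay_scale):
    """Laplacian-style decay: ~1 near the training data, approaching 0 far from it."""
    d = nearest_training_distance(x_op, X_train)
    return float(np.exp(-d / decay_scale))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(500, 4))       # stand-in for the model's training inputs
    x_interp = X_train.mean(axis=0)           # query near the bulk of the training data
    x_extrap = X_train.mean(axis=0) + 10.0    # query far outside the training data

    # Assumed decay scale: median leave-one-out nearest-neighbour distance over a
    # subset of the training data (one plausible choice, not the paper's).
    b = float(np.median([nearest_training_distance(x, np.delete(X_train, i, axis=0))
                         for i, x in enumerate(X_train[:50])]))

    print("reliability (interpolation):", relative_reliability(x_interp, X_train, b))
    print("reliability (extrapolation):", relative_reliability(x_extrap, X_train, b))
```

In this toy run the near-interpolation query scores close to 1, while the far-from-training query decays toward 0, which is the qualitative behaviour the abstract attributes to ML models forced to extrapolate.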
Related papers
- The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others [1.654278807602897]
This study introduces the Misclassification Likelihood Matrix (MLM) as a novel tool for quantifying the reliability of neural network predictions under distribution shifts.
The implications of this work extend beyond image classification, with ongoing applications in autonomous systems, such as self-driving cars.
arXiv Detail & Related papers (2024-07-10T16:43:14Z)
- Evaluation of Predictive Reliability to Foster Trust in Artificial Intelligence. A case study in Multiple Sclerosis [0.34473740271026115]
Spotting Machine Learning failures is of paramount importance when ML predictions are used to drive clinical decisions.
We propose a simple approach that can be used in the deployment phase of any ML model to suggest whether to trust predictions or not.
Our method holds the promise to provide effective support to clinicians by spotting potential ML failures during deployment.
arXiv Detail & Related papers (2024-02-27T14:48:07Z)
- Clustering and Uncertainty Analysis to Improve the Machine Learning-based Predictions of SAFARI-1 Control Follower Assembly Axial Neutron Flux Profiles [2.517043342442487]
The goal of this work is to develop accurate Machine Learning (ML) models for predicting the assembly axial neutron flux profiles in the SAFARI-1 research reactor.
The data-driven nature of ML models makes them susceptible to uncertainties which are introduced by sources such as noise in training data.
The aim of this work is to improve the ML models for the control assemblies by a combination of supervised and unsupervised ML algorithms.
arXiv Detail & Related papers (2023-12-20T20:22:13Z)
- Scope Compliance Uncertainty Estimate [0.4262974002462632]
SafeML is a model-agnostic approach for monitoring whether operational data stays within the scope the model was trained on.
This work addresses the limitations of that binary in-scope/out-of-scope decision by replacing it with a continuous metric.
arXiv Detail & Related papers (2023-12-17T19:44:20Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild [4.304803366354879]
We study causes of impaired robustness to domain shifts and present algorithms for training domain robust models.
A key source of model brittleness is domain overfitting, which our new training algorithms suppress in favor of domain-general hypotheses.
arXiv Detail & Related papers (2023-03-05T21:41:16Z)
- Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, such models can exhibit artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
- Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for the interpretability and reliability of ML models that is agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)