TRUST-LAPSE: An Explainable and Actionable Mistrust Scoring Framework
for Model Monitoring
- URL: http://arxiv.org/abs/2207.11290v2
- Date: Wed, 12 Jul 2023 19:26:04 GMT
- Title: TRUST-LAPSE: An Explainable and Actionable Mistrust Scoring Framework
for Model Monitoring
- Authors: Nandita Bhaskhar, Daniel L. Rubin, Christopher Lee-Messer
- Abstract summary: We propose TRUST-LAPSE, a "mistrust" scoring framework for continuous model monitoring.
We assess the trustworthiness of each input sample's model prediction using a sequence of latent-space embeddings.
Our latent-space mistrust scores achieve state-of-the-art results with AUROCs of 84.1 (vision), 73.9 (audio), and 77.1 (clinical EEGs).
- Score: 4.262769931159288
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continuous monitoring of trained ML models to determine when their
predictions should and should not be trusted is essential for their safe
deployment. Such a framework ought to be high-performing, explainable, post-hoc
and actionable. We propose TRUST-LAPSE, a "mistrust" scoring framework for
continuous model monitoring. We assess the trustworthiness of each input
sample's model prediction using a sequence of latent-space embeddings.
Specifically, (a) our latent-space mistrust score estimates mistrust using
distance metrics (Mahalanobis distance) and similarity metrics (cosine
similarity) in the latent-space and (b) our sequential mistrust score
determines deviations in correlations over the sequence of past input
representations in a non-parametric, sliding-window based algorithm for
actionable continuous monitoring. We evaluate TRUST-LAPSE via two downstream
tasks: (1) distributionally shifted input detection, and (2) data drift
detection. We evaluate across diverse domains: audio and vision using public
datasets, and we further benchmark our approach on challenging, real-world
electroencephalogram (EEG) datasets for seizure detection. Our latent-space
mistrust scores achieve state-of-the-art results with AUROCs of 84.1 (vision),
73.9 (audio), and 77.1 (clinical EEGs), outperforming baselines by over 10
points. We expose critical failures in popular baselines that remain
insensitive to input semantic content, rendering them unfit for real-world
model monitoring. We show that our sequential mistrust scores achieve high
drift detection rates; over 90% of the streams show < 20% error for all
domains. Through extensive qualitative and quantitative evaluations, we show
that our mistrust scores are more robust and provide explainability for easy
adoption into practice.
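The abstract names the ingredients of both scores but not their exact formulation. The snippet below is a minimal Python sketch, assuming per-class latent means with a shared covariance for the Mahalanobis/cosine term and a mean absolute change in windowed correlation matrices for the sequential term; the function names and the way the two quantities are combined are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two score types described in the abstract.
# The exact combination of terms and the window statistic are assumptions.
import numpy as np


def fit_latent_stats(train_embeddings: np.ndarray, train_labels: np.ndarray):
    """Estimate per-class means and a shared inverse covariance from training embeddings."""
    classes = np.unique(train_labels)
    means = {c: train_embeddings[train_labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([train_embeddings[train_labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(train_embeddings.shape[1])
    return means, np.linalg.inv(cov)


def latent_mistrust(z: np.ndarray, predicted_class, means, cov_inv) -> float:
    """Higher value = less trust in the model's prediction for embedding z."""
    mu = means[predicted_class]
    diff = z - mu
    mahalanobis = float(np.sqrt(diff @ cov_inv @ diff))
    cosine_sim = float(z @ mu / (np.linalg.norm(z) * np.linalg.norm(mu) + 1e-12))
    # Illustrative combination: far from the predicted class mean and pointing
    # away from it both push the mistrust score up.
    return mahalanobis * (1.0 - cosine_sim)


def sequential_mistrust(embeddings: list, window: int = 32) -> float:
    """Non-parametric sliding-window score: deviation of the recent correlation
    structure among latent representations from the preceding window."""
    if len(embeddings) < 2 * window:
        return 0.0
    recent = np.corrcoef(np.stack(embeddings[-window:]))
    previous = np.corrcoef(np.stack(embeddings[-2 * window:-window]))
    # Mean absolute change in pairwise correlations between consecutive windows.
    return float(np.mean(np.abs(recent - previous)))
```

In deployment, latent_mistrust would be thresholded per incoming sample, while sequential_mistrust would be tracked over the stream of past representations to flag drift.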
Related papers
- Typicalness-Aware Learning for Failure Detection [26.23185979968123]
Deep neural networks (DNNs) often suffer from the overconfidence issue, where incorrect predictions are made with high confidence scores.
We propose a novel approach called Typicalness-Aware Learning (TAL) to address this issue and improve failure detection performance.
arXiv Detail & Related papers (2024-11-04T11:09:47Z)
- Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework [8.572441599469597]
We study high-confidence off-policy evaluation in the context of infinite-horizon Markov decision processes.
The objective is to establish a confidence interval (CI) for the target policy value using only offline data pre-collected from unknown behavior policies.
We show that our algorithm is sample-efficient, error-robust, and provably convergent even in non-linear function approximation settings.
arXiv Detail & Related papers (2023-09-23T06:35:44Z)
- Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive? [93.10694819127608]
We propose a unified evaluation pipeline for forecasting methods with real-world perception inputs.
Our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data.
arXiv Detail & Related papers (2023-06-15T17:03:14Z)
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
- A Trustworthiness Score to Evaluate DNN Predictions [1.5484595752241122]
It is critical for safety during operation to know when deep neural networks' predictions are trustworthy or suspicious.
We introduce the trustworthiness score (TS), a metric that provides a more transparent and effective way of providing confidence in predictions.
We conduct a case study using YOLOv5 on person detection to demonstrate our method and the usage of the TS and SS metrics.
arXiv Detail & Related papers (2023-01-21T00:48:18Z)
- Birds of a Feather Trust Together: Knowing When to Trust a Classifier via Adaptive Neighborhood Aggregation [30.34223543030105]
We show how NeighborAgg can combine a classifier's own predictions with information from neighboring samples via adaptive neighborhood aggregation.
We also extend our approach to the closely related task of mislabel detection and provide a theoretical coverage guarantee to bound the false negative rate.
arXiv Detail & Related papers (2022-11-29T18:43:15Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims to provide reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Generalizability of Adversarial Robustness Under Distribution Shifts [57.767152566761304]
We take a first step towards investigating the interplay between empirical and certified adversarial robustness on the one hand and domain generalization on the other.
We train robust models on multiple domains and evaluate their accuracy and robustness on an unseen domain.
We extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
arXiv Detail & Related papers (2022-09-29T18:25:48Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (see the sketch after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
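As referenced in the Average Thresholded Confidence (ATC) entry above, the following is a minimal sketch of the thresholded-confidence idea that entry describes: fit a threshold on labeled source data, then report the fraction of unlabeled target examples above it. This is an illustration under that description, not the paper's reference implementation; the function and variable names are hypothetical.

```python
# Hedged sketch of the ATC idea: learn a confidence threshold on source data,
# then predict target accuracy as the fraction of unlabeled target examples
# whose confidence exceeds it.
import numpy as np


def learn_atc_threshold(source_confidences: np.ndarray, source_correct: np.ndarray) -> float:
    """Pick a threshold t on source (validation) data so that the fraction of
    examples with confidence above t matches the source accuracy."""
    accuracy = source_correct.mean()
    # The (1 - accuracy)-quantile leaves roughly an `accuracy` fraction of
    # source confidences above the threshold.
    return float(np.quantile(source_confidences, 1.0 - accuracy))


def predict_target_accuracy(target_confidences: np.ndarray, threshold: float) -> float:
    """Estimate target accuracy as the fraction of unlabeled target examples
    whose confidence exceeds the learned threshold."""
    return float((target_confidences > threshold).mean())
```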
This list is automatically generated from the titles and abstracts of the papers on this site.