Distribution-free risk assessment of regression-based machine learning
algorithms
- URL: http://arxiv.org/abs/2310.03545v1
- Date: Thu, 5 Oct 2023 13:57:24 GMT
- Title: Distribution-free risk assessment of regression-based machine learning
algorithms
- Authors: Sukrita Singh, Neeraj Sarna, Yuanyuan Li, Yang Li, Agni Orfanoudaki,
Michael Berger
- Abstract summary: We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction.
We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability.
- Score: 6.507711025292814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning algorithms have grown in sophistication over the years and
are increasingly deployed for real-life applications. However, when using
machine learning techniques in practical settings, particularly in high-risk
applications such as medicine and engineering, obtaining the failure
probability of the predictive model is critical. We refer to this problem as
the risk-assessment task. We focus on regression algorithms and the
risk-assessment task of computing the probability of the true label lying
inside an interval defined around the model's prediction. We solve the
risk-assessment problem using the conformal prediction approach, which provides
prediction intervals that are guaranteed to contain the true label with a given
probability. Using this coverage property, we prove that our approximated
failure probability is conservative in the sense that it is not lower than the
true failure probability of the ML algorithm. We conduct extensive experiments
to empirically study the accuracy of the proposed method for problems with and
without covariate shift. Our analysis focuses on different modeling regimes,
dataset sizes, and conformal prediction methodologies.
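As a rough illustration of the idea in the abstract, the following is a minimal sketch of a split-conformal failure-probability estimate. It is not the authors' implementation. It assumes exchangeable calibration and test data (no covariate shift) and absolute residuals as the conformity score. The function names `conformal_failure_probability` and `conformal_half_width` and the tolerance `eps` are hypothetical, introduced only for this example. With k of n calibration residuals inside the tolerance, the estimate 1 - k/(n+1) is, on average over calibration draws, not lower than the true failure probability.

```python
import numpy as np


def conformal_failure_probability(cal_residuals, eps):
    """Conservative estimate of P(|y - f(x)| > eps).

    cal_residuals: absolute residuals |y_i - f(x_i)| on a held-out
    calibration set (assumption: exchangeable with the test point).
    eps: half-width of the interval around the model's prediction.
    """
    r = np.asarray(cal_residuals, dtype=float)
    n = r.size
    k = int(np.sum(r <= eps))  # calibration points inside the tolerance
    # The (n + 1) denominator accounts for the unseen test point.
    return 1.0 - k / (n + 1)


def conformal_half_width(cal_residuals, alpha):
    """Split-conformal half-width with marginal coverage >= 1 - alpha."""
    r = np.sort(np.asarray(cal_residuals, dtype=float))
    n = r.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return r[k - 1]


if __name__ == "__main__":
    # Synthetic demo: fit a line on one split, calibrate on another.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=600)
    y = 2.0 * x + rng.normal(scale=0.3, size=x.size)
    x_train, y_train = x[:300], y[:300]
    x_cal, y_cal = x[300:], y[300:]

    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    residuals = np.abs(y_cal - (slope * x_cal + intercept))

    print("failure probability bound for eps=0.5:",
          conformal_failure_probability(residuals, eps=0.5))
    print("half-width for 90% coverage:",
          conformal_half_width(residuals, alpha=0.1))
```

The two functions are inverses of each other in spirit: one fixes the miscoverage level and returns an interval width, the other fixes the width and returns a miscoverage bound. Under covariate shift, the residuals would need to be reweighted, which this sketch does not attempt.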
Related papers
- Beyond the Norms: Detecting Prediction Errors in Regression Models [26.178065248948773]
This paper tackles the challenge of detecting unreliable behavior in regression algorithms.
We introduce the notion of unreliability in regression, that is, when the output of the regressor exceeds a specified discrepancy (or error).
We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches.
arXiv Detail & Related papers (2024-06-11T05:51:44Z)
- Lightweight, Uncertainty-Aware Conformalized Visual Odometry [2.429910016019183]
Data-driven visual odometry (VO) is a critical subroutine for autonomous edge robotics.
Emerging edge robotics devices like insect-scale drones and surgical robots lack a computationally efficient framework to estimate VO's predictive uncertainties.
This paper presents a novel, lightweight, and statistically robust framework that leverages conformal inference (CI) to extract VO's uncertainty bands.
arXiv Detail & Related papers (2023-03-03T20:37:55Z)
- Uncertainty Estimation based on Geometric Separation [13.588210692213568]
In machine learning, accurately predicting the probability that a specific input is correct is crucial for risk management.
We put forward a novel geometric-based approach for improving uncertainty estimations in machine learning models.
arXiv Detail & Related papers (2023-01-11T13:19:24Z)
- A Geometric Method for Improved Uncertainty Estimation in Real-time [13.588210692213568]
Post-hoc model calibrations can improve models' uncertainty estimations without the need for retraining.
Our work puts forward a geometric-based approach for uncertainty estimation.
We show that our method yields better uncertainty estimations than recently proposed approaches.
arXiv Detail & Related papers (2022-06-23T09:18:05Z)
- Automated Learning of Interpretable Models with Quantified Uncertainty [0.0]
We introduce a new framework for genetic-programming-based symbolic regression (GPSR).
GPSR uses model evidence to formulate replacement probability during the selection phase of evolution.
It is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation.
arXiv Detail & Related papers (2022-04-12T19:56:42Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z)