Towards optimally abstaining from prediction
- URL: http://arxiv.org/abs/2105.14119v1
- Date: Fri, 28 May 2021 21:44:48 GMT
- Title: Towards optimally abstaining from prediction
- Authors: Adam Tauman Kalai, Varun Kanade
- Abstract summary: A common challenge across all areas of machine learning is that training data is not distributed like test data.
We consider a model where one may abstain from predicting, at a fixed cost.
Our work builds on a recent abstention algorithm of Goldwasser, Kalai, and Montasser (2020) for transductive binary classification.
- Score: 22.937799541125607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A common challenge across all areas of machine learning is that training data
is not distributed like test data, due to natural shifts, "blind spots," or
adversarial examples. We consider a model where one may abstain from
predicting, at a fixed cost. In particular, our transductive abstention
algorithm takes labeled training examples and unlabeled test examples as input,
and provides predictions with optimal prediction loss guarantees. The loss
bounds match standard generalization bounds when test examples are i.i.d. from
the training distribution, but add an additional term that is the cost of
abstaining times the statistical distance between the train and test
distribution (or the fraction of adversarial examples). For linear regression,
we give a polynomial-time algorithm based on Celis-Dennis-Tapia optimization
algorithms. For binary classification, we show how to efficiently implement it
using a proper agnostic learner (i.e., an Empirical Risk Minimizer) for the
class of interest. Our work builds on a recent abstention algorithm of
Goldwasser, Kalai, and Montasser (2020) for transductive binary
classification.
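To make the abstention loss model concrete, below is a minimal, hypothetical sketch: a classifier may abstain on any test point at a fixed cost, and the realized test loss is the 0/1 error on predicted points plus the abstention cost on abstained points. The distance-based abstention rule and the names `abstention_loss`, `distance_based_abstention`, and `abstain_cost` are illustrative assumptions, not the paper's transductive algorithm (which relies on Celis-Dennis-Tapia optimization for regression and a proper agnostic learner for classification). The paper's guarantee is that this loss is bounded by a standard generalization bound plus the abstention cost times the statistical distance between train and test distributions.

```python
import numpy as np

# Hypothetical illustration of the abstention loss model from the abstract:
# each abstention costs a fixed amount, and prediction errors cost 1 (0/1 loss).
# The simple distance-based abstention rule is a stand-in for exposition only;
# it is NOT the paper's transductive algorithm.

def abstention_loss(y_true, y_pred, abstained, abstain_cost=0.3):
    """Average test loss: 0/1 error where we predict, fixed cost where we abstain."""
    per_point = np.where(abstained, abstain_cost, (y_pred != y_true).astype(float))
    return per_point.mean()

def distance_based_abstention(X_train, X_test, radius=1.5):
    """Abstain on test points whose nearest training point is farther than `radius`."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    return dists.min(axis=1) > radius

# Toy usage: a shifted cluster of test points plays the role of a "blind spot".
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
X_test = np.vstack([rng.normal(size=(80, 2)),            # i.i.d. with training data
                    rng.normal(loc=5.0, size=(20, 2))])   # shifted, unseen region
y_test = (X_test[:, 0] > 0).astype(int)
y_pred = (X_test[:, 0] > 0).astype(int)                   # stand-in classifier

abstained = distance_based_abstention(X_train, X_test)
print("fraction abstained:", abstained.mean())
print("test loss with abstention:", abstention_loss(y_test, y_pred, abstained))
```

When the test points are i.i.d. from the training distribution, almost nothing is abstained on and the loss reduces to the usual 0/1 error, matching the abstract's claim that the bound degrades only by the abstention cost times the fraction of shifted (or adversarial) test points.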
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Partial-Label Learning with a Reject Option [3.1201323892302444]
We propose a novel partial-label learning algorithm with a reject option, that is, the algorithm can reject uncertain predictions.
Our method provides the best trade-off between the number and accuracy of non-rejected predictions when compared to our competitors.
arXiv Detail & Related papers (2024-02-01T13:41:44Z) - Intersection of Parallels as an Early Stopping Criterion [64.8387564654474]
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within the function class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z) - Coherent False Seizure Prediction in Epilepsy, Coincidence or Providence? [0.2770822269241973]
Seizure forecasting using machine learning is possible, but the performance is far from ideal.
Here, we examine false and missed alarms of two algorithms on long-term datasets.
arXiv Detail & Related papers (2021-10-26T10:25:14Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in how it selects unlabeled data during training.
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - Robust Fairness-aware Learning Under Sample Selection Bias [17.09665420515772]
We propose a framework for robust and fair learning under sample selection bias.
We develop two algorithms to handle sample selection bias when test data is both available and unavailable.
arXiv Detail & Related papers (2021-05-24T23:23:36Z) - Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing [0.8312466807725921]
We propose a semi-supervised machine learning training strategy to improve event detection performance on sequential data.
Our method uses noisy guesses of the events' end times to train event detection models.
We show that our strategy outperforms conservative estimates by 12 points of mean average precision for MNIST, and 3.5 points for CIFAR.
arXiv Detail & Related papers (2020-11-28T09:54:44Z) - Model adaptation and unsupervised learning with non-stationary batch
data under smooth concept drift [8.068725688880772]
Most predictive models assume that training and test data are generated from a stationary process.
We consider the scenario of a gradual concept drift due to the underlying non-stationarity of the data source.
We propose a novel, iterative algorithm for unsupervised adaptation of predictive models.
arXiv Detail & Related papers (2020-02-10T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.