Towards optimally abstaining from prediction
- URL: http://arxiv.org/abs/2105.14119v1
- Date: Fri, 28 May 2021 21:44:48 GMT
- Title: Towards optimally abstaining from prediction
- Authors: Adam Tauman Kalai, Varun Kanade
- Abstract summary: A common challenge across all areas of machine learning is that training data is not distributed like test data.
We consider a model where one may abstain from predicting, at a fixed cost.
Our work builds on a recent abstention algorithm of Goldwasser, Kalai, and Montasser (2020) for transductive binary classification.
- Score: 22.937799541125607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A common challenge across all areas of machine learning is that training data
is not distributed like test data, due to natural shifts, "blind spots," or
adversarial examples. We consider a model where one may abstain from
predicting, at a fixed cost. In particular, our transductive abstention
algorithm takes labeled training examples and unlabeled test examples as input,
and provides predictions with optimal prediction loss guarantees. The loss
bounds match standard generalization bounds when test examples are i.i.d. from
the training distribution, but add an additional term that is the cost of
abstaining times the statistical distance between the train and test
distribution (or the fraction of adversarial examples). For linear regression,
we give a polynomial-time algorithm based on Celis-Dennis-Tapia optimization
algorithms. For binary classification, we show how to efficiently implement it
using a proper agnostic learner (i.e., an Empirical Risk Minimizer) for the
class of interest. Our work builds on a recent abstention algorithm of
Goldwasser, Kalai, and Montasser (2020) for transductive binary
classification.
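To make the abstention loss model concrete, below is a minimal, hypothetical sketch: a classifier may abstain on any test point at a fixed cost, and the realized test loss is the 0/1 error on predicted points plus the abstention cost on abstained points. The distance-based abstention rule and the names `abstention_loss`, `distance_based_abstention`, and `abstain_cost` are illustrative assumptions, not the paper's transductive algorithm (which relies on Celis-Dennis-Tapia optimization for regression and a proper agnostic learner for classification). The paper's guarantee is that this loss is bounded by a standard generalization bound plus the abstention cost times the statistical distance between train and test distributions.

```python
import numpy as np

# Hypothetical illustration of the abstention loss model from the abstract:
# each abstention costs a fixed amount, and prediction errors cost 1 (0/1 loss).
# The simple distance-based abstention rule is a stand-in for exposition only;
# it is NOT the paper's transductive algorithm.

def abstention_loss(y_true, y_pred, abstained, abstain_cost=0.3):
    """Average test loss: 0/1 error where we predict, fixed cost where we abstain."""
    per_point = np.where(abstained, abstain_cost, (y_pred != y_true).astype(float))
    return per_point.mean()

def distance_based_abstention(X_train, X_test, radius=1.5):
    """Abstain on test points whose nearest training point is farther than `radius`."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    return dists.min(axis=1) > radius

# Toy usage: a shifted cluster of test points plays the role of a "blind spot".
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
X_test = np.vstack([rng.normal(size=(80, 2)),            # i.i.d. with training data
                    rng.normal(loc=5.0, size=(20, 2))])   # shifted, unseen region
y_test = (X_test[:, 0] > 0).astype(int)
y_pred = (X_test[:, 0] > 0).astype(int)                   # stand-in classifier

abstained = distance_based_abstention(X_train, X_test)
print("fraction abstained:", abstained.mean())
print("test loss with abstention:", abstention_loss(y_test, y_pred, abstained))
```

When the test points are i.i.d. from the training distribution, almost nothing is abstained on and the loss reduces to the usual 0/1 error, matching the abstract's claim that the bound degrades only by the abstention cost times the fraction of shifted (or adversarial) test points.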
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Partial-Label Learning with a Reject Option [3.1201323892302444]
We propose a novel partial-label learning algorithm with a reject option, that is, the algorithm can reject uncertain predictions.
Our method provides the best trade-off between the number and accuracy of non-rejected predictions when compared to our competitors.
arXiv Detail & Related papers (2024-02-01T13:41:44Z) - Intersection of Parallels as an Early Stopping Criterion [64.8387564654474]
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within the function class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z) - Coherent False Seizure Prediction in Epilepsy, Coincidence or Providence? [0.2770822269241973]
Seizure forecasting using machine learning is possible, but the performance is far from ideal.
Here, we examine false and missed alarms of two algorithms on long-term datasets.
arXiv Detail & Related papers (2021-10-26T10:25:14Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in how it selects unlabeled data during training.
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - Robust Fairness-aware Learning Under Sample Selection Bias [17.09665420515772]
We propose a framework for robust and fair learning under sample selection bias.
We develop two algorithms to handle sample selection bias when test data is both available and unavailable.
arXiv Detail & Related papers (2021-05-24T23:23:36Z) - Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing [0.8312466807725921]
We propose a semi-supervised machine learning training strategy to improve event detection performance on sequential data.
Our method uses noisy guesses of the events' end times to train event detection models.
We show that our strategy outperforms conservative estimates by 12 points of mean average precision for MNIST, and 3.5 points for CIFAR.
arXiv Detail & Related papers (2020-11-28T09:54:44Z) - Model adaptation and unsupervised learning with non-stationary batch
data under smooth concept drift [8.068725688880772]
Most predictive models assume that training and test data are generated from a stationary process.
We consider the scenario of a gradual concept drift due to the underlying non-stationarity of the data source.
We propose a novel, iterative algorithm for unsupervised adaptation of predictive models.
arXiv Detail & Related papers (2020-02-10T21:29:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.