Omnipredictors
- URL: http://arxiv.org/abs/2109.05389v1
- Date: Sat, 11 Sep 2021 23:28:49 GMT
- Title: Omnipredictors
- Authors: Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan,
Udi Wieder
- Abstract summary: Loss minimization is a dominant paradigm in machine learning.
We introduce the notion of an $(\mathcal{L},\mathcal{C})$-omnipredictor, which could be used to optimize any loss in a family.
We show that such "loss-oblivious" learning is feasible through a connection to multicalibration.
- Score: 19.735769148626588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Loss minimization is a dominant paradigm in machine learning, where a
predictor is trained to minimize some loss function that depends on an
uncertain event (e.g., "will it rain tomorrow?"). Different loss functions
imply different learning algorithms and, at times, very different predictors.
While widespread and appealing, a clear drawback of this approach is that the
loss function may not be known at the time of learning, requiring the algorithm
to use a best-guess loss function. We suggest a rigorous new paradigm for loss
minimization in machine learning where the loss function can be ignored at the
time of learning and only be taken into account when deciding an action.
We introduce the notion of an $(\mathcal{L},\mathcal{C})$-omnipredictor,
which could be used to optimize any loss in a family $\mathcal{L}$. Once the
loss function is set, the outputs of the predictor can be post-processed (a
simple univariate data-independent transformation of individual predictions) to
do well compared with any hypothesis from the class $\mathcal{C}$. The
post-processing is essentially what one would perform if the outputs of the
predictor were true probabilities of the uncertain events. In a sense,
omnipredictors extract all the predictive power from the class $\mathcal{C}$,
irrespective of the loss function in $\mathcal{L}$.
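To make the post-processing step concrete, here is a minimal illustrative sketch (not code from the paper; the function name and the finite grid of candidate actions are assumptions for the example). Given a prediction $p$ for a binary event and a loss $\ell(\text{action}, \text{outcome})$, the post-processor simply picks the action that would minimize expected loss if $p$ were the true probability:

```python
import numpy as np

def post_process(p, loss, actions):
    """Illustrative post-processing: treat the prediction p as the true
    probability of the event and pick the expected-loss-minimizing action.

    loss(a, y) is the loss of action a when the binary outcome is y in {0, 1};
    actions is a finite grid of candidate actions (an assumption of this sketch).
    """
    expected = [p * loss(a, 1) + (1 - p) * loss(a, 0) for a in actions]
    return actions[int(np.argmin(expected))]

# Example: under squared loss the chosen action is (approximately) p itself.
squared_loss = lambda a, y: (a - y) ** 2
actions = np.linspace(0.0, 1.0, 101)
print(post_process(0.3, squared_loss, actions))  # prints approximately 0.3
```

Because this map depends only on the loss and on the single prediction, not on any data, it matches the "simple univariate data-independent transformation" described above.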
We show that such "loss-oblivious" learning is feasible through a connection
to multicalibration, a notion introduced in the context of algorithmic
fairness. In addition, we show how multicalibration can be viewed as a solution
concept for agnostic boosting, shedding new light on past results. Finally, we
transfer our insights back to the context of algorithmic fairness by providing
omnipredictors for multi-group loss minimization.
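For orientation, here is a hedged sketch of the two guarantees the abstract connects, in the formulation commonly used in this line of work (the paper's exact parameterization may differ): an omnipredictor $f$ admits, for each loss, a post-processing map that competes with the best hypothesis in $\mathcal{C}$, while multicalibration asks that the predictions be unbiased even when reweighted by any $c \in \mathcal{C}$.

```latex
% Omniprediction (sketch): for every loss \ell \in \mathcal{L} there is a
% post-processing map k_\ell such that
\[
  \mathbb{E}\bigl[\ell\bigl(k_\ell(f(x)),\, y\bigr)\bigr]
  \;\le\; \min_{c \in \mathcal{C}} \mathbb{E}\bigl[\ell(c(x),\, y)\bigr] + \varepsilon .
\]
% Multicalibration (approximate, covariance form): for every c \in \mathcal{C}
% and every value v in the range of f,
\[
  \bigl|\, \mathbb{E}\bigl[\, c(x)\,\bigl(y - f(x)\bigr) \;\big|\; f(x) = v \,\bigr] \bigr| \;\le\; \alpha .
\]
```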
Related papers
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms [80.37846867546517]
We show how to improve the training of neural networks with custom objectives.
We exploit second-order information via empirical Fisher and Hessian matrices.
We apply the resulting Newton Losses to eight differentiable algorithms, achieving significant improvements for less-optimized differentiable algorithms.
arXiv Detail & Related papers (2024-10-24T18:02:11Z) - Omnipredictors for Regression and the Approximate Rank of Convex
Functions [7.549720420141907]
An omnipredictor for a class $\mathcal{L}$ of loss functions and a class $\mathcal{C}$ of hypotheses is a predictor whose predictions incur less expected loss than the best hypothesis in $\mathcal{C}$ for every loss in $\mathcal{L}$.
arXiv Detail & Related papers (2024-01-26T04:29:53Z) - Alternate Loss Functions for Classification and Robust Regression Can Improve the Accuracy of Artificial Neural Networks [6.452225158891343]
This paper shows that training speed and final accuracy of neural networks can significantly depend on the loss function used to train neural networks.
Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed.
arXiv Detail & Related papers (2023-03-17T12:52:06Z) - Loss Minimization through the Lens of Outcome Indistinguishability [11.709566373491619]
We present a new perspective on convex loss minimization and the recent notion of Omniprediction.
By design, Loss OI implies omniprediction in a direct and intuitive manner.
We show how to achieve Loss OI for the important set of losses arising from Generalized Linear Models, without requiring full multicalibration.
arXiv Detail & Related papers (2022-10-16T22:25:27Z) - Omnipredictors for Constrained Optimization [5.969079530101132]
We show how to obtain omnipredictors for constrained optimization problems, relying on appropriate variants of multicalibration.
We also investigate the implications of this notion when the constraints used are so-called group fairness notions.
arXiv Detail & Related papers (2022-09-15T17:04:49Z) - Neural Greedy Pursuit for Feature Selection [72.4121881681861]
We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem.
We use neural networks as predictors in the algorithm to compute the loss.
arXiv Detail & Related papers (2022-07-19T16:39:16Z) - Refining neural network predictions using background knowledge [68.35246878394702]
We show we can use logical background knowledge in a learning system to compensate for a lack of labeled training data.
We introduce differentiable refinement functions that find a corrected prediction close to the original prediction.
This algorithm finds optimal refinements on complex SAT formulas in significantly fewer iterations and frequently finds solutions where gradient descent cannot.
arXiv Detail & Related papers (2022-06-10T10:17:59Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Randomized Exploration for Reinforcement Learning with General Value
Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
arXiv Detail & Related papers (2021-06-15T02:23:07Z) - Remember What You Want to Forget: Algorithms for Machine Unlearning [37.656345901739506]
We study the problem of forgetting datapoints from a learnt model.
We provide an unlearning algorithm that can delete up to $O(n/d^{1/4})$ samples.
arXiv Detail & Related papers (2021-03-04T19:28:57Z) - Supervised Learning: No Loss No Cry [51.07683542418145]
Supervised learning requires the specification of a loss function to minimise.
This paper revisits the SLIsotron algorithm of Kakade et al. (2011) through a novel lens.
We show how it provides a principled procedure for learning the loss.
arXiv Detail & Related papers (2020-02-10T05:30:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.