A Convex Loss Function for Set Prediction with Optimal Trade-offs Between Size and Conditional Coverage
- URL: http://arxiv.org/abs/2512.19142v1
- Date: Mon, 22 Dec 2025 08:41:31 GMT
- Title: A Convex Loss Function for Set Prediction with Optimal Trade-offs Between Size and Conditional Coverage
- Authors: Francis Bach
- Abstract summary: We consider supervised learning problems in which set predictions provide explicit uncertainty estimates. We propose a convex loss function for nondecreasing subset-valued functions obtained as level sets of a real-valued function. We show how to naturally obtain sets with optimized conditional coverage.
- Score: 0.554780083433538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider supervised learning problems in which set predictions provide explicit uncertainty estimates. Using Choquet integrals (a.k.a. Lovász extensions), we propose a convex loss function for nondecreasing subset-valued functions obtained as level sets of a real-valued function. This loss function allows optimal trade-offs between conditional probabilistic coverage and the "size" of the set, measured by a non-decreasing submodular function. We also propose several extensions that mimic loss functions and criteria for binary classification with asymmetric losses, and show how to naturally obtain sets with optimized conditional coverage. We derive efficient optimization algorithms, either based on stochastic gradient descent or reweighted least-squares formulations, and illustrate our findings with a series of experiments on synthetic datasets for classification and regression tasks, showing improvements over approaches that aim for marginal coverage.
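For reference, the key object behind the proposed loss is standard: given a set function $F : 2^{\{1,\dots,n\}} \to \mathbb{R}$ with $F(\emptyset) = 0$, its Lovász extension (equivalently, Choquet integral) at $w \in \mathbb{R}^n$ is

$$ f(w) \;=\; \sum_{k=1}^{n} w_{\sigma(k)} \,\bigl[ F(\{\sigma(1),\dots,\sigma(k)\}) - F(\{\sigma(1),\dots,\sigma(k-1)\}) \bigr], $$

where $\sigma$ orders the coordinates so that $w_{\sigma(1)} \ge \cdots \ge w_{\sigma(n)}$. The extension satisfies $f(\mathbf{1}_A) = F(A)$ and is convex if and only if $F$ is submodular, which is what makes a convex surrogate over level sets $\{\, y : w_y \ge \theta \,\}$ possible. The notation here is ours, not the paper's.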
Related papers
- Convergence of a class of gradient-free optimisation schemes when the objective function is noisy, irregular, or both [4.258175645355975]
We investigate a class of algorithms designed to minimize a potentially non-smooth and noisy objective function. The convergence results are obtained under very weak assumptions on the regularity of the objective function. We illustrate the relevance of these algorithms and our convergence results through a challenging classification example from machine learning.
arXiv Detail & Related papers (2025-12-02T21:05:41Z)
- Split Conformal Prediction in the Function Space with Neural Operators [7.619100818009453]
Conformal prediction offers finite-sample guarantees in finite-dimensional spaces. It does not directly extend to function-valued outputs. This work extends split conformal prediction to function spaces following a two-step method.
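For context, the classic finite-dimensional split conformal procedure being extended can be sketched in a few lines; this is a minimal illustration under our own naming, not the paper's function-space method:

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, X_test, alpha=0.1):
    """Classic split conformal prediction with absolute-residual scores.

    Gives finite-sample marginal coverage >= 1 - alpha when calibration
    and test points are exchangeable.
    """
    # Nonconformity scores on a held-out calibration set.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Finite-sample-corrected empirical quantile level, clipped to 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    preds = model.predict(X_test)
    return preds - q, preds + q  # interval endpoints per test point
```

Any regressor with a `predict` method works here; the function-space version replaces the scalar residual with a nonconformity score on function-valued outputs.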
arXiv Detail & Related papers (2025-09-04T19:12:04Z)
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
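The abstract does not spell out the estimator, but zeroth-order methods of this kind are typically built on two-point randomized gradient estimates; here is a minimal, non-accelerated sketch under our own naming:

```python
import numpy as np

def zeroth_order_sgd(f, x0, steps=1000, lr=1e-2, tau=1e-3, seed=None):
    """Plain zeroth-order SGD with a two-point gradient estimator.

    The objective f is accessed only through (possibly noisy) function
    values; the smoothing radius tau trades bias against noise.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    for _ in range(steps):
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)  # random direction on the unit sphere
        # Directional finite difference, rescaled by dimension d.
        g = d * (f(x + tau * e) - f(x - tau * e)) / (2 * tau) * e
        x -= lr * g
    return x
```

The accelerated variants in the paper add momentum and exploit high-order smoothness; this sketch only shows the oracle model.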
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Off-policy estimation with adaptively collected data: the power of online learning [20.023469636707635]
We consider estimation of a linear functional of the treatment effect using adaptively collected data.
We propose a general reduction scheme that allows one to produce a sequence of estimates for the treatment effect via online learning.
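As a rough illustration of the underlying estimand (not the paper's online-learning reduction), an inverse-propensity-weighted estimate of a treatment mean from adaptively collected data, with the logging propensities assumed known, can be written as:

```python
import numpy as np

def ipw_treatment_mean(actions, rewards, propensities):
    """IPW estimate of E[Y(1)] from logged (possibly adaptive) data.

    actions[t] in {0, 1}; propensities[t] = P(A_t = 1 | history) under
    the logging policy. The estimate stays unbiased under adaptive
    collection because each propensity is measurable w.r.t. the past.
    """
    actions = np.asarray(actions, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    return np.mean(actions * rewards / propensities)
```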
arXiv Detail & Related papers (2024-11-19T10:18:27Z) - Semiparametric conformal prediction [79.6147286161434]
We construct a conformal prediction set accounting for the joint correlation structure of the vector-valued non-conformity scores. We flexibly estimate the joint cumulative distribution function (CDF) of the scores. Our method yields desired coverage and competitive efficiency on a range of real-world regression problems.
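The exact construction is in the paper; as a heavily simplified stand-in, one way to scalarize vector-valued scores through estimated CDFs (here, marginal empirical CDFs plus a max, which ignores the joint correlation the paper models) is:

```python
import numpy as np

def max_rank_score(cal_scores, test_scores):
    """Aggregate vector-valued nonconformity scores into scalars.

    Each coordinate is mapped through its empirical marginal CDF on the
    calibration set, then the maximum over coordinates is taken; a
    joint CDF estimate could replace the marginal ranks.
    """
    cal = np.asarray(cal_scores, dtype=float)    # shape (n_cal, d)
    test = np.asarray(test_scores, dtype=float)  # shape (n_test, d)
    # Empirical marginal CDFs evaluated at the test scores.
    u = (test[:, None, :] >= cal[None, :, :]).mean(axis=1)  # (n_test, d)
    return u.max(axis=1)  # one scalar score per test point
```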
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - Nuisance Function Tuning and Sample Splitting for Optimal Doubly Robust Estimation [5.018363990542611]
We consider the problem of how to estimate nuisance functions to obtain optimal rates of convergence for a doubly robust nonparametric functional.
We show that plug-in and first-order bias-corrected estimators can achieve minimax rates of convergence across all Hölder smoothness classes of the nuisance functions.
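For concreteness, the textbook example of such a functional is the counterfactual mean $\mathbb{E}[Y(1)]$ estimated with the doubly robust (AIPW) estimator, whose two nuisance functions are the outcome regression $\hat\mu$ and the propensity score $\hat\pi$ (standard notation, not specific to this paper):

$$ \hat\psi \;=\; \frac{1}{n} \sum_{i=1}^{n} \left[ \hat\mu(X_i) + \frac{A_i \bigl(Y_i - \hat\mu(X_i)\bigr)}{\hat\pi(X_i)} \right]. $$

Its bias scales with the product of the two nuisance errors, which is why how the nuisances are tuned and whether the sample is split directly determine the attainable rate.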
arXiv Detail & Related papers (2022-12-30T18:17:06Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
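In standard form (notation ours), the decision-theoretic entropy of a distribution $p$ with respect to a loss $\ell$ and action set $\mathcal{A}$ is the Bayes risk

$$ H_{\ell,\mathcal{A}}[p] \;=\; \inf_{a \in \mathcal{A}} \; \mathbb{E}_{x \sim p}\bigl[\ell(x, a)\bigr], $$

i.e., the expected loss of the best single action against $p$. Taking $\ell(x, a) = -\log a(x)$ with $\mathcal{A}$ the set of densities recovers Shannon entropy, and acquisition functions arise by choosing queries that maximally reduce $H_{\ell,\mathcal{A}}$ in expectation.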
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
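The paper's algorithm is more general, but a common ingredient in differentiable conformal training is a sigmoid relaxation of the hard coverage indicator; a minimal sketch with hypothetical names:

```python
import torch

def soft_coverage_loss(true_scores, threshold, target=0.9,
                       temp=0.1, size_weight=1.0):
    """Differentiable surrogate trading coverage against set size.

    true_scores: nonconformity scores of the true labels (batch,);
    threshold: learnable scalar defining the set {y : score(y) <= t}.
    The sigmoid replaces the indicator 1[score <= threshold] so that
    gradients flow to both the model and the threshold.
    """
    soft_cover = torch.sigmoid((threshold - true_scores) / temp).mean()
    # Penalize coverage shortfall below the target level.
    shortfall = torch.clamp(target - soft_cover, min=0.0) ** 2
    # Larger threshold -> larger sets, used here as a crude size proxy.
    return size_weight * threshold + shortfall
```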
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random-order online optimization setting of Garber et al. (2020), where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order. We show that the algorithms of Garber et al. (2020) achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - Risk Guarantees for End-to-End Prediction and Optimization Processes [0.0]
We study conditions that allow us to explicitly describe how the prediction performance governs the optimization performance.
We derive the exact theoretical relationship between prediction performance, measured with the squared loss as well as a class of symmetric loss functions, and the subsequent optimization performance.
arXiv Detail & Related papers (2020-12-30T05:20:26Z) - Mixability of Integral Losses: a Key to Efficient Online Aggregation of Functional and Probabilistic Forecasts [72.32459441619388]
We adapt basic mixable (and exponentially concave) loss functions to compare functional predictions and prove that these adaptations are also mixable (exp-concave). As an application of our main result, we prove that various loss functions used for probabilistic forecasting are mixable (exp-concave).
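For reference, the underlying definitions are standard (notation ours): a loss $\ell(\gamma,\omega)$ is $\eta$-exponentially concave if $\gamma \mapsto e^{-\eta\,\ell(\gamma,\omega)}$ is concave for every outcome $\omega$, and $\eta$-mixable if for every distribution $\pi$ over predictions there exists a single prediction $\gamma^{*}$ with

$$ \ell(\gamma^{*}, \omega) \;\le\; -\tfrac{1}{\eta} \log \mathbb{E}_{\gamma \sim \pi}\, e^{-\eta\,\ell(\gamma,\omega)} \quad \text{for all } \omega. $$

Exp-concavity implies mixability with the same $\eta$ (take $\gamma^{*} = \mathbb{E}_{\pi}[\gamma]$ and apply Jensen's inequality), and mixability is what lets Vovk's aggregating algorithm guarantee regret at most $(\log N)/\eta$ against $N$ experts.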
arXiv Detail & Related papers (2019-12-15T14:25:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.