Learning to Defer to Multiple Experts: Consistent Surrogate Losses,
Confidence Calibration, and Conformal Ensembles
- URL: http://arxiv.org/abs/2210.16955v1
- Date: Sun, 30 Oct 2022 21:27:29 GMT
- Title: Learning to Defer to Multiple Experts: Consistent Surrogate Losses,
Confidence Calibration, and Conformal Ensembles
- Authors: Rajeev Verma, Daniel Barrej\'on, Eric Nalisnick
- Abstract summary: We study the statistical properties of learning to defer (L2D) to multiple experts.
We address the open problems of deriving a consistent surrogate loss, confidence calibration, and principled ensembling of experts.
- Score: 0.966840768820136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the statistical properties of learning to defer (L2D) to multiple
experts. In particular, we address the open problems of deriving a consistent
surrogate loss, confidence calibration, and principled ensembling of experts.
Firstly, we derive two consistent surrogates -- one based on a softmax
parameterization, the other on a one-vs-all (OvA) parameterization -- that are
analogous to the single expert losses proposed by Mozannar and Sontag (2020)
and Verma and Nalisnick (2022), respectively. We then study the frameworks'
ability to estimate P( m_j = y | x ), the probability that the jth expert will
correctly predict the label for x. Theory shows the softmax-based loss causes
mis-calibration to propagate between the estimates while the OvA-based loss
does not (though in practice, we find there are trade offs). Lastly, we propose
a conformal inference technique that chooses a subset of experts to query when
the system defers. We perform empirical validation on tasks for galaxy, skin
lesion, and hate speech classification.
Related papers
- Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness [106.52630978891054]
We present a taxonomy of uncertainty specific to vision-language AI systems.
We also introduce a new metric confidence-weighted accuracy, that is well correlated with both accuracy and calibration error.
arXiv Detail & Related papers (2024-07-02T04:23:54Z) - Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts [78.3687645289918]
We show that the sigmoid gating function enjoys a higher sample efficiency than the softmax gating for the statistical task of expert estimation.
We find that experts formulated as feed-forward networks with commonly used activation such as ReLU and GELU enjoy faster convergence rates under the sigmoid gating.
arXiv Detail & Related papers (2024-05-22T21:12:34Z) - Regression with Multi-Expert Deferral [30.389055604165222]
Learning to defer with multiple experts is a framework where the learner can choose to defer the prediction to several experts.
We present a novel framework of regression with deferral, which involves deferring the prediction to multiple experts.
We introduce new surrogate loss functions for both scenarios and prove that they are supported by $H$-consistency bounds.
arXiv Detail & Related papers (2024-03-28T15:26:38Z) - In Defense of Softmax Parametrization for Calibrated and Consistent
Learning to Defer [27.025808709031864]
It has been theoretically shown that popular estimators for learning to defer parameterized with softmax provide unbounded estimates for the likelihood of deferring.
We show that the cause of the miscalibrated and unbounded estimator in prior literature is due to the symmetric nature of the surrogate losses used and not due to softmax.
We propose a novel statistically consistent asymmetric softmax-based surrogate loss that can produce valid estimates without the issue of unboundedness.
arXiv Detail & Related papers (2023-11-02T09:15:52Z) - Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms [30.389055604165222]
We study the key framework of learning with abstention in the multi-class classification setting.
In this setting, the learner can choose to abstain from making a prediction with some pre-defined cost.
We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis set-specific consistency guarantees.
arXiv Detail & Related papers (2023-10-23T10:16:27Z) - Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Loss Minimization through the Lens of Outcome Indistinguishability [11.709566373491619]
We present a new perspective on convex loss and the recent notion of Omniprediction.
By design, Loss OI implies omniprediction in a direct and intuitive manner.
We show that Loss OI for the important set of losses arising from Generalized Models, without requiring full multicalibration.
arXiv Detail & Related papers (2022-10-16T22:25:27Z) - Trustworthy Long-Tailed Classification [41.45744960383575]
We propose a Trustworthy Long-tailed Classification (TLC) method to jointly conduct classification and uncertainty estimation.
Our TLC obtains the evidence-based uncertainty (EvU) and evidence for each expert, and then combines these uncertainties and evidences under the Dempster-Shafer Evidence Theory (DST)
The experimental results show that the proposed TLC outperforms the state-of-the-art methods and is trustworthy with reliable uncertainty.
arXiv Detail & Related papers (2021-11-17T10:52:36Z) - Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
Deep learning to inputs perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robust robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z) - Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.