Regression with Multi-Expert Deferral
- URL: http://arxiv.org/abs/2403.19494v1
- Date: Thu, 28 Mar 2024 15:26:38 GMT
- Title: Regression with Multi-Expert Deferral
- Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong
- Abstract summary: Learning to defer with multiple experts is a framework where the learner can choose to defer the prediction to several experts.
We present a novel framework of regression with deferral, which involves deferring the prediction to multiple experts.
We introduce new surrogate loss functions for both scenarios and prove that they are supported by $H$-consistency bounds.
- Score: 30.389055604165222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to defer with multiple experts is a framework where the learner can choose to defer the prediction to several experts. While this problem has received significant attention in classification contexts, it presents unique challenges in regression due to the infinite and continuous nature of the label space. In this work, we introduce a novel framework of regression with deferral, which involves deferring the prediction to multiple experts. We present a comprehensive analysis for both the single-stage scenario, where there is simultaneous learning of predictor and deferral functions, and the two-stage scenario, which involves a pre-trained predictor with a learned deferral function. We introduce new surrogate loss functions for both scenarios and prove that they are supported by $H$-consistency bounds. These bounds provide consistency guarantees that are stronger than Bayes consistency, as they are non-asymptotic and hypothesis set-specific. Our framework is versatile, applying to multiple experts, accommodating any bounded regression losses, addressing both instance-dependent and label-dependent costs, and supporting both single-stage and two-stage methods. A by-product is that our single-stage formulation includes the recent regression with abstention framework (Cheng et al., 2023) as a special case, where only a single expert, the squared loss and a label-independent cost are considered. Minimizing our proposed loss functions directly leads to novel algorithms for regression with deferral. We report the results of extensive experiments showing the effectiveness of our proposed algorithms.
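To make the setting in the abstract concrete, here is a minimal sketch of a deferral loss for regression with several experts. It is an illustration under stated assumptions, not the paper's exact formulation: the names `bounded_squared_loss`, `deferral_loss`, `scores`, and `base_costs` are hypothetical, the regression loss is a squared error clipped at 1 (one possible bounded loss), and the cost of deferring to an expert is assumed to be that expert's loss plus a fixed base cost (a label-dependent cost).

```python
from typing import Callable, Sequence


def bounded_squared_loss(pred: float, y: float, bound: float = 1.0) -> float:
    """Squared error clipped at `bound`, so the loss stays bounded."""
    return min((pred - y) ** 2, bound)


def deferral_loss(
    x: float,
    y: float,
    predictor: Callable[[float], float],
    experts: Sequence[Callable[[float], float]],
    scores: Callable[[float], Sequence[float]],
    base_costs: Sequence[float],
) -> float:
    """Loss incurred at (x, y) when a deferral function routes the input.

    `scores(x)` returns one score per option: index 0 is the learned
    predictor, index j >= 1 is expert j - 1. The highest score wins.
    Assumption: deferring to an expert costs that expert's bounded loss
    plus a fixed base cost.
    """
    s = scores(x)
    choice = max(range(len(s)), key=lambda j: s[j])
    if choice == 0:
        # No deferral: pay the predictor's own regression loss.
        return bounded_squared_loss(predictor(x), y)
    j = choice - 1
    # Deferral: pay the cost associated with the chosen expert.
    return bounded_squared_loss(experts[j](x), y) + base_costs[j]


# Hypothetical predictor and experts for a target y = 2x.
predictor = lambda x: 2.0 * x
experts = [lambda x: 2.0 * x + 0.5, lambda x: 0.0]
base_costs = [0.25, 0.5]

keep = deferral_loss(1.0, 2.0, predictor, experts, lambda x: [1.0, 0.0, 0.0], base_costs)
defer_first = deferral_loss(1.0, 2.0, predictor, experts, lambda x: [0.0, 1.0, 0.0], base_costs)
defer_second = deferral_loss(1.0, 2.0, predictor, experts, lambda x: [0.0, 0.0, 1.0], base_costs)
```

In the single-stage scenario the predictor and the scoring function would be learned together, while in the two-stage scenario the predictor is fixed and only the scoring function is learned. The argmax inside this loss is not differentiable, which is why surrogate losses such as those the paper proposes are minimized in practice.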
Related papers
- Spectral Representation for Causal Estimation with Hidden Confounders [33.148766692274215]
We address the problem of causal effect estimation where hidden confounders are present.
Our approach uses a singular value decomposition of a conditional expectation operator, followed by a saddle-point optimization problem.
arXiv Detail & Related papers (2024-07-15T05:39:56Z) - Principled Approaches for Learning to Defer with Multiple Experts [30.389055604165222]
We introduce a new family of surrogate losses specifically tailored for the multiple-expert setting.
We prove that these surrogate losses benefit from strong $H$-consistency bounds.
arXiv Detail & Related papers (2023-10-23T10:19:09Z) - Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms [30.389055604165222]
We study the key framework of learning with abstention in the multi-class classification setting.
In this setting, the learner can choose to abstain from making a prediction with some pre-defined cost.
We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis set-specific consistency guarantees.
arXiv Detail & Related papers (2023-10-23T10:16:27Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Learning to Defer to Multiple Experts: Consistent Surrogate Losses, Confidence Calibration, and Conformal Ensembles [0.966840768820136]
We study the statistical properties of learning to defer (L2D) to multiple experts.
We address the open problems of deriving a consistent surrogate loss, confidence calibration, and principled ensembling of experts.
arXiv Detail & Related papers (2022-10-30T21:27:29Z) - Mitigating multiple descents: A model-agnostic framework for risk monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z) - Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
arXiv Detail & Related papers (2020-06-26T12:37:17Z) - Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z) - The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.