Principled Approaches for Learning to Defer with Multiple Experts
- URL: http://arxiv.org/abs/2310.14774v2
- Date: Sun, 31 Mar 2024 09:15:36 GMT
- Title: Principled Approaches for Learning to Defer with Multiple Experts
- Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong
- Abstract summary: We introduce a new family of surrogate losses specifically tailored for the multiple-expert setting.
We prove that these surrogate losses benefit from strong $H$-consistency bounds.
- Score: 30.389055604165222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a study of surrogate losses and algorithms for the general problem of learning to defer with multiple experts. We first introduce a new family of surrogate losses specifically tailored for the multiple-expert setting, where the prediction and deferral functions are learned simultaneously. We then prove that these surrogate losses benefit from strong $H$-consistency bounds. We illustrate the application of our analysis through several examples of practical surrogate losses, for which we give explicit guarantees. These loss functions readily lead to the design of new learning to defer algorithms based on their minimization. While the main focus of this work is a theoretical analysis, we also report the results of several experiments on SVHN and CIFAR-10 datasets.
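For reference, the target loss in this single-stage, score-based setting can be sketched as follows; the notation is a plausible reconstruction, not a verbatim excerpt from the paper. With $n$ classes and $n_e$ experts, a hypothesis $h$ scores the $n + n_e$ labels and predicts the highest-scoring one, and $c_j(x, y)$ denotes the cost of deferring to expert $j$:

```latex
% Multi-expert deferral loss (sketch): misclassification cost when the
% model predicts, expert-specific cost when it defers.
L_{\mathrm{def}}(h, x, y)
  = 1_{h(x) \le n}\, 1_{h(x) \ne y}
  + \sum_{j=1}^{n_e} c_j(x, y)\, 1_{h(x) = n + j}
```

A common choice in this literature is $c_j(x, y) = 1_{g_j(x) \ne y} + \beta_j$, where $g_j$ is expert $j$'s prediction and $\beta_j \ge 0$ a consultation cost; the surrogate losses introduced in the paper stand in for this discontinuous target during training.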
Related papers
- Regression with Multi-Expert Deferral [30.389055604165222]
Learning to defer with multiple experts is a framework where the learner can choose to defer the prediction to several experts.
We present a novel framework of regression with deferral, which involves deferring the prediction to multiple experts.
We introduce new surrogate loss functions for both scenarios and prove that they are supported by $H$-consistency bounds.
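The regression analogue can be sketched in the same way; the notation below is assumed rather than quoted, and per-expert consultation costs are omitted for brevity. With regressor $f$, deferral function $r$ (where $r(x) = 0$ means the model predicts and $r(x) = j$ defers to expert $g_j$), and a base regression loss $\ell$ such as squared error:

```latex
% Regression-with-deferral loss (minimal sketch): pay the model's
% regression loss when predicting, the chosen expert's when deferring.
L(f, r, x, y) = \ell(f(x), y)\, 1_{r(x) = 0}
  + \sum_{j=1}^{n_e} \ell(g_j(x), y)\, 1_{r(x) = j}
```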
arXiv Detail & Related papers (2024-03-28T15:26:38Z)
- Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms [30.389055604165222]
We study the key framework of learning with abstention in the multi-class classification setting.
In this setting, the learner can choose to abstain from making a prediction with some pre-defined cost.
We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis set-specific consistency guarantees.
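For concreteness, the abstention loss in the predictor-rejector framework standardly takes the form below, with predictor $h$, rejector $r$ (abstain when $r(x) \le 0$), and abstention cost $c \in (0, 1)$; this is the usual formulation in this line of work rather than a quote from the paper:

```latex
% Predictor-rejector abstention loss (standard formulation, sketched).
L_{\mathrm{abst}}(h, r, x, y) = 1_{h(x) \ne y}\, 1_{r(x) > 0} + c\, 1_{r(x) \le 0}
```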
arXiv Detail & Related papers (2023-10-23T10:16:27Z)
- Theoretically Grounded Loss Functions and Algorithms for Score-Based Multi-Class Abstention [30.389055604165222]
We introduce new families of surrogate losses for the abstention loss function.
We prove strong non-asymptotic and hypothesis set-specific consistency guarantees for these surrogate losses.
Our results show that the relative performance of the state-of-the-art score-based surrogate losses can vary across datasets.
arXiv Detail & Related papers (2023-10-23T10:13:35Z)
- A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
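Logit adjustment itself is easy to state in code. The sketch below illustrates the kind of modified loss the analysis covers; it is a generic illustration, not an implementation from the paper, and the temperature `tau` and the use of empirical class priors are assumptions:

```python
import numpy as np

def logit_adjusted_cross_entropy(logits, label, class_priors, tau=1.0):
    """Cross-entropy after shifting each logit by tau * log(prior),
    which discourages under-prediction of rare classes."""
    adjusted = logits + tau * np.log(class_priors)
    adjusted = adjusted - adjusted.max()              # numerical stability
    log_probs = adjusted - np.log(np.exp(adjusted).sum())
    return -log_probs[label]

# Example: a long-tailed three-class problem with class 2 rare.
priors = np.array([0.7, 0.2, 0.1])
print(logit_adjusted_cross_entropy(np.array([2.0, 0.5, 0.1]), 2, priors))
```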
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The fragility of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z)
- Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we make an early attempt at the problem of learning multiclass scoring functions by directly optimizing multiclass AUC metrics.
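One common multiclass extension is the macro-averaged one-vs-rest AUC, sketched below; treat it as illustrative, since the paper analyzes a family of such metrics rather than this single definition:

```python
import numpy as np

def one_vs_rest_auc(scores, labels, cls):
    """AUC of class `cls` against the rest: the probability that a random
    example of `cls` receives a higher class-`cls` score than a random
    example of any other class (ties count as one half)."""
    pos = scores[labels == cls, cls]
    neg = scores[labels != cls, cls]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def macro_multiclass_auc(scores, labels):
    """Average the one-vs-rest AUCs over all classes present in `labels`."""
    return float(np.mean([one_vs_rest_auc(scores, labels, c) for c in np.unique(labels)]))

# Three examples, three classes, each correctly top-scored: AUC = 1.0.
scores = np.array([[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]])
labels = np.array([0, 1, 2])
print(macro_multiclass_auc(scores, labels))
```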
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)
- A Theory of Multiple-Source Adaptation with Limited Target Labeled Data [66.53679520072978]
We show that a new family of algorithms based on model selection ideas benefits from very favorable guarantees in this scenario.
We also report the results of several experiments with our algorithms that demonstrate their practical effectiveness.
arXiv Detail & Related papers (2020-07-19T19:34:48Z)
- Consistent Estimators for Learning to Defer to an Expert [5.076419064097734]
We show how to learn predictors that can either predict or choose to defer the decision to a downstream expert.
We show the effectiveness of our approach on a variety of experimental tasks.
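At inference time such a model reduces to a simple decision rule; the convention below, with the deferral score stored as the last entry, is an assumption made for illustration:

```python
import numpy as np

def predict_or_defer(scores):
    """Given K class scores followed by one deferral score, return the
    predicted class index, or 'defer' when the deferral score is largest."""
    top = int(np.argmax(scores))
    return "defer" if top == len(scores) - 1 else top

print(predict_or_defer(np.array([0.2, 1.3, 0.4, 2.1])))  # -> defer
```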
arXiv Detail & Related papers (2020-06-02T18:21:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.