Two-stage Learning-to-Defer for Multi-Task Learning
- URL: http://arxiv.org/abs/2410.15729v2
- Date: Mon, 11 Nov 2024 09:15:21 GMT
- Title: Two-stage Learning-to-Defer for Multi-Task Learning
- Authors: Yannis Montreuil, Shu Heng Yeo, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi
- Abstract summary: We introduce a Learning-to-Defer approach for multi-task learning that encompasses both classification and regression tasks.
Our two-stage approach uses a rejector that defers decisions to the most accurate agent among a pre-trained joint classifier-regressor model and one or more external experts.
- Score: 3.4289478404209826
- Abstract: The Learning-to-Defer approach has been explored for classification and, more recently, regression tasks separately. Many contemporary learning tasks, however, involve both classification and regression components. In this paper, we introduce a Learning-to-Defer approach for multi-task learning that encompasses both classification and regression tasks. Our two-stage approach uses a rejector that defers decisions to the most accurate agent among a pre-trained joint classifier-regressor model and one or more external experts. We show that our surrogate loss is both $(\mathcal{H}, \mathcal{F}, \mathcal{R})$-consistent and Bayes-consistent, ensuring an effective approximation of the optimal solution. Additionally, we derive learning bounds that demonstrate the benefits of employing multiple confident experts alongside a rich model in a two-stage learning framework. Empirical experiments conducted on electronic health record analysis tasks underscore the performance enhancements achieved through our method.
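The routing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's surrogate loss or training procedure. It only shows how a per-agent cost could combine a classification error and a regression error, and how a rejector's scores could pick one agent at inference. All names here (`Agent`, `agent_cost`, `best_agent`, `defer`, `consult_cost`, `reg_weight`) are hypothetical, and the cost formula is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Prediction = Tuple[int, float]  # (class label, regression value)


@dataclass
class Agent:
    name: str
    predict: Callable[[Sequence[float]], Prediction]
    consult_cost: float = 0.0  # hypothetical fee for querying this agent


def agent_cost(pred: Prediction, target: Prediction,
               consult_cost: float, reg_weight: float = 1.0) -> float:
    """Assumed cost: 0-1 classification error + weighted squared error + fee."""
    (y_hat, t_hat), (y, t) = pred, target
    return float(y_hat != y) + reg_weight * (t_hat - t) ** 2 + consult_cost


def best_agent(agents: List[Agent], x: Sequence[float],
               target: Prediction) -> int:
    """Index of the lowest-cost agent on a labelled example.

    In a two-stage setup the agents are fixed, so this index could serve
    as a training signal for the rejector.
    """
    costs = [agent_cost(a.predict(x), target, a.consult_cost) for a in agents]
    return min(range(len(agents)), key=costs.__getitem__)


def defer(rejector_scores: Sequence[float], agents: List[Agent],
          x: Sequence[float]) -> Tuple[str, Prediction]:
    """At inference, route x to the agent with the highest rejector score."""
    j = max(range(len(agents)), key=rejector_scores.__getitem__)
    return agents[j].name, agents[j].predict(x)


# Toy agents with constant predictions (stand-ins for a model and an expert).
model = Agent("model", lambda x: (0, 1.0))
expert = Agent("expert", lambda x: (1, 2.5), consult_cost=0.2)
agents = [model, expert]

x, target = [0.3, 0.7], (1, 2.0)
chosen = best_agent(agents, x, target)
routed = defer([0.1, 0.9], agents, x)
print(agents[chosen].name, routed)
```

In this toy case the model misclassifies and misses the regression target by 1.0 (cost 2.0), while the expert is correct on the class, off by 0.5 on the regression value, and charges 0.2 (cost 0.45), so the expert is the cheaper agent.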
Related papers
- Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques [65.55451717632317]
We study Multi-Agent Reinforcement Learning from Human Feedback (MARLHF), exploring both theoretical foundations and empirical validations.
We define the task as identifying Nash equilibrium from a preference-only offline dataset in general-sum games.
Our findings underscore the multifaceted approach required for MARLHF, paving the way for effective preference-based multi-agent systems.
arXiv Detail & Related papers (2024-09-01T13:14:41Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance.
We develop a framework that permits data-dependent weighted cross-entropy losses.
Our method outperforms both in multiple evaluation metrics on ImageNet-1K.
arXiv Detail & Related papers (2022-11-18T02:00:17Z) - Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Stochastic Approach [38.76462300149459]
We develop a Multi-objective Correction (MoCo) method for multi-objective gradient optimization.
The unique feature of our method is that it can guarantee convergence without increasing the non fairness gradient.
arXiv Detail & Related papers (2022-10-23T05:54:26Z) - Consistency-Based Semi-supervised Evidential Active Learning for Diagnostic Radiograph Classification [2.3545156585418328]
We introduce a novel Consistency-based Semi-supervised Evidential Active Learning framework (CSEAL).
We leverage predictive uncertainty based on theories of evidence and subjective logic to develop an end-to-end integrated approach.
Our approach can substantially improve accuracy on rarer abnormalities with fewer labelled samples.
arXiv Detail & Related papers (2022-09-05T09:28:31Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in continual learning is catastrophically forgetting previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named learning Invariant Representation for Continual Learning (IRCL).
Disentangling the shared invariant representation helps to learn continually a sequence of tasks, while being more robust to forgetting and having better knowledge transfer.
arXiv Detail & Related papers (2021-01-15T15:12:51Z) - Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification [106.08067870620218]
We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME).
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-06T12:57:36Z)