Task-conditioned Ensemble of Expert Models for Continuous Learning
- URL: http://arxiv.org/abs/2504.08626v2
- Date: Mon, 14 Apr 2025 20:37:11 GMT
- Title: Task-conditioned Ensemble of Expert Models for Continuous Learning
- Authors: Renu Sharma, Debasmita Pal, Arun Ross
- Abstract summary: We propose a task-conditioned ensemble of models to maintain the performance of the existing model. The method involves an ensemble of expert models based on task membership information. Experiments highlight the benefits of the proposed method.
- Score: 9.973727349235261
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: One of the major challenges in machine learning is maintaining the accuracy of the deployed model (e.g., a classifier) in a non-stationary environment. The non-stationary environment results in distribution shifts and, consequently, a degradation in accuracy. Continuous learning of the deployed model with new data could be one remedy. However, the question arises as to how we should update the model with new training data so that it retains its accuracy on the old data while adapting to the new data. In this work, we propose a task-conditioned ensemble of models to maintain the performance of the existing model. The method involves an ensemble of expert models based on task membership information. The in-domain models, which are based on the local outlier concept and are distinct from the expert models, provide task membership information dynamically at run-time for each probe sample. To evaluate the proposed method, we experiment with three setups: the first represents distribution shift between tasks (LivDet-Iris-2017), the second represents distribution shift both between and within tasks (LivDet-Iris-2020), and the third represents disjoint distributions between tasks (Split MNIST). The experiments highlight the benefits of the proposed method. The source code is available at https://github.com/iPRoBe-lab/Continuous_Learning_FE_DM.
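A minimal sketch of the task-conditioned routing described in the abstract, written as an illustration rather than the authors' implementation: one in-domain novelty detector per task (here scikit-learn's LocalOutlierFactor, standing in for the paper's local-outlier-based in-domain models) scores how well a probe sample matches each task, and those scores weight the predictions of the corresponding expert classifiers. The class name, the softmax weighting, and the LogisticRegression experts are assumptions; the released repository above is the authoritative reference.

```python
# Illustrative sketch only (not the authors' released code): task-conditioned
# routing of a probe sample to expert models via per-task in-domain scores.
# Assumes all tasks share one label space (e.g., bonafide vs. presentation attack).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import LocalOutlierFactor


class TaskConditionedEnsemble:
    def __init__(self):
        self.experts = []    # one expert classifier per task
        self.in_domain = []  # one novelty detector per task (local-outlier based)

    def add_task(self, X, y):
        """Train an expert and an in-domain model on one task's data."""
        self.experts.append(LogisticRegression(max_iter=1000).fit(X, y))
        self.in_domain.append(
            LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X)
        )

    def predict_proba(self, X):
        """Weight each expert's prediction by the probe's task-membership score."""
        # score_samples: higher values mean the probe looks more in-domain.
        scores = np.stack([d.score_samples(X) for d in self.in_domain], axis=1)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))  # softmax weights
        w /= w.sum(axis=1, keepdims=True)
        probs = np.stack([e.predict_proba(X) for e in self.experts], axis=1)
        return (w[:, :, None] * probs).sum(axis=1)               # (n_samples, n_classes)
```

Hard routing (picking only the expert with the highest membership score) is an equally plausible reading of the abstract; the soft weighting above is just one illustrative choice.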
Related papers
- Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning [51.177789437682954]
Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones. Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown. We propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration.
arXiv Detail & Related papers (2025-02-11T13:57:30Z) - Test-Time Alignment via Hypothesis Reweighting [56.71167047381817]
Large pretrained models often struggle with underspecified tasks. We propose a novel framework to address the challenge of aligning models to test-time user intent.
arXiv Detail & Related papers (2024-12-11T23:02:26Z) - Continual learning with task specialist [2.8830182365988923]
We propose Continual Learning with Task Specialists (CLTS) to address the issues of catastrophic forgetting and limited labelled data.
The model consists of Task Specialists (TS) and a Task Predictor (TP) with a pre-trained Stable Diffusion (SD) module.
A comparison study with four SOTA models conducted on three real-world datasets shows that the proposed model outperforms all the selected baselines.
arXiv Detail & Related papers (2024-09-26T12:59:09Z) - Fairness Hub Technical Briefs: Definition and Detection of Distribution Shift [0.5825410941577593]
Distribution shift is a common situation in machine learning tasks, where the data used for training a model is different from the data the model is applied to in the real world.
This brief focuses on the definition and detection of distribution shifts in educational settings (a minimal detection sketch is given after this related-papers list).
arXiv Detail & Related papers (2024-05-23T05:29:36Z) - Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts [104.9871176044644]
Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training.
We propose a novel MAE-based pre-training paradigm, Mixture of Cluster-conditional Experts (MoCE).
MoCE trains each expert only with semantically relevant images by using cluster-conditional gates.
arXiv Detail & Related papers (2024-02-08T03:46:32Z) - Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z) - Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach [20.86345962679122]
Estimating the transferability of publicly available pretrained models to a target task has become an important problem in transfer learning.
We propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task.
arXiv Detail & Related papers (2023-09-05T17:57:31Z) - Mixture of basis for interpretable continual learning with distribution shifts [1.6114012813668934]
Continual learning in environments with shifting data distributions is a challenging problem with several real-world applications.
We propose a novel approach called Mixture of Basis models (MoB) to address this problem setting.
arXiv Detail & Related papers (2022-01-05T22:53:15Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim to improve data efficiency for both classification and regression setups in deep learning.
To harness the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - Learning Neural Models for Natural Language Processing in the Face of Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications.
It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time.
This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information.
It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
arXiv Detail & Related papers (2021-09-03T14:29:20Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
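Related to the Fairness Hub brief above on defining and detecting distribution shift, the following is a minimal, generic detection sketch (not taken from any of the listed papers): a two-sample Kolmogorov-Smirnov test flags features whose deployment-time distribution differs from the training distribution. The function name, per-feature design, and significance threshold are illustrative assumptions.

```python
# Generic per-feature covariate-shift check with a two-sample KS test.
# Illustrative sketch only; thresholds and the feature-wise design are assumptions.
import numpy as np
from scipy.stats import ks_2samp


def detect_shift(X_train, X_deploy, alpha=0.01):
    """Return indices of features whose deployment distribution differs
    significantly from the training distribution."""
    shifted = []
    for j in range(X_train.shape[1]):
        stat, p_value = ks_2samp(X_train[:, j], X_deploy[:, j])
        if p_value < alpha:  # reject "same distribution" for feature j
            shifted.append(j)
    return shifted


# Example: simulated covariate shift in one feature.
rng = np.random.default_rng(0)
X_old = rng.normal(size=(1000, 3))
X_new = rng.normal(size=(1000, 3))
X_new[:, 1] += 0.5                 # mean shift in feature 1
print(detect_shift(X_old, X_new))  # typically [1]
```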