Class Incremental Learning via Likelihood Ratio Based Task Prediction
- URL: http://arxiv.org/abs/2309.15048v4
- Date: Wed, 13 Mar 2024 14:24:28 GMT
- Title: Class Incremental Learning via Likelihood Ratio Based Task Prediction
- Authors: Haowei Lin, Yijia Shao, Weinan Qian, Ningxin Pan, Yiduo Guo, Bing Liu
- Abstract summary: An emerging theory-guided approach (called TIL+OOD) trains a task-specific model for each task within a single network shared across all tasks.
This paper argues that using a traditional OOD detector for task-id prediction is sub-optimal because additional information available in CIL can be exploited.
We call the new method TPL (Task-id Prediction based on Likelihood Ratio).
It markedly outperforms strong CIL baselines and has negligible catastrophic forgetting.
- Score: 20.145128455767587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Class incremental learning (CIL) is a challenging setting of continual
learning, which learns a series of tasks sequentially. Each task consists of a
set of unique classes. The key feature of CIL is that no task identifier (or
task-id) is provided at test time. Predicting the task-id for each test sample
is a challenging problem. An emerging theory-guided approach (called TIL+OOD)
is to train a task-specific model for each task in a shared network for all
tasks based on a task-incremental learning (TIL) method to deal with
catastrophic forgetting. The model for each task is an out-of-distribution
(OOD) detector rather than a conventional classifier. The OOD detector can
perform both within-task (in-distribution (IND)) class prediction and OOD
detection. The OOD detection capability is the key to task-id prediction during
inference. However, this paper argues that using a traditional OOD detector for
task-id prediction is sub-optimal because additional information (e.g., the
replay data and the learned tasks) available in CIL can be exploited to design
a better and principled method for task-id prediction. We call the new method
TPL (Task-id Prediction based on Likelihood Ratio). TPL markedly outperforms
strong CIL baselines and has negligible catastrophic forgetting. The code of
TPL is publicly available at https://github.com/linhaowei1/TPL.
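Below is a minimal, hypothetical sketch of the inference scheme the abstract describes: each learned task keeps its own OOD-capable model in a shared network, and at test time the task-id is chosen by a likelihood-ratio style score that contrasts a task's in-distribution evidence with evidence from the other tasks, approximated here with replay-buffer features. The interfaces `max_softmax`, `extract_features`, and `classify_within_task` are assumptions for illustration, not the authors' actual TPL API (see the linked repository for the real implementation).

```python
import numpy as np

def task_id_scores(x, task_models, replay_features):
    """Likelihood-ratio style task-id scores for one test sample x.

    task_models[k]    : model trained on task k (assumed interface).
    replay_features[k]: (n_k, d) array of replay-buffer features saved
                        for task k (assumed data layout).
    Assumes at least two learned tasks.
    """
    scores = []
    for k, model in enumerate(task_models):
        # In-distribution evidence from task k's own OOD-capable model,
        # e.g. its maximum softmax probability on x (hypothetical method).
        in_score = model.max_softmax(x)

        # Evidence that x belongs to some *other* task, approximated by
        # similarity of x's features to replay features of tasks j != k.
        feat = model.extract_features(x)  # hypothetical method
        others = np.concatenate(
            [f for j, f in enumerate(replay_features) if j != k]
        )
        out_score = np.exp(-np.linalg.norm(others - feat, axis=1)).mean()

        # Ratio: large when x looks like task k and unlike the rest.
        scores.append(in_score / (out_score + 1e-8))
    return np.array(scores)

def cil_predict(x, task_models, replay_features):
    # Step 1: predict the task-id with the likelihood-ratio scores.
    k = int(np.argmax(task_id_scores(x, task_models, replay_features)))
    # Step 2: predict the class within task k using its TIL model.
    return k, task_models[k].classify_within_task(x)  # hypothetical method
```

In the abstract's terms, `in_score` plays the role of within-task (IND) evidence and `out_score` stands in for the OOD side; TPL's actual scores are derived more carefully from the replay data and the learned tasks, which is what makes the ratio a better task-id predictor than a plain OOD detector.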
Related papers
- Class Incremental Learning with Task-Specific Batch Normalization and Out-of-Distribution Detection [25.224930928724326]
This study focuses on incremental learning for image classification, exploring how to reduce catastrophic forgetting of all learned knowledge when access to old data is restricted due to memory or privacy constraints.
The challenge of incremental learning lies in achieving an optimal balance between plasticity, the ability to learn new knowledge, and stability, the ability to retain old knowledge.
arXiv Detail & Related papers (2024-11-01T07:54:29Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks [101.40633115037983]
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions.
How to select new tasks to improve the performance and generalizability of IT models remains an open question.
We propose active instruction tuning based on prompt uncertainty, a novel framework to identify informative tasks, and then actively tune the models on the selected tasks.
arXiv Detail & Related papers (2023-11-01T04:40:05Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provides the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- A Multi-Head Model for Continual Learning via Out-of-Distribution Replay [16.189891444511755]
Many approaches have been proposed to deal with catastrophic forgetting (CF) in continual learning (CL).
This paper proposes an entirely different approach that builds a separate classifier (head) for each task (called a multi-head model) using a transformer network, called MORE.
arXiv Detail & Related papers (2022-08-20T19:17:12Z)
- Evaluating NLP Systems On a Novel Cloze Task: Judging the Plausibility of Possible Fillers in Instructional Texts [2.3449131636069898]
The cloze task is widely used to evaluate an NLP system's language understanding ability.
A new task is proposed: predicting whether a filler word in a cloze task is a good, neutral, or bad candidate.
arXiv Detail & Related papers (2021-12-03T12:02:52Z)
- TOOD: Task-aligned One-stage Object Detection [41.43371563426291]
One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization.
We propose Task-aligned One-stage Object Detection (TOOD), which explicitly aligns the two tasks in a learning-based manner.
Experiments are conducted on MS-COCO, where TOOD achieves a 51.1 AP at single-model single-scale testing.
arXiv Detail & Related papers (2021-08-17T17:00:01Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)