Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified
Framework
- URL: http://arxiv.org/abs/2306.13275v1
- Date: Fri, 23 Jun 2023 03:05:33 GMT
- Title: Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified
Framework
- Authors: Mahdiyar Molahasani, Michael Greenspan, Ali Etemad
- Abstract summary: Long-Tailed Recognition methods aim to accurately learn a dataset comprising both a larger Head set and a smaller Tail set.
We show that Continual Learning (CL) methods can effectively update the weights of the learner to learn the Tail without forgetting the Head.
We also assess the applicability of CL techniques on real-world data by exploring CL on the naturally imbalanced Caltech256 dataset.
- Score: 16.457778420360537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Long-Tailed Recognition (LTR) problem emerges in the context of learning
from highly imbalanced datasets, in which the number of samples among different
classes is heavily skewed. LTR methods aim to accurately learn a dataset
comprising both a larger Head set and a smaller Tail set. We propose a theorem
where under the assumption of strong convexity of the loss function, the
weights of a learner trained on the full dataset are within an upper bound of
the weights of the same learner trained strictly on the Head. Next, we assert
that by treating the learning of the Head and Tail as two separate and
sequential steps, Continual Learning (CL) methods can effectively update the
weights of the learner to learn the Tail without forgetting the Head. First, we
validate our theoretical findings with various experiments on the toy MNIST-LT
dataset. We then evaluate the efficacy of several CL strategies on multiple
imbalanced variations of two standard LTR benchmarks (CIFAR100-LT and
CIFAR10-LT), and show that standard CL methods achieve strong performance gains
in comparison to baselines and approach solutions that have been tailor-made
for LTR. We also assess the applicability of CL techniques on real-world data
by exploring CL on the naturally imbalanced Caltech256 dataset and demonstrate
its superiority over state-of-the-art classifiers. Our work not only unifies
LTR and CL but also paves the way for leveraging advances in CL methods to
tackle the LTR challenge more effectively.
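The Head/Tail split and the closeness claim in the abstract can be illustrated with a toy sketch. The snippet below is not the paper's theorem, proof, or experimental setup. It assumes an exponentially decaying class-size profile, a count threshold for separating Head from Tail, and a one-dimensional mean-estimation problem under squared loss (which is strongly convex); all function names, parameters, and numbers are hypothetical. In this toy case the distance between the Head-only minimizer and the full-data minimizer equals the Tail fraction times the distance between the Tail and Head means, so it shrinks as the Tail gets smaller.

```python
# Toy illustration only; names and numbers are hypothetical, not from the paper.

def long_tailed_counts(num_classes, max_count, imbalance_factor):
    """Exponentially decaying class sizes (assumed profile).

    The ratio between the largest and smallest class equals imbalance_factor.
    """
    counts = []
    for i in range(num_classes):
        frac = i / (num_classes - 1)
        counts.append(max(1, round(max_count * imbalance_factor ** (-frac))))
    return counts


def split_head_tail(counts, threshold):
    """Assumed rule: classes with at least `threshold` samples form the Head."""
    head = [c for c, n in enumerate(counts) if n >= threshold]
    tail = [c for c, n in enumerate(counts) if n < threshold]
    return head, tail


def mean(values):
    """Minimizer of the squared loss mean((w - x)^2) over w."""
    return sum(values) / len(values)


counts = long_tailed_counts(num_classes=10, max_count=1000, imbalance_factor=100)
head, tail = split_head_tail(counts, threshold=100)

# Each sample of class c is represented by the scalar float(c).
head_samples = [float(c) for c in head for _ in range(counts[c])]
tail_samples = [float(c) for c in tail for _ in range(counts[c])]

w_head = mean(head_samples)                 # learner fitted on the Head only
w_full = mean(head_samples + tail_samples)  # learner fitted on the full dataset

tail_fraction = len(tail_samples) / (len(head_samples) + len(tail_samples))
gap = abs(w_full - w_head)
# Closed form for this toy problem: gap = tail_fraction * |mean_tail - mean_head|.
predicted_gap = tail_fraction * abs(mean(tail_samples) - w_head)

print("class counts:", counts)
print("head classes:", head, "tail classes:", tail)
print("tail fraction:", tail_fraction)
print("gap between minimizers:", gap, "closed form:", predicted_gap)
```

Raising `imbalance_factor` or `threshold` changes how many samples fall in the Tail, which makes it easy to see how the gap between the two minimizers tracks the Tail fraction in this simplified setting.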
Related papers
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z)
- Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
- Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation [123.4883806344334]
We study a realistic Continual Learning setting where learning algorithms are granted a restricted computational budget per time step while training.
We apply this setting to large-scale semi-supervised Continual Learning scenarios with sparse label rates.
Our extensive analysis and ablations demonstrate that DietCL is stable under a full spectrum of label sparsity, computational budget, and various other ablations.
arXiv Detail & Related papers (2024-04-19T10:10:39Z)
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
- A dual-branch model with inter- and intra-branch contrastive loss for long-tailed recognition [7.225494453600985]
Models trained on long-tailed datasets have poor adaptability to tail classes and the decision boundaries are ambiguous.
We propose a simple yet effective model, named Dual-Branch Long-Tailed Recognition (DB-LTR), which includes an imbalanced learning branch and a Contrastive Learning Branch (CoLB).
CoLB can improve the capability of the model in adapting to tail classes and assist the imbalanced learning branch to learn a well-represented feature space and discriminative decision boundary.
arXiv Detail & Related papers (2023-09-28T03:31:11Z)
- Dynamic Residual Classifier for Class Incremental Learning [4.02487511510606]
With imbalanced sample numbers between old and new classes, the learning can be biased.
Existing CIL methods exploit long-tailed (LT) recognition techniques, e.g., adjusted losses and data re-sampling methods.
A novel Dynamic Residual Classifier (DRC) is proposed to handle this challenging scenario.
arXiv Detail & Related papers (2023-08-25T11:07:11Z)
- Unbiased and Efficient Self-Supervised Incremental Contrastive Learning [31.763904668737304]
We propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of a novel Incremental InfoNCE (NCE-II) loss function.
ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
arXiv Detail & Related papers (2023-01-28T06:11:31Z)
- A Study of Continual Learning Methods for Q-Learning [78.6363825307044]
We present an empirical study on the use of continual learning (CL) methods in a reinforcement learning (RL) scenario.
Our results show that dedicated CL methods can significantly improve learning when compared to the baseline technique of "experience replay".
arXiv Detail & Related papers (2022-06-08T14:51:52Z)
- Generalized Variational Continual Learning [33.194866396158005]
Two main approaches to continual learning are Online Elastic Weight Consolidation (Online EWC) and Variational Continual Learning.
We show that applying this modification recovers Online EWC as a limiting case, allowing interpolation between the two approaches.
To mitigate the observed overpruning effect of VI, we take inspiration from a common multi-task architecture: neural networks with task-specific FiLM layers.
arXiv Detail & Related papers (2020-11-24T19:07:39Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)