A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental
Learning
- URL: http://arxiv.org/abs/2205.13218v1
- Date: Thu, 26 May 2022 08:24:01 GMT
- Title: A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental
Learning
- Authors: Da-Wei Zhou, Qi-Wei Wang, Han-Jia Ye, De-Chuan Zhan
- Abstract summary: Class-Incremental Learning (CIL) aims to train a model with limited memory size that adapts to new classes without forgetting old ones.
We show that when the model size is counted into the total budget and methods are compared at aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
- Score: 56.450090618578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world applications require the classification model to adapt to new
classes without forgetting old ones. Correspondingly, Class-Incremental
Learning (CIL) aims to train a model with limited memory size to meet this
requirement. Typical CIL methods tend to save representative exemplars from
former classes to resist forgetting, while recent works find that storing
models from history can substantially boost the performance. However, the
stored models are not counted into the memory budget, which implicitly results
in unfair comparisons. We find that when counting the model size into the total
budget and comparing methods with aligned memory size, saving models does not
consistently work, especially for the case with limited memory budgets. As a
result, we need to holistically evaluate different CIL methods at different
memory scales and simultaneously consider accuracy and memory size for
measurement. On the other hand, we dive deeply into the construction of the
memory buffer for memory efficiency. By analyzing the effect of different
layers in the network, we find that shallow and deep layers have different
characteristics in CIL. Motivated by this, we propose a simple yet effective
baseline, denoted as MEMO for Memory-efficient Expandable MOdel. MEMO extends
specialized layers based on the shared generalized representations, efficiently
extracting diverse representations with modest cost and maintaining
representative exemplars. Extensive experiments on benchmark datasets validate
MEMO's competitive performance.
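
The memory accounting described in the abstract can be illustrated with a short calculation. The sketch below converts the storage cost of one saved model into an equivalent number of raw exemplars, then shows how many exemplars remain once stored models are charged against a fixed budget. The figures are assumptions not stated in this listing: a ResNet-32 backbone with 463,504 float32 parameters, and CIFAR-sized images of 3x32x32 one-byte values. The function names are hypothetical.

```python
def exemplar_equivalent(num_params, bytes_per_param=4,
                        image_shape=(3, 32, 32), bytes_per_value=1):
    """Number of raw images that fit in the memory used by one stored model."""
    model_bytes = num_params * bytes_per_param
    image_bytes = bytes_per_value
    for dim in image_shape:
        image_bytes *= dim
    # Floor division: a partially stored image is not a usable exemplar.
    return model_bytes // image_bytes


def remaining_exemplars(total_budget_exemplars, extra_models, num_params):
    """Exemplars left after charging each extra stored model to the budget."""
    cost = extra_models * exemplar_equivalent(num_params)
    return max(total_budget_exemplars - cost, 0)


# Assumed parameter count for a ResNet-32 backbone (not given in this listing).
RESNET32_PARAMS = 463_504

print(exemplar_equivalent(RESNET32_PARAMS))
print(remaining_exemplars(2603, extra_models=1, num_params=RESNET32_PARAMS))
```

Under these assumptions, a method that keeps one extra backbone has to give up the corresponding number of exemplars before it can be compared with an exemplar-only method at the same total memory size.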
Related papers
- AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval [14.009257997448634]
This work investigates the problem of instance-level image retrieval re-ranking with the constraint of memory efficiency.
The proposed model uses a transformer-based architecture designed to estimate image-to-image similarity.
Results on standard benchmarks demonstrate the superiority of our approach over both hand-crafted and learned models.
arXiv Detail & Related papers (2024-08-06T16:29:51Z)
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Class-Incremental Learning: A Survey [84.30083092434938]
Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally.
CIL tends to catastrophically forget the characteristics of former ones, and its performance drastically degrades.
We provide a rigorous and unified evaluation of 17 methods in benchmark image classification tasks to find out the characteristics of different algorithms.
arXiv Detail & Related papers (2023-02-07T17:59:05Z)
- Classification and Generation of real-world data with an Associative Memory Model [0.0]
We extend the capabilities of the basic Associative Memory Model by using a Multiple-Modality framework.
By storing both the images and labels as modalities, a single Memory can be used to retrieve and complete patterns.
arXiv Detail & Related papers (2022-07-11T12:51:27Z)
- Hierarchical Variational Memory for Few-shot Learning Across Domains [120.87679627651153]
We introduce a hierarchical prototype model, where each level of the prototype fetches corresponding information from the hierarchical memory.
The model is endowed with the ability to flexibly rely on features at different semantic levels if the domain shift circumstances so demand.
We conduct thorough ablation studies to demonstrate the effectiveness of each component in our model.
arXiv Detail & Related papers (2021-12-15T15:01:29Z)
- Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
arXiv Detail & Related papers (2021-02-24T19:55:49Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
- Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks [65.3963282551994]
We argue that keeping all entities in memory is unnecessary, and we propose a memory-augmented neural network that tracks only a small bounded number of entities at a time.
We show that (a) the model remains competitive with models with high memory and computational requirements on OntoNotes and LitBank, and (b) the model learns an efficient memory management strategy easily outperforming a rule-based strategy.
arXiv Detail & Related papers (2020-10-06T15:16:31Z)