Learnable Expansion-and-Compression Network for Few-shot
Class-Incremental Learning
- URL: http://arxiv.org/abs/2104.02281v1
- Date: Tue, 6 Apr 2021 04:34:21 GMT
- Title: Learnable Expansion-and-Compression Network for Few-shot
Class-Incremental Learning
- Authors: Boyu Yang, Mingbao Lin, Binghao Liu, Mengying Fu, Chang Liu, Rongrong
Ji and Qixiang Ye
- Abstract summary: We propose a learnable expansion-and-compression network (LEC-Net) to solve catastrophic forgetting and model over-fitting problems.
LEC-Net enlarges the representation capacity of features, alleviating feature drift of the old network from the perspective of model regularization.
Experiments on the CUB/CIFAR-100 datasets show that LEC-Net improves the baseline by 5~7% while outperforming the state-of-the-art by 5~6%.
- Score: 87.94561000910707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot class-incremental learning (FSCIL), which targets continuously
expanding a model's representation capacity under limited supervision, is an
important yet challenging problem. On the one hand, when fitting new tasks
(novel classes), features trained on old tasks (old classes) could
significantly drift, causing catastrophic forgetting. On the other hand,
training the large number of model parameters with few-shot novel-class
examples leads to model over-fitting. In this paper, we propose a learnable
expansion-and-compression network (LEC-Net), with the aim of simultaneously
solving the catastrophic forgetting and model over-fitting problems in a unified
framework. By tentatively expanding network nodes, LEC-Net enlarges the
representation capacity of features, alleviating feature drift of the old network
from the perspective of model regularization. By compressing the expanded
network nodes, LEC-Net pursues a minimal increase of model parameters, alleviating
over-fitting of the expanded network from the perspective of compact
representation. Experiments on the CUB/CIFAR-100 datasets show that LEC-Net
improves the baseline by 5~7% while outperforming the state-of-the-art by 5~6%.
LEC-Net also demonstrates the potential to be a general incremental learning
approach with dynamic model expansion capability.
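The expansion-and-compression mechanism described above can be illustrated with a minimal sketch in PyTorch. This is a hypothetical reading of the abstract, not the authors' LEC-Net code: the class name ExpandCompressLayer, the freezing of the old projection, and all dimensions are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ExpandCompressLayer(nn.Module):
        """Hypothetical sketch of one expansion-and-compression step.

        The projection trained on old classes is frozen to limit feature
        drift; a small block of expanded nodes is added for the new
        few-shot session, and a learnable compression maps the widened
        features back to the original width, so downstream layers see
        only a minimal parameter increase.
        """

        def __init__(self, in_dim: int, out_dim: int, expand_nodes: int = 16):
            super().__init__()
            self.old = nn.Linear(in_dim, out_dim)       # trained on old classes
            self.new = nn.Linear(in_dim, expand_nodes)  # tentatively expanded nodes
            self.compress = nn.Linear(out_dim + expand_nodes, out_dim)
            for p in self.old.parameters():             # freeze old-task weights
                p.requires_grad = False

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            widened = torch.cat([self.old(x), self.new(x)], dim=-1)
            return self.compress(widened)

    # Usage: widen a 512-d feature layer by 16 nodes for a few-shot session.
    layer = ExpandCompressLayer(in_dim=512, out_dim=512, expand_nodes=16)
    out = layer(torch.randn(4, 512))                    # output stays (4, 512)

In this sketch only the new and compress parameters would be trained on the few-shot data, mirroring the stated goal of a minimal parameter increase; the actual compression scheme used by LEC-Net may differ.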
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning [38.09011520275557]
Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones.
We propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL.
arXiv Detail & Related papers (2024-06-04T15:47:03Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
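As a rough, hedged illustration of the MoE-adapter idea (a sketch under assumptions, not this paper's implementation; the number of experts, bottleneck width, residual placement, and class name MoEAdapter are invented for illustration, and the Distribution Discriminative Auto-Selector is not shown), lightweight bottleneck experts can be mixed by a learned router and added on top of frozen backbone features:

    import torch
    import torch.nn as nn

    class MoEAdapter(nn.Module):
        """Sketch: bottleneck experts mixed by a softmax router, applied as a
        residual update to frozen backbone features (e.g. a CLIP block)."""

        def __init__(self, dim: int, num_experts: int = 4, bottleneck: int = 32):
            super().__init__()
            self.experts = nn.ModuleList([
                nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(),
                              nn.Linear(bottleneck, dim))
                for _ in range(num_experts)
            ])
            self.router = nn.Linear(dim, num_experts)  # per-sample expert weights

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            gate = torch.softmax(self.router(x), dim=-1)             # (B, E)
            outs = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, D)
            update = (gate.unsqueeze(-1) * outs).sum(dim=1)          # (B, D)
            return x + update  # frozen features plus adapted residual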
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference [16.564868336748503]
We propose a simple way to train a large network and flexibly extract a subnetwork from it given a model size or complexity constraint.
Experiment results on a music source separation model show that our proposed method can effectively improve the separation performance across different subnetwork sizes and complexities.
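A toy sketch of the extraction step, assuming subnetworks are formed by slicing the leading units of a trained layer (the function name is hypothetical and the paper's dynamic-training schedule is not shown):

    import torch
    import torch.nn as nn

    def extract_sub_linear(full: nn.Linear, keep_in: int, keep_out: int) -> nn.Linear:
        """Carve a smaller linear layer out of a trained larger one by keeping
        the first keep_in input units and keep_out output units."""
        sub = nn.Linear(keep_in, keep_out)
        with torch.no_grad():
            sub.weight.copy_(full.weight[:keep_out, :keep_in])
            sub.bias.copy_(full.bias[:keep_out])
        return sub

    # Usage: derive a 256-in / 128-out layer from a trained 1024-in / 512-out
    # layer to meet a smaller model-size budget.
    big = nn.Linear(1024, 512)
    small = extract_sub_linear(big, keep_in=256, keep_out=128)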
arXiv Detail & Related papers (2023-12-06T12:40:06Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification [36.651329027209634]
LilNetX is an end-to-end trainable technique for neural networks.
It enables learning models with a specified accuracy-rate-computation trade-off.
arXiv Detail & Related papers (2022-04-06T17:59:10Z)
- Network Augmentation for Tiny Deep Learning [73.57192520534585]
We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.
We demonstrate the effectiveness of NetAug on image classification and object detection.
arXiv Detail & Related papers (2021-10-17T18:48:41Z)
- The Self-Simplifying Machine: Exploiting the Structure of Piecewise Linear Neural Networks to Create Interpretable Models [0.0]
We introduce a novel methodology for simplifying and improving the interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z)