Compression-aware Continual Learning using Singular Value Decomposition
- URL: http://arxiv.org/abs/2009.01956v2
- Date: Mon, 14 Sep 2020 22:29:21 GMT
- Title: Compression-aware Continual Learning using Singular Value Decomposition
- Authors: Varigonda Pavan Teja and Priyadarshini Panda
- Abstract summary: We propose a compression-based continual task learning method that can dynamically grow a neural network.
Inspired by recent model compression techniques, we employ compression-aware training and perform low-rank weight approximations.
Our method achieves compressed representations with minimal performance degradation without the need for costly fine-tuning.
- Score: 2.4283778735260686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a compression-based continual task learning method that can
dynamically grow a neural network. Inspired by recent model compression
techniques, we employ compression-aware training and perform low-rank weight
approximations using singular value decomposition (SVD) to achieve network
compaction. By encouraging the network to learn low-rank weight filters, our
method achieves compressed representations with minimal performance degradation
and without the need for costly fine-tuning. Specifically, we decompose the
weight filters using SVD and train the network on incremental tasks in its
factorized form. Such a factorization allows us to directly impose
sparsity-inducing regularizers over the singular values and to use fewer
parameters for each task. We further introduce a novel learning scheme based on
a shared representational space between tasks. This encourages incoming tasks
to learn only residual task-specific information on top of the previously
learnt weight filters, which greatly helps in learning under fixed capacity
constraints. Our method significantly outperforms prior continual learning
approaches on three benchmark datasets, demonstrating accuracy improvements of
10.3%, 12.3%, and 15.6% over the state-of-the-art on 20-split CIFAR-100,
miniImageNet, and a 5-sequence dataset, respectively. Further, our method
yields compressed models with ~3.64x, 2.88x, and 5.91x fewer parameters,
respectively, on the above datasets compared to baseline individual task
models. Our source code is available at https://github.com/pavanteja295/CACL.
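To make the core idea concrete, below is a minimal PyTorch sketch of training a layer in SVD-factorized form with a sparsity-inducing L1 penalty on the singular values, followed by post-hoc truncation. This is an illustration only, not the authors' implementation (see the CACL repository linked above for the actual code); the class name `FactorizedLinear`, the energy-based truncation rule, the initialization scheme, and the coefficient `lam` are assumptions, and the shared-space residual learning across incremental tasks is omitted.

```python
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """A linear layer kept in SVD-factorized form W = U * diag(s) * V^T.

    Training directly on (U, s, Vh) lets us put an L1 penalty on the
    singular values s, encouraging low-rank weights that can be truncated
    after training without fine-tuning.
    """

    def __init__(self, in_features, out_features):
        super().__init__()
        # Initialize the factors from the SVD of a standard random weight matrix.
        W = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(W)
        U, s, Vh = torch.linalg.svd(W, full_matrices=False)
        self.U = nn.Parameter(U)      # (out_features, rank)
        self.s = nn.Parameter(s)      # (rank,)
        self.Vh = nn.Parameter(Vh)    # (rank, in_features)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Compose the weight on the fly from its factors.
        W = self.U @ torch.diag(self.s) @ self.Vh
        return nn.functional.linear(x, W, self.bias)

    def sparsity_penalty(self):
        # L1 regularizer over singular values to encourage low rank.
        return self.s.abs().sum()

    @torch.no_grad()
    def truncate(self, energy=0.99):
        """Drop small singular values, keeping `energy` of the spectrum (illustrative rule)."""
        s_abs = self.s.abs()
        order = torch.argsort(s_abs, descending=True)
        cum = torch.cumsum(s_abs[order], dim=0) / s_abs.sum()
        k = int((cum < energy).sum().item()) + 1
        keep = order[:k]
        self.U = nn.Parameter(self.U[:, keep])
        self.s = nn.Parameter(self.s[keep])
        self.Vh = nn.Parameter(self.Vh[keep, :])


def training_step(model, x, y, criterion, lam=1e-4):
    # Task loss plus the sparsity-inducing regularizer over all factorized layers.
    logits = model(x)
    reg = sum(m.sparsity_penalty() for m in model.modules()
              if isinstance(m, FactorizedLinear))
    return criterion(logits, y) + lam * reg
```

After training, calling `truncate()` on each factorized layer yields the compressed model directly, which is the sense in which the approach avoids a separate fine-tuning stage.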
Related papers
- Compression of Recurrent Neural Networks using Matrix Factorization [0.9208007322096533]
We propose a post-training rank-selection method called Rank-Tuning that selects a different rank for each matrix.
Our numerical experiments on signal processing tasks show that we can compress recurrent neural networks up to 14x with at most 1.4% relative performance reduction.
arXiv Detail & Related papers (2023-10-19T12:35:30Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training [110.79400526706081]
Vision transformers (ViTs) have recently achieved success in many applications, but their intensive computation and heavy memory usage limit their generalization.
Previous compression algorithms usually start from the pre-trained dense models and only focus on efficient inference.
This paper proposes an end-to-end efficient training framework from three sparse perspectives, dubbed Tri-Level E-ViT.
arXiv Detail & Related papers (2022-11-19T21:15:47Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on image classification on all three datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
- Rectification-based Knowledge Retention for Continual Learning [49.1447478254131]
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting.
We propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
Our approach can be used in both the zero-shot and non zero-shot task incremental learning settings.
arXiv Detail & Related papers (2021-03-30T18:11:30Z)
- Layer-Wise Data-Free CNN Compression [49.73757297936685]
We show how to generate layer-wise training data using only a pretrained network.
We present results for layer-wise compression using quantization and pruning.
arXiv Detail & Related papers (2020-11-18T03:00:05Z)
- A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition [9.414818018857316]
We propose a method to effectively compress Recurrent Neural Networks (RNNs) used for Human Action Recognition (HAR).
We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset.
We combine our pruning method with a specific group-lasso regularization technique that significantly improves compression.
It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
arXiv Detail & Related papers (2020-10-03T12:41:51Z)
- End-to-end Learning of Compressible Features [35.40108701875527]
Pre-trained convolutional neural networks (CNNs) are powerful off-the-shelf feature generators and have been shown to perform very well on a variety of tasks.
Unfortunately, the generated features are high dimensional and expensive to store.
We propose a learned method that jointly optimizes for compressibility along with the task objective.
arXiv Detail & Related papers (2020-07-23T05:17:33Z)
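The last entry above describes optimizing feature compressibility jointly with the task objective. A minimal, generic sketch of such a joint loss is given below; the `backbone`/`head` attributes, the coefficient `beta`, and the L1 sparsity proxy for compressibility are illustrative assumptions and not the cited paper's actual formulation (which may, for instance, use an entropy model over quantized features).

```python
import torch
import torch.nn as nn

def compressibility_aware_loss(model, x, y, criterion, beta=1e-3):
    """Task loss plus a penalty that pushes intermediate features toward
    a form that is cheap to store (here, an L1 sparsity proxy)."""
    features = model.backbone(x)           # assumed feature extractor
    logits = model.head(features)          # assumed task head
    task_loss = criterion(logits, y)
    compress_loss = features.abs().mean()  # sparser features compress better
    return task_loss + beta * compress_loss
```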
This list is automatically generated from the titles and abstracts of the papers in this site.