A Variational Information Bottleneck Based Method to Compress Sequential
Networks for Human Action Recognition
- URL: http://arxiv.org/abs/2010.01343v2
- Date: Mon, 9 Nov 2020 14:36:53 GMT
- Title: A Variational Information Bottleneck Based Method to Compress Sequential
Networks for Human Action Recognition
- Authors: Ayush Srivastava, Oshin Dutta, Prathosh AP, Sumeet Agarwal, Jigyasa
Gupta
- Abstract summary: We propose a method to effectively compress Recurrent Neural Networks (RNNs) used for Human Action Recognition (HAR).
We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset.
We combine our pruning method with a specific group-lasso regularization technique that significantly improves compression.
It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
- Score: 9.414818018857316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the last few years, compression of deep neural networks has become an
important strand of machine learning and computer vision research. Deep models
have sizeable computational and storage requirements, for instance when used
for Human Action Recognition (HAR) from videos, which makes them unsuitable for
deployment on edge devices. In this paper, we address this issue and propose a
method to effectively compress Recurrent Neural Networks (RNNs) such as Gated
Recurrent Units (GRUs) and Long Short-Term Memory units (LSTMs) that are used
for HAR. We use a Variational Information Bottleneck (VIB) theory-based pruning
approach to limit the information flow through the sequential cells of RNNs to
a small subset. Further, we combine our pruning method with a specific
group-lasso regularization technique that significantly improves compression.
The proposed techniques reduce the number of model parameters and the memory
footprint of the latent representations, with little or no loss in validation
accuracy, while increasing inference speed several-fold. We perform experiments
on three widely used action recognition datasets, viz. UCF11, HMDB51, and
UCF101, to validate our approach. It is shown that our method achieves over 70
times greater compression than the nearest competitor with comparable accuracy
for the task of action recognition on UCF11.
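To make the two ingredients concrete, below is a minimal, hypothetical PyTorch sketch of a VIB-style multiplicative gate on a GRU's hidden units, combined with a group-lasso penalty on the recurrent weights. It follows the general VIB pruning formulation (gates z = mu + sigma * eps with a KL term that pushes uninformative units toward zero); the class names, the exact KL expression, and the penalty weights are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch, assuming a generic VIB pruning formulation; all names
# and hyperparameters here are illustrative, not the paper's exact code.
import torch
import torch.nn as nn

class VIBGate(nn.Module):
    """Stochastic multiplicative gate z = mu + sigma * eps on each hidden unit.

    Units whose gate collapses toward zero carry no information through the
    bottleneck and can be pruned after training.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.ones(dim))                 # gate means
        self.log_sigma = nn.Parameter(-3.0 * torch.ones(dim))   # gate log-stds

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        if self.training:
            eps = torch.randn_like(h)
            z = self.mu + self.log_sigma.exp() * eps   # reparameterised sample
        else:
            z = self.mu                                # deterministic at test time
        return h * z

    def kl(self) -> torch.Tensor:
        # One common VIB pruning KL term (an assumption, not the paper's exact
        # loss): it grows when a gate stays informative, so minimising it
        # drives unneeded gates toward zero.
        return 0.5 * torch.log1p(self.mu.pow(2) / self.log_sigma.exp().pow(2)).sum()

def group_lasso(weight: torch.Tensor) -> torch.Tensor:
    # L2 norm per column, summed: zeroing an entire column removes the
    # corresponding hidden unit's incoming connections as a group.
    return weight.norm(dim=0).sum()

# Usage sketch: gate the GRU outputs and add both penalties to the task loss.
gru = nn.GRU(input_size=128, hidden_size=256, batch_first=True)
gate = VIBGate(256)
x = torch.randn(4, 16, 128)           # (batch, time steps, features)
out, _ = gru(x)                       # out: (4, 16, 256)
task_loss = gate(out).mean()          # stand-in for the real HAR loss
loss = task_loss + 1e-3 * gate.kl() + 1e-4 * group_lasso(gru.weight_hh_l0)
loss.backward()
```

After training, hidden units whose gate means fall below a small threshold can be removed outright, which is what yields the parameter and memory savings described in the abstract.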
Related papers
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation [9.838135675969026]
Deep convolutional neural networks (CNNs) have achieved many eye-catching results.
However, deploying CNNs on resource-constrained edge devices is limited by the memory bandwidth available for transmitting large intermediate data during inference.
We propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates compression policies according to their importance.
arXiv Detail & Related papers (2022-07-16T12:53:52Z)
- Hybridization of Capsule and LSTM Networks for unsupervised anomaly detection on multivariate data [0.0]
This paper introduces a novel NN architecture which hybridises Long Short-Term Memory (LSTM) and Capsule Networks into a single network.
The proposed method uses an unsupervised learning technique to overcome the difficulty of obtaining large volumes of labelled training data.
arXiv Detail & Related papers (2022-02-11T10:33:53Z)
- A novel attention-based network for fast salient object detection [14.246237737452105]
In current salient object detection networks, the most popular design is the U-shaped structure.
We propose a new deep convolution network architecture with three contributions.
Results demonstrate that the proposed method can compress the model to one-third of its original size with almost no loss in accuracy.
arXiv Detail & Related papers (2021-12-20T12:30:20Z)
- Differentiable Network Pruning for Microcontrollers [14.864940447206871]
We present a differentiable structured network pruning method for convolutional neural networks.
It integrates a model's MCU-specific resource usage and parameter importance feedback to obtain highly compressed yet accurate classification models.
arXiv Detail & Related papers (2021-10-15T20:26:15Z)
- An Information Theory-inspired Strategy for Automatic Network Pruning [88.51235160841377]
Deep convolutional neural networks typically need to be compressed for deployment on devices with resource constraints.
Most existing network pruning methods require laborious human effort and prohibitive computational resources.
We propose an information theory-inspired strategy for automatic model compression.
arXiv Detail & Related papers (2021-08-19T07:03:22Z)
- A New Clustering-Based Technique for the Acceleration of Deep Convolutional Networks [2.7393821783237184]
Model Compression and Acceleration (MCA) techniques are used to transform large pre-trained networks into smaller models.
We propose a clustering-based approach that is able to increase the number of employed centroids/representatives.
This is achieved by imposing a special structure on the employed representatives, which is enabled by the particularities of the problem at hand.
arXiv Detail & Related papers (2021-07-19T18:22:07Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
- A Generic Network Compression Framework for Sequential Recommender Systems [71.81962915192022]
Sequential recommender systems (SRS) have become the key technology for capturing users' dynamic interests and generating high-quality recommendations.
We propose a compressed sequential recommendation framework, termed as CpRec, where two generic model shrinking techniques are employed.
Through extensive ablation studies, we demonstrate that the proposed CpRec can achieve up to 4-8 times compression rates on real-world SRS datasets.
arXiv Detail & Related papers (2020-04-21T08:40:55Z)