Efficient CNN-LSTM based Image Captioning using Neural Network Compression
- URL: http://arxiv.org/abs/2012.09708v1
- Date: Thu, 17 Dec 2020 16:25:09 GMT
- Title: Efficient CNN-LSTM based Image Captioning using Neural Network Compression
- Authors: Harshit Rampal, Aman Mohanty
- Abstract summary: We present an unconventional end-to-end compression pipeline for a CNN-LSTM based Image Captioning model.
We then examine the effects of different compression architectures on the model and design a compression architecture that achieves a 73.1% reduction in model size.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern Neural Networks are eminent in achieving state-of-the-art
performance on tasks in Computer Vision, Natural Language Processing and
related verticals. However, they are notorious for their voracious memory and
compute appetite, which further obstructs their deployment on resource-limited
edge devices. To enable edge deployment, researchers have developed pruning
and quantization algorithms to compress such networks without compromising
their efficacy. Such compression algorithms have broadly been evaluated on
standalone CNN and RNN architectures; in this work, we instead present an
unconventional end-to-end compression pipeline for a CNN-LSTM based Image
Captioning model. The model is trained using VGG16 or ResNet50 as an encoder
and an LSTM decoder on the Flickr8k dataset. We then examine the effects of
different compression architectures on the model and design a compression
architecture that achieves a 73.1% reduction in model size, a 71.3% reduction
in inference time and a 7.7% increase in BLEU score compared to its
uncompressed counterpart.
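As a concrete illustration of such a pipeline (a minimal PyTorch sketch under assumed hyperparameters, not the authors' released code), a ResNet50 encoder can be wired to an LSTM decoder and then compressed with magnitude pruning and dynamic quantization:

```python
# Minimal sketch (not the authors' code): a CNN-LSTM captioner with
# magnitude pruning and dynamic quantization applied post-training.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=256, hidden_dim=512):
        super().__init__()
        cnn = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # drop FC head
        self.proj = nn.Linear(2048, embed_dim)   # image feature -> embedding space
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).flatten(1)  # (B, 2048)
        feats = self.proj(feats).unsqueeze(1)    # (B, 1, E): image as first token
        words = self.embed(captions)             # (B, T, E)
        seq = torch.cat([feats, words], dim=1)   # prepend image feature
        out, _ = self.lstm(seq)
        return self.fc(out)                      # (B, T+1, vocab) logits

model = CaptionModel()

# 1) Unstructured magnitude pruning: zero out 50% of each Linear layer's weights.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the sparsity permanent

# 2) Dynamic quantization: int8 weights for LSTM/Linear modules at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8
)
```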
Related papers
- Computer Vision Model Compression Techniques for Embedded Systems: A Survey [75.38606213726906]
This paper covers the main model compression techniques applied to computer vision tasks.
We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique.
We also share codes to assist researchers and new practitioners in overcoming initial implementation challenges.
arXiv Detail & Related papers (2024-08-15T16:41:55Z)
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
arXiv Detail & Related papers (2024-01-19T15:19:47Z)
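As a rough sketch of the channel-attention selection idea described above (a simplified reading, not the AECNN implementation; the module structure and top-k selection rule are assumptions):

```python
# Hedged sketch of channel-attention feature compression (not the AECNN code):
# score channels with a squeeze-and-excitation block, then transmit only the top-k.
import torch
import torch.nn as nn

class ChannelSelector(nn.Module):
    def __init__(self, channels: int, keep: int, reduction: int = 4):
        super().__init__()
        self.keep = keep
        self.score = nn.Sequential(                 # SE-style channel scoring
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                           # x: (B, C, H, W)
        w = self.score(x)                           # (B, C) channel importance
        topk = w.topk(self.keep, dim=1).indices     # most informative channels
        idx = topk.unsqueeze(-1).unsqueeze(-1).expand(-1, -1, *x.shape[2:])
        compressed = torch.gather(x, 1, idx)        # (B, keep, H, W) to transmit
        return compressed, topk                     # indices needed to reconstruct

x = torch.randn(2, 64, 28, 28)
sel = ChannelSelector(channels=64, keep=16)
z, idx = sel(x)                                     # 4x fewer channels sent onward
```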
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of multiple kinds of attention mechanisms for local feature learning, then introduce a more straightforward yet effective window-based local attention block.
The proposed window-based attention is very flexible and can work as a plug-and-play component to enhance CNN and Transformer models.
arXiv Detail & Related papers (2022-03-16T07:55:49Z)
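A bare-bones sketch of a window-based local attention block (a generic rendering of the idea, not the paper's implementation; window size and head count are placeholders):

```python
# Generic sketch of window-based local attention (not the paper's code):
# split the feature map into non-overlapping windows, attend within each.
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    def __init__(self, dim: int, window: int = 8, heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                 # x: (B, C, H, W); H, W divisible by window
        B, C, H, W = x.shape
        w = self.window
        # (B, C, H, W) -> (B * num_windows, w*w, C): one token sequence per window
        t = x.view(B, C, H // w, w, W // w, w)
        t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        out, _ = self.attn(t, t, t)       # self-attention inside each window
        # invert the reshape back to (B, C, H, W)
        out = out.reshape(B, H // w, W // w, w, w, C)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return x + out                    # residual: plug-and-play enhancement

feats = torch.randn(1, 64, 32, 32)
print(WindowAttention(dim=64)(feats).shape)   # torch.Size([1, 64, 32, 32])
```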
- Exploring Structural Sparsity in Neural Image Compression [14.106763725475469]
We propose a plug-in adaptive binary channel masking (ABCM) module to judge the importance of each convolution channel and introduce sparsity during training.
During inference, the unimportant channels are pruned to obtain a slimmer network and less computation.
Experimental results show that up to a 7x reduction in computation and a 3x acceleration can be achieved with a negligible performance drop.
arXiv Detail & Related papers (2022-02-09T17:46:49Z)
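A toy sketch of learnable binary channel masking (a simplification of the ABCM idea using a straight-through estimator; the paper's exact gating scheme may differ):

```python
# Toy sketch of adaptive binary channel masking (simplified ABCM-like idea):
# a learnable score per channel is binarized in the forward pass, while the
# straight-through estimator lets gradients reach the scores during training.
import torch
import torch.nn as nn

class BinaryChannelMask(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(channels))  # learned importance

    def forward(self, x):                       # x: (B, C, H, W)
        soft = torch.sigmoid(self.scores)       # values in (0, 1)
        hard = (soft > 0.5).float()             # binary keep/drop decision
        mask = hard + soft - soft.detach()      # straight-through estimator
        return x * mask.view(1, -1, 1, 1)       # zero out unimportant channels

    def channels_to_keep(self):
        # after training, channels whose mask is 0 can be physically pruned
        return (torch.sigmoid(self.scores) > 0.5).nonzero().flatten()
```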
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression [26.501979992447605]
This paper investigates compression from the perspective of compactly representing and storing trained parameters.
We leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters.
We conduct experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks.
arXiv Detail & Related papers (2021-11-19T17:03:11Z)
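A simplified sketch of the additive quantization idea (greedy residual k-means over weight chunks; the paper's encoder is more elaborate, and all sizes here are placeholder assumptions):

```python
# Simplified sketch of additive (residual) quantization of weight vectors:
# each d-dim weight chunk is approximated by a sum of codewords, so only a few
# small indices per chunk plus shared codebooks need to be stored.
import torch

def fit_codebook(vectors, k=256, iters=10):
    # plain k-means to learn one codebook stage
    centers = vectors[torch.randperm(len(vectors))[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(vectors, centers).argmin(dim=1)
        for j in range(k):
            pts = vectors[assign == j]
            if len(pts):
                centers[j] = pts.mean(dim=0)
    return centers

def encode(weights, d=8, stages=2, k=256):
    chunks = weights.reshape(-1, d)             # split weights into d-dim chunks
    residual, codebooks, codes = chunks, [], []
    for _ in range(stages):                     # greedy stage-by-stage fit
        cb = fit_codebook(residual, k)
        idx = torch.cdist(residual, cb).argmin(dim=1)
        residual = residual - cb[idx]           # quantize what is left over
        codebooks.append(cb)
        codes.append(idx)
    return codebooks, codes

def decode(codebooks, codes):
    return sum(cb[idx] for cb, idx in zip(codebooks, codes))

w = torch.randn(4096, 64)                       # e.g. one linear layer's weights
books, codes = encode(w.flatten())              # 16 bits/chunk vs 256 bits raw
w_hat = decode(books, codes).reshape(w.shape)   # lossy reconstruction
```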
- Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression [5.083621265568845]
Deep convolutional neural networks (CNNs) with a large number of parameters require huge computational resources.
Decomposition-based methods, therefore, have been utilized to compress CNNs in recent years.
We propose to compress CNNs and alleviate performance degradation via joint matrix decomposition.
arXiv Detail & Related papers (2021-07-09T12:32:10Z)
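To make the decomposition idea concrete, here is a generic truncated-SVD factorization of a single fully connected layer (an illustration only; the paper's joint scheme shares factors across layers):

```python
# Generic low-rank factorization of a linear layer via truncated SVD
# (a single-layer illustration; the paper decomposes layers jointly).
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    W = layer.weight.data                        # (out, in)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = Vh[:rank]                                # (rank, in)
    B = U[:, :rank] * S[:rank]                   # (out, rank)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = A
    second.weight.data = B
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)          # ~rank*(in+out) vs in*out params

layer = nn.Linear(1024, 1024)
low_rank = factorize_linear(layer, rank=64)      # roughly 8x fewer parameters
x = torch.randn(2, 1024)
print((layer(x) - low_rank(x)).abs().max())      # approximation error
```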
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme that combines channel pruning and tensor decomposition to compress CNN models.
We achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters of ResNet-50, with only a 0.56% drop in Top-1 accuracy on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- On the Impact of Lossy Image and Video Compression on the Performance of Deep Convolutional Neural Network Architectures [17.349420462716886]
This study investigates the impact of commonplace image and video compression techniques on the performance of deep learning architectures.
We examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation.
Results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied.
arXiv Detail & Related papers (2020-07-28T15:37:37Z)
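The kind of sweep behind such a study can be sketched as follows (placeholder model, input file, and quality levels; not the study's exact protocol):

```python
# Sketch of a lossy-compression robustness sweep (placeholder model and data,
# not the study's protocol): re-encode inputs as JPEG at decreasing quality
# and record how the model's predictions degrade.
import io
import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
prep = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224),
                           transforms.ToTensor()])

def jpeg_roundtrip(img: Image.Image, quality: int) -> Image.Image:
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)  # lossy re-encode in memory
    buf.seek(0)
    return Image.open(buf).convert("RGB")

@torch.no_grad()
def top1(img: Image.Image) -> int:
    return model(prep(img).unsqueeze(0)).argmax(dim=1).item()

img = Image.open("example.jpg").convert("RGB")     # placeholder input image
for q in (95, 75, 50, 25, 10, 5):
    print(q, top1(jpeg_roundtrip(img, q)))         # watch predictions drift
```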
- Compression strategies and space-conscious representations for deep neural networks [0.3670422696827526]
Recent advances in deep learning have made available powerful convolutional neural networks (CNNs) with state-of-the-art performance in several real-world applications.
However, CNNs have millions of parameters and are therefore not deployable on resource-limited platforms.
In this paper, we investigate the impact of lossy compression of CNNs by weight pruning and quantization.
arXiv Detail & Related papers (2020-07-15T19:41:19Z)