Sequential Encryption of Sparse Neural Networks Toward Optimum
Representation of Irregular Sparsity
- URL: http://arxiv.org/abs/2105.01869v1
- Date: Wed, 5 May 2021 05:14:50 GMT
- Title: Sequential Encryption of Sparse Neural Networks Toward Optimum
Representation of Irregular Sparsity
- Authors: Baeseong Park, Se Jung Kwon, Dongsoo Lee, Daehwan Oh, Byeongwook Kim,
Yongkweon Jeon, Yeonju Ro
- Abstract summary: We study a fixed-to-fixed encryption architecture/algorithm to support fine-grained pruning methods.
We demonstrate that our proposed compression scheme achieves almost the maximum compression ratio for the Transformer and ResNet-50.
- Score: 9.062897838978955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Even though fine-grained pruning techniques achieve a high compression ratio,
conventional sparsity representations (such as CSR) associated with irregular
sparsity degrade parallelism significantly. Practical pruning methods therefore
usually lower pruning rates (via structured pruning) to improve parallelism. In
this paper, we study a fixed-to-fixed (lossless) encryption
architecture/algorithm to support fine-grained pruning methods such that sparse
neural networks can be stored in a highly regular structure. We first estimate
the maximum compression ratio of encryption-based compression using entropy.
Then, as an effort to push the compression ratio to the theoretical maximum (by
entropy), we propose a sequential fixed-to-fixed encryption scheme. We
demonstrate that our proposed compression scheme achieves almost the maximum
compression ratio for the Transformer and ResNet-50 pruned by various
fine-grained pruning methods.
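To make the entropy argument above concrete, here is a minimal sketch (our own illustration, not code from the paper) that models a layer's sparsity pattern as a Bernoulli mask: the binary entropy H(p) lower-bounds the bits per weight position that any lossless fixed-to-fixed encoding must spend on the mask, so 1/H(p) is the best compression ratio achievable for the mask alone.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy, in bits per position, of a Bernoulli(p) sparsity mask."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def max_mask_compression_ratio(density: float) -> float:
    """Best-case lossless compression of the 1-bit-per-weight mask:
    raw cost is 1 bit/position, entropy cost is H(density) bits/position."""
    h = binary_entropy(density)
    return math.inf if h == 0.0 else 1.0 / h

if __name__ == "__main__":
    # density = fraction of weights kept after fine-grained pruning
    for density in (0.5, 0.25, 0.10, 0.05):
        print(f"density={density:4.2f}  "
              f"H={binary_entropy(density):5.3f} bits/pos  "
              f"max mask ratio={max_mask_compression_ratio(density):5.2f}x")
```

For example, at 90% pruning (density 0.10) the entropy is about 0.469 bits per position, so no lossless scheme can compress the mask by more than roughly 2.1x; the sequential encryption scheme in the paper aims to approach this kind of entropy limit while keeping the stored format regular enough for parallel decoding.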
Related papers
- Order of Compression: A Systematic and Optimal Sequence to Combinationally Compress CNN [5.25545980258284]
We propose a systematic and optimal sequence to apply multiple compression techniques in the most effective order.
Our proposed Order of Compression significantly reduces computational costs by up to 859 times on ResNet34, with negligible accuracy loss.
We believe our simple yet effective exploration of the order of compression will shed light on the practice of model compression.
arXiv Detail & Related papers (2024-03-26T07:26:00Z) - Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image
Compression [2.1485350418225244]
End-to-end deep trainable models are about to exceed the performance of the traditional handcrafted compression techniques on videos and images.
We propose a simple yet efficient instance-based parameterization method to reduce this amortization gap at a minor cost.
arXiv Detail & Related papers (2022-09-02T11:43:45Z) - Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z) - Low-rank Tensor Decomposition for Compression of Convolutional Neural
Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression.
For ResNet18 with ImageNet2012, our reduced model can reach more than two times speedup in terms of GMAC with merely a 0.7% Top-1 accuracy drop (see the generic low-rank factorization sketch after this list).
arXiv Detail & Related papers (2021-12-07T13:41:51Z) - Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z) - Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding [45.66971406567023]
Bits-back coding suffers from an increase in code length equal to the KL divergence between the approximate posterior and the true posterior.
We show how to remove this gap asymptotically by deriving bits-back coding algorithms from tighter variational bounds.
arXiv Detail & Related papers (2021-02-22T14:58:01Z) - Successive Pruning for Model Compression via Rate Distortion Theory [15.598364403631528]
We study NN compression from an information-theoretic perspective and show that rate-distortion theory suggests pruning to achieve the theoretical limits of NN compression.
Our derivation also provides an end-to-end compression pipeline involving a novel pruning strategy.
Our method consistently outperforms the existing pruning strategies and reduces the pruned model's size by 2.5 times.
arXiv Detail & Related papers (2021-02-16T18:17:57Z) - Unfolding Neural Networks for Compressive Multichannel Blind
Deconvolution [71.29848468762789]
We propose a learned-structured unfolding neural network for the problem of compressive sparse multichannel blind-deconvolution.
In this problem, each channel's measurements are given as the convolution of a common source signal and a sparse filter.
We demonstrate that our method is superior to classical structured compressive sparse multichannel blind-deconvolution methods in terms of accuracy and speed of sparse filter recovery.
arXiv Detail & Related papers (2020-10-22T02:34:33Z) - Linear Convergent Decentralized Optimization with Compression [50.44269451541387]
Existing decentralized algorithms with compression mainly focus on compressing DGD-type algorithms.
Motivated by primal-dual algorithms, this paper proposes the first LinEAr convergent Decentralized algorithm with compression, LEAD.
arXiv Detail & Related papers (2020-07-01T04:35:00Z) - A Generic Network Compression Framework for Sequential Recommender
Systems [71.81962915192022]
Sequential recommender systems (SRS) have become the key technology in capturing users' dynamic interests and generating high-quality recommendations.
We propose a compressed sequential recommendation framework, termed as CpRec, where two generic model shrinking techniques are employed.
Through extensive ablation studies, we demonstrate that the proposed CpRec can achieve up to 4-8 times compression rates on real-world SRS datasets.
arXiv Detail & Related papers (2020-04-21T08:40:55Z) - Group Sparsity: The Hinge Between Filter Pruning and Decomposition for
Network Compression [145.04742985050808]
We analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense.
By changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly.
Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
arXiv Detail & Related papers (2020-03-19T17:57:26Z)
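As a generic illustration of the low-rank decomposition idea shared by the funnel-regularization and Collaborative Compression entries above, the following sketch (our own simplification, not either paper's method; the synthetic matrix and rank are arbitrary) compresses a fully connected weight matrix with a truncated SVD and reports the parameter reduction and approximation error.

```python
import numpy as np

def low_rank_compress(W: np.ndarray, rank: int):
    """Approximate W (out_dim x in_dim) by two factors A @ B of the given rank."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # out_dim x rank, columns scaled by singular values
    B = Vt[:rank, :]             # rank x in_dim
    return A, B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # synthetic weight matrix that is approximately rank 64
    W = rng.standard_normal((512, 64)) @ rng.standard_normal((64, 1024))
    W += 0.01 * rng.standard_normal(W.shape)
    A, B = low_rank_compress(W, rank=64)
    orig, comp = W.size, A.size + B.size
    rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(f"params: {orig} -> {comp} ({orig / comp:.2f}x), relative error {rel_err:.4f}")
```

In a real network the two factors would replace the original layer (here, one 512x64 and one 64x1024 linear layer), and the rank would have to be chosen per layer, which is the kind of choice a regularizer such as the funnel function is intended to guide.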
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.