Lossy and Lossless (L$^2$) Post-training Model Size Compression
- URL: http://arxiv.org/abs/2308.04269v1
- Date: Tue, 8 Aug 2023 14:10:16 GMT
- Title: Lossy and Lossless (L$^2$) Post-training Model Size Compression
- Authors: Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang
- Abstract summary: We propose a post-training model size compression method that combines lossy and lossless compression in a unified way.
Our method can achieve a stable $10\times$ compression ratio without sacrificing accuracy and a $20\times$ compression ratio with minor accuracy loss in a short time.
- Score: 12.926354646945397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have delivered remarkable performance and have been
widely used in various visual tasks. However, their huge size causes
significant inconvenience for transmission and storage. Many previous studies
have explored model size compression. However, these studies often approach
various lossy and lossless compression methods in isolation, leading to
challenges in achieving high compression ratios efficiently. This work proposes
a post-training model size compression method that combines lossy and lossless
compression in a unified way. We first propose a unified parametric weight
transformation, which ensures different lossy compression methods can be
performed jointly in a post-training manner. Then, a dedicated differentiable
counter is introduced to guide the optimization of lossy compression to arrive
at a more suitable point for later lossless compression. Additionally, our
method can easily control a desired global compression ratio and allocate
adaptive ratios for different layers. Finally, our method can achieve a stable
$10\times$ compression ratio without sacrificing accuracy and a $20\times$
compression ratio with minor accuracy loss in a short time. Our code is
available at https://github.com/ModelTC/L2_Compression .
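The interaction between the lossy and lossless stages described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' method: it assumes a synthetic Gaussian weight vector, plain magnitude pruning, a shared-scale uniform quantizer, and `zlib` as a stand-in for the lossless coder. The function names, the `threshold`, and the `levels` parameter are all hypothetical choices made for this illustration. It only shows why the output of a lossy step can be far more compressible by a later lossless step than raw floating-point weights.

```python
import random
import struct
import zlib


def compressed_size(payload: bytes) -> int:
    """Size in bytes after a generic lossless coder (zlib as a stand-in)."""
    return len(zlib.compress(payload, 9))


def prune_and_quantize(weights, threshold, levels):
    """Zero out small weights, then map the rest to signed integer codes.

    Hypothetical helper: magnitude pruning plus uniform quantization with
    one shared scale, chosen only to keep the illustration short.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels
    codes = []
    for w in weights:
        if abs(w) < threshold:
            codes.append(0)  # pruned weight
        else:
            q = round(w / scale)
            codes.append(max(-levels, min(levels, q)))  # clamp to code range
    return codes, scale


random.seed(0)
n = 20_000
weights = [random.gauss(0.0, 1.0) for _ in range(n)]  # synthetic "layer"

# Baseline: lossless compression applied directly to float32 weights.
raw_fp32 = struct.pack(f"<{n}f", *weights)

# Lossy step first (assumed settings), one signed byte per code afterwards.
codes, scale = prune_and_quantize(weights, threshold=0.5, levels=7)
raw_codes = struct.pack(f"<{n}b", *codes)

lossless_only_ratio = len(raw_fp32) / compressed_size(raw_fp32)
combined_ratio = len(raw_fp32) / compressed_size(raw_codes)

print(f"lossless only:    {lossless_only_ratio:.2f}x")
print(f"lossy + lossless: {combined_ratio:.2f}x")
```

Both ratios are measured against the same float32 baseline, so they are directly comparable. The sketch ignores accuracy entirely; the paper's contribution is choosing the lossy settings so that accuracy is preserved while the result remains easy to compress losslessly.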
Related papers
- ZipNN: Lossless Compression for AI Models [10.111136691015554]
We present ZipNN, a lossless compression method tailored to neural networks.
On popular models (e.g. Llama 3) ZipNN shows space savings that are over 17% better than vanilla compression.
We estimate that these methods could save over an ExaByte per month of network traffic downloaded from a large model hub like Hugging Face.
arXiv Detail & Related papers (2024-11-07T23:28:23Z)
- Order of Compression: A Systematic and Optimal Sequence to Combinationally Compress CNN [5.25545980258284]
We propose a systematic and optimal sequence to apply multiple compression techniques in the most effective order.
Our proposed Order of Compression significantly reduces computational costs by up to 859 times on ResNet34, with negligible accuracy loss.
We believe our simple yet effective exploration of the order of compression will shed light on the practice of model compression.
arXiv Detail & Related papers (2024-03-26T07:26:00Z)
- DiffRate : Differentiable Compression Rate for Efficient Vision Transformers [98.33906104846386]
Token compression aims to speed up large-scale vision transformers (e.g. ViTs) by pruning (dropping) or merging tokens.
DiffRate is a novel token compression method with several appealing properties that prior art lacks.
arXiv Detail & Related papers (2023-05-29T10:15:19Z)
- Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image Compression [85.93207826513192]
We propose a unified and powerful deep lossy plus residual (DLPR) coding framework for both lossless and near-lossless image compression.
We solve the joint lossy and residual compression problem with a VAE-based approach.
In the near-lossless mode, we quantize the original residuals to satisfy a given $\ell_\infty$ error bound.
arXiv Detail & Related papers (2022-09-11T12:11:56Z)
- Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which joins channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression [118.89112502350177]
We propose a novel framework for learning $\ell_\infty$-constrained near-lossless image compression.
We derive the probability model of the quantized residual by quantizing the learned probability model of the original residual.
arXiv Detail & Related papers (2021-03-31T11:53:36Z)
- Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets.
We show that there is a significant penalty on common performance metrics for high compression.
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
- Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor [5.09755285351264]
We consider an unbiased compression method inspired by the Kashin representation of vectors, which we call Kashin compression (KC).
KC enjoys a dimension-independent variance bound, for which we derive an explicit formula even in the regime where only a few bits need to be communicated per vector entry.
arXiv Detail & Related papers (2020-02-20T17:20:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.