Multistage Spatial Context Models for Learned Image Compression
- URL: http://arxiv.org/abs/2302.09263v1
- Date: Sat, 18 Feb 2023 08:55:54 GMT
- Title: Multistage Spatial Context Models for Learned Image Compression
- Authors: Fangzheng Lin, Heming Sun, Jinming Liu, Jiro Katto
- Abstract summary: We present a series of multistage spatial context models allowing both fast decoding and better RD performance.
The proposed method features a comparable decoding speed to Checkerboard while reaching the RD performance of Autoregressive.
- Score: 19.15884180604451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent state-of-the-art Learned Image Compression methods feature spatial
context models, achieving great rate-distortion improvements over hyperprior
methods. However, the autoregressive context model requires serial decoding,
limiting runtime performance. The Checkerboard context model allows parallel
decoding at a cost of reduced RD performance. We present a series of multistage
spatial context models allowing both fast decoding and better RD performance.
We split the latent space into square patches and decode serially within each
patch while different patches are decoded in parallel. The proposed method
features a comparable decoding speed to Checkerboard while reaching the RD
performance of Autoregressive and even surpassing it.
Inside each patch, the decoding order must be carefully decided as a bad order
negatively impacts performance; therefore, we also propose a decoding order
optimization algorithm.
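As a rough illustration of the patch-wise scheme described in the abstract, the sketch below tiles a per-patch decoding order over a latent grid and derives one boolean mask per decoding stage. All positions sharing a stage number can be decoded in parallel across patches, so the number of serial passes depends on the patch size rather than on the latent resolution. The function names (`stage_map`, `stage_masks`) and the 2x2 raster order are assumptions made for this example; they are not the authors' code and not the optimized order the paper proposes. The 2x2 tile `[[0, 1], [1, 0]]` is included only to show that a two-stage checkerboard pattern can be written in the same form.

```python
import numpy as np


def stage_map(height, width, order):
    """Tile a per-patch decoding order over a (height, width) latent grid.

    order[r][c] is the stage at which position (r, c) inside every patch
    is decoded. Positions with the same stage are decoded in parallel.
    """
    order = np.asarray(order)
    patch_h, patch_w = order.shape
    rows = np.arange(height) % patch_h   # row index inside the patch
    cols = np.arange(width) % patch_w    # column index inside the patch
    return order[rows[:, None], cols[None, :]]


def stage_masks(height, width, order):
    """Return one boolean mask per decoding stage, in decoding order."""
    stages = stage_map(height, width, order)
    n_stages = int(stages.max()) + 1
    return [stages == s for s in range(n_stages)]


# Hypothetical orders, chosen only for illustration.
RASTER_2X2 = [[0, 1], [2, 3]]      # serial raster scan inside each 2x2 patch
CHECKERBOARD = [[0, 1], [1, 0]]    # two-stage checkerboard as a 2x2 tile

masks = stage_masks(4, 6, RASTER_2X2)
print(len(masks))  # 4
print(stage_map(4, 4, CHECKERBOARD))
```

With a 2x2 patch the decoder needs four passes whatever the latent size, whereas a fully serial autoregressive scan needs one step per latent position. Changing the entries of `order` changes which already-decoded neighbours are available as context at each stage, which is why the choice of order matters.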
Related papers
- LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy [59.1298692559785]
The Key-Value (KV) cache is a crucial component in serving transformer-based autoregressive large language models (LLMs).
Existing approaches to mitigate this issue include: (1) efficient attention variants integrated in upcycling stages; and (2) KV cache compression at test time.
We propose a low-rank approximation of KV weight matrices, allowing plug-in integration with existing transformer-based LLMs without model retraining.
Our method is designed to function without model tuning in upcycling stages or task-specific profiling in test stages.
arXiv Detail & Related papers (2024-10-04T03:10:53Z)
- Corner-to-Center Long-range Context Model for Efficient Learned Image Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z)
- Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster [61.83949316226113]
FastCoT is a model-agnostic framework based on parallel decoding.
We show that FastCoT saves inference time by nearly 20% with only a negligible performance drop compared to the regular approach.
arXiv Detail & Related papers (2023-11-14T15:56:18Z)
- Efficient Contextformer: Spatio-Channel Window Attention for Fast Context Modeling in Learned Image Compression [1.9249287163937978]
We introduce the Efficient Contextformer (eContextformer), a transformer-based autoregressive context model for learned image compression.
It fuses patch-wise, checkered, and channel-wise grouping techniques for parallel context modeling.
It achieves 145x lower model complexity, 210x faster decoding speed, and higher average bit savings on the Kodak, CLIC, and Tecnick datasets.
arXiv Detail & Related papers (2023-06-25T16:29:51Z)
- Accelerating Transformer Inference for Translation via Parallel Decoding [2.89306442817912]
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT).
We present three parallel decoding algorithms and test them on different languages and models.
arXiv Detail & Related papers (2023-05-17T17:57:34Z)
- MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers [78.85346970193518]
Megabyte is a multi-scale decoder architecture that enables end-to-end differentiable modeling of sequences of over one million bytes.
Experiments show that Megabyte allows byte-level models to perform competitively with subword models on long context language modeling.
Results establish the viability of tokenization-free autoregressive sequence modeling at scale.
arXiv Detail & Related papers (2023-05-12T00:55:41Z)
- Split Hierarchical Variational Compression [21.474095984110622]
Variational autoencoders (VAEs) have witnessed great success in performing the compression of image datasets.
SHVC introduces an efficient autoregressive sub-pixel convolution that allows a generalisation between per-pixel autoregressions and fully factorised probability models.
arXiv Detail & Related papers (2022-04-05T09:13:38Z)
- Checkerboard Context Model for Efficient Learned Image Compression [6.376339829493938]
For learned image compression, the autoregressive context model has proved effective in improving the rate-distortion (RD) performance.
We propose a parallelizable checkerboard context model (CCM) to solve the problem.
Speeding up the decoding process more than 40 times in our experiments, it significantly improved computational efficiency with almost the same rate-distortion performance.
arXiv Detail & Related papers (2021-03-29T03:25:41Z)
- Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA [58.040931661693925]
We propose a strategy that combines redundant recomputing and out-of-core methods.
We achieve an average of 1.52x speedup in six different models over the state-of-the-art out-of-core methods.
Our data-parallel out-of-core solution can outperform complex hybrid model parallelism in training large models, e.g. Megatron-LM and Turing-NLG.
arXiv Detail & Related papers (2020-08-26T07:24:34Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)