MambaIC: State Space Models for High-Performance Learned Image Compression
- URL: http://arxiv.org/abs/2503.12461v2
- Date: Thu, 20 Mar 2025 02:27:03 GMT
- Title: MambaIC: State Space Models for High-Performance Learned Image Compression
- Authors: Fanhu Zeng, Hao Tang, Yihua Shao, Siyu Chen, Ling Shao, Yan Wang
- Abstract summary: A high-performance image compression algorithm is crucial for real-time information transmission across numerous fields. Inspired by the effectiveness of state space models (SSMs) in capturing long-range dependencies, we leverage SSMs to address computational inefficiency in existing methods. We propose an enhanced image compression approach through refined context modeling, which we term MambaIC.
- Score: 53.991726013454695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A high-performance image compression algorithm is crucial for real-time information transmission across numerous fields. Despite rapid progress in image compression, computational inefficiency and poor redundancy modeling still pose significant bottlenecks, limiting practical applications. Inspired by the effectiveness of state space models (SSMs) in capturing long-range dependencies, we leverage SSMs to address computational inefficiency in existing methods and improve image compression from multiple perspectives. In this paper, we integrate the advantages of SSMs for better efficiency-performance trade-off and propose an enhanced image compression approach through refined context modeling, which we term MambaIC. Specifically, we explore context modeling to adaptively refine the representation of hidden states. Additionally, we introduce window-based local attention into channel-spatial entropy modeling to reduce potential spatial redundancy during compression, thereby increasing efficiency. Comprehensive qualitative and quantitative results validate the effectiveness and efficiency of our approach, particularly for high-resolution image compression. Code is released at https://github.com/AuroraZengfh/MambaIC.
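The window-based local attention that MambaIC introduces into channel-spatial entropy modeling can be illustrated with a toy sketch: attention is restricted to non-overlapping windows of the latent map, so its cost scales with the window size rather than with the full spatial resolution. This is a minimal NumPy illustration under assumed shapes, not the paper's actual layers; `window_attention` and its parameters are hypothetical.

```python
import numpy as np

def window_attention(x, win=4):
    """Toy window-based local attention over a 2D latent map.

    x: (H, W, C) latent. Single-head attention with Q = K = V is
    computed independently inside each non-overlapping win x win
    window, so cost grows with win**2 rather than with H*W
    (illustrative sketch only, not MambaIC's entropy-model layers).
    """
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0
    out = np.empty_like(x)
    for i in range(0, H, win):
        for j in range(0, W, win):
            tokens = x[i:i+win, j:j+win].reshape(-1, C)  # (win*win, C)
            scores = tokens @ tokens.T / np.sqrt(C)
            weights = np.exp(scores - scores.max(axis=1, keepdims=True))
            weights /= weights.sum(axis=1, keepdims=True)
            out[i:i+win, j:j+win] = (weights @ tokens).reshape(win, win, C)
    return out
```

Because attention never crosses window boundaries, the per-window computations are independent and can run in parallel, which is the source of the claimed efficiency gain.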
Related papers
- ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-res image restoration.
ZipIR employs a highly compressed latent representation that compresses the image 32x, effectively reducing the number of spatial tokens.
ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z)
- Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on an invertible transform to overcome these limitations.
Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
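The key property of an invertible neural network is that the forward transform can be undone exactly, without training a separate decoder. A common building block is the additive coupling step, sketched below in NumPy; this is a generic illustration of invertibility, not the multi-scale architecture of the paper, and `shift_fn` stands in for an arbitrary learned sub-network.

```python
import numpy as np

def coupling_forward(x, shift_fn):
    """Additive coupling step of an invertible transform (toy sketch).

    Channels are split in half: one half passes through unchanged and
    conditions a learned shift of the other half, so the mapping is
    exactly invertible regardless of how complex shift_fn is.
    """
    x1, x2 = np.split(x, 2, axis=-1)
    y2 = x2 + shift_fn(x1)
    return np.concatenate([x1, y2], axis=-1)

def coupling_inverse(y, shift_fn):
    """Inverts coupling_forward by subtracting the same shift."""
    y1, y2 = np.split(y, 2, axis=-1)
    x2 = y2 - shift_fn(y1)
    return np.concatenate([y1, x2], axis=-1)
```

Stacking such steps (with the roles of the halves alternating) yields a deep transform whose inverse is obtained for free, which is what lets one latent representation serve multiple rates.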
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
- Pathology Image Compression with Pre-trained Autoencoders [52.208181380986524]
Whole Slide Images in digital histopathology pose significant storage, transmission, and computational efficiency challenges. Standard compression methods, such as JPEG, reduce file sizes but fail to preserve fine-grained phenotypic details critical for downstream tasks. In this work, we repurpose autoencoders (AEs) designed for Latent Diffusion Models as an efficient learned compression framework for pathology images.
arXiv Detail & Related papers (2025-03-14T17:01:17Z) - CMamba: Learned Image Compression with State Space Models [31.10785880342252]
We propose a hybrid Convolution and State Space Models (SSMs) based image compression framework to achieve superior rate-distortion performance. Specifically, CMamba introduces two key components: a Content-Adaptive SSM (CA-SSM) module and a Context-Aware Entropy (CAE) module. Experimental results demonstrate that CMamba achieves superior rate-distortion performance.
arXiv Detail & Related papers (2025-02-07T15:07:04Z) - Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - MambaVC: Learned Visual Compression with Selective State Spaces [74.29217829932895]
We introduce MambaVC, a simple, strong and efficient compression network based on SSM.
MambaVC develops a visual state space (VSS) block with a 2D selective scanning (2DSS) module as the nonlinear activation function after each downsampling.
On compression benchmark datasets, MambaVC achieves superior rate-distortion performance with lower computational and memory overheads.
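At the core of SSM-based codecs like MambaVC is a linear recurrence over a token sequence, applied after flattening the 2D latent (e.g. via 2D selective scanning along several directions). The sequential form of a discrete state space model is sketched below; matrix shapes and the constant (non-selective) parameters are simplifying assumptions, not the paper's parameterization.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Sequential form of a discrete state space model (toy sketch).

    Recurrence: h_t = A h_{t-1} + B x_t,  y_t = C h_t.
    Cost is linear in sequence length, unlike the quadratic cost of
    full self-attention over the same tokens.
    """
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:              # x: (T, d_in) sequence of latent tokens
        h = A @ h + B @ x_t    # update hidden state
        ys.append(C @ h)       # read out
    return np.stack(ys)        # (T, d_out)
```

In selective SSMs (Mamba), A, B, and C additionally depend on the input token, which is what lets the model gate which context is carried forward.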
arXiv Detail & Related papers (2024-05-24T10:24:30Z) - Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - Efficient Contextformer: Spatio-Channel Window Attention for Fast
Context Modeling in Learned Image Compression [1.9249287163937978]
We introduce the Efficient Contextformer (eContextformer) - a transformer-based autoregressive context model for learned image compression.
It fuses patch-wise, checkered, and channel-wise grouping techniques for parallel context modeling.
It achieves 145x lower model complexity, 210x faster decoding speed, and higher average bit savings on the Kodak, CLIC, and Tecnick datasets.
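The checkered grouping mentioned above replaces fully sequential autoregression with a two-phase schedule: anchor positions are decoded first in parallel, then the remaining positions are decoded conditioned on them. A minimal NumPy sketch of the grouping masks follows; it illustrates the checkerboard idea only, not eContextformer's full patch-wise and channel-wise fusion, and `checkerboard_groups` is a hypothetical helper.

```python
import numpy as np

def checkerboard_groups(H, W):
    """Checkered grouping for parallel context modeling (toy sketch).

    Splits an H x W grid of latent positions into two interleaved
    groups: anchors are decoded in step 1, the rest in step 2
    conditioned on the anchors -- two passes instead of H*W
    autoregressive steps.
    """
    idx = np.add.outer(np.arange(H), np.arange(W)) % 2
    anchors = (idx == 0)      # decoded first, in parallel
    non_anchors = ~anchors    # decoded second, conditioned on anchors
    return anchors, non_anchors
```

Each non-anchor position has all four of its spatial neighbors in the anchor group, so the second pass still sees useful local context despite the massive parallelism.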
arXiv Detail & Related papers (2023-06-25T16:29:51Z)
- Wavelet Feature Maps Compression for Image-to-Image CNNs [3.1542695050861544]
We propose a novel approach for high-resolution activation maps compression integrated with point-wise convolutions.
We achieve compression rates equivalent to 1-4 bit activation quantization, with relatively small and much more graceful degradation in performance.
arXiv Detail & Related papers (2022-05-24T20:29:19Z)
- A Unified End-to-End Framework for Efficient Deep Image Compression [35.156677716140635]
We propose a unified framework called Efficient Deep Image Compression (EDIC) based on three new technologies.
Specifically, we design an auto-encoder style network for learning-based image compression.
Our EDIC method can also be readily incorporated with the Deep Video Compression (DVC) framework to further improve the video compression performance.
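The auto-encoder style pipeline common to these learned codecs can be summarized as: an analysis (encoder) transform, hard quantization of the latent to discrete symbols, and a synthesis (decoder) transform. The sketch below uses random weights as stand-ins for learned parameters; it shows the pipeline's shape only, not the EDIC architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def analysis(x, W):
    """Encoder (analysis transform): map flattened patches to a compact latent."""
    return np.tanh(x @ W)

def synthesis(y_hat, W):
    """Decoder (synthesis transform): map the quantized latent back to pixel space."""
    return y_hat @ W

x = rng.normal(size=(16, 64))          # 16 flattened 8x8 image patches
W_a = 0.1 * rng.normal(size=(64, 8))   # hypothetical encoder weights
W_s = 0.1 * rng.normal(size=(8, 64))   # hypothetical decoder weights

y_hat = np.round(analysis(x, W_a))     # quantize latent to integer symbols
x_hat = synthesis(y_hat, W_s)          # lossy reconstruction
```

In a trained codec the integer symbols `y_hat` would be entropy-coded to a bitstream under a learned prior, and training trades off that bitrate against reconstruction distortion.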
arXiv Detail & Related papers (2020-02-09T14:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.