CMIC: Content-Adaptive Mamba for Learned Image Compression
- URL: http://arxiv.org/abs/2508.02192v2
- Date: Tue, 05 Aug 2025 13:36:08 GMT
- Title: CMIC: Content-Adaptive Mamba for Learned Image Compression
- Authors: Yunuo Chen, Zezheng Lyu, Bing He, Hongwei Hu, Qi Wang, Yuan Tian, Li Song, Wenjun Zhang, Guo Lu,
- Abstract summary: Recent Learned image compression (LIC) leverages Mamba-style state-space models (SSMs) for global fields with linear complexity.<n>We introduce Content-Adaptive Mamba (CAM), a dynamic SSM that addresses two critical limitations.<n>CAM employs content-aware token reorganization, clustering and reordering tokens based on content similarity to prioritize proximity in feature space over Euclidean space.
- Score: 28.348742499973493
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent Learned image compression (LIC) leverages Mamba-style state-space models (SSMs) for global receptive fields with linear complexity. However, vanilla Mamba is content-agnostic, relying on fixed and predefined selective scans, which restricts its ability to dynamically and fully exploit content dependencies. We introduce Content-Adaptive Mamba (CAM), a dynamic SSM that addresses two critical limitations. First, it employs content-aware token reorganization, clustering and reordering tokens based on content similarity to prioritize proximity in feature space over Euclidean space. Second, it integrates global priors into SSM via a prompt dictionary, effectively mitigating the strict causality and long-range decay in the token interactions of Mamba. These innovations enable CAM to better capture global dependencies while preserving computational efficiency. Leveraging CAM, our Content-Adaptive Mamba-based LIC model (CMIC) achieves state-of-the-art rate-distortion performance, surpassing VTM-21.0 by -15.91\%, -21.34\%, and -17.58\% BD-rate on Kodak, Tecnick, and CLIC benchmarks, respectively.
Related papers
- A2Mamba: Attention-augmented State Space Models for Visual Recognition [45.68176825375723]
We propose A2Mamba, a powerful Transformer-Mamba hybrid network architecture.<n>A key step of A2SSM performs a variant of cross-attention by spatially aggregating the SSM's hidden states.<n>Our A2Mamba outperforms all previous ConvNet-, Transformer-, and Mamba-based architectures in visual recognition tasks.
arXiv Detail & Related papers (2025-07-22T14:17:08Z) - MambaVSR: Content-Aware Scanning State Space Model for Video Super-Resolution [33.457410717030946]
We propose MambaVSR, the first state-space model framework for super-resolution video.<n>MambaVSR enables dynamic interactions through the Shared Compass Construction ( SCC) and the Content-Aware Sequentialization (CAS)<n>Building upon, the CAS module effectively aligns and aggregates non-local similar content across multiple frames by interleaving temporal features along the learned spatial order.
arXiv Detail & Related papers (2025-06-13T13:22:28Z) - RD-UIE: Relation-Driven State Space Modeling for Underwater Image Enhancement [59.364418120895]
Underwater image enhancement (UIE) is a critical preprocessing step for marine vision applications.<n>We develop a novel relation-driven Mamba framework for effective UIE (RD-UIE)<n>Experiments on underwater enhancement benchmarks demonstrate RD-UIE outperforms the state-of-the-art approach WMamba.
arXiv Detail & Related papers (2025-05-02T12:21:44Z) - HS-Mamba: Full-Field Interaction Multi-Groups Mamba for Hyperspectral Image Classification [1.9526430269580959]
We propose a full-field interaction multi-groups Mamba framework (HS-Mamba) for classification of hyperspectral images.<n>HS-Mamba consists of a dual-channel spatial-spectral encoder (DCSS-encoder) module and a lightweight global inline attention (LGI-Att) branch.<n>Extensive experiments demonstrate the superiority of the proposed HS-Mamba, outperforming state-of-the-art methods on four benchmark HSI datasets.
arXiv Detail & Related papers (2025-04-22T06:13:02Z) - Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision.<n>In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z) - Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning [54.19222454702032]
Continual Learning aims to equip AI models with the ability to learn a sequence of tasks over time, without forgetting previously learned knowledge.<n>State Space Models (SSMs) have achieved notable success in computer vision.<n>We introduce Mamba-CL, a framework that continuously fine-tunes the core SSMs of the large-scale Mamba foundation model.
arXiv Detail & Related papers (2024-11-23T06:36:16Z) - MambaIRv2: Attentive State Space Restoration [96.4452232356586]
Mamba-based image restoration backbones have recently demonstrated significant potential in balancing global reception and computational efficiency.<n>We propose MambaIRv2, which equips Mamba with the non-causal modeling ability similar to ViTs to reach the attentive state space restoration model.
arXiv Detail & Related papers (2024-11-22T12:45:12Z) - StableMamba: Distillation-free Scaling of Large SSMs for Images and Videos [27.604572990625144]
State-space models (SSMs) have introduced a novel context modeling method by integrating state-space techniques into deep learning.<n>Mamba-based architectures are difficult to scale with respect to the number of parameters, which is a major limitation for vision applications.<n>We propose a Mamba-Attention interleaved architecture that enhances scalability, robustness, and performance.
arXiv Detail & Related papers (2024-09-18T10:48:10Z) - SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.<n>We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation.<n>Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - GroupMamba: Efficient Group-Based Visual State Space Model [66.35608254724566]
State-space models (SSMs) have recently shown promise in capturing long-range dependencies with subquadratic computational complexity.<n>However, purely SSM-based models face critical challenges related to stability and achieving state-of-the-art performance in computer vision tasks.<n>Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
arXiv Detail & Related papers (2024-07-18T17:59:58Z) - MambaVC: Learned Visual Compression with Selective State Spaces [74.29217829932895]
We introduce MambaVC, a simple, strong and efficient compression network based on SSM.
MambaVC develops a visual state space (VSS) block with a 2D selective scanning (2DSS) module as the nonlinear activation function after each downsampling.
On compression benchmark datasets, MambaVC achieves superior rate-distortion performance with lower computational and memory overheads.
arXiv Detail & Related papers (2024-05-24T10:24:30Z) - Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification [4.389334324926174]
This study introduces the innovative Mamba-in-Mamba (MiM) architecture for HSI classification, the first attempt of deploying State Space Model (SSM) in this task.
MiM model includes 1) A novel centralized Mamba-Cross-Scan (MCS) mechanism for transforming images into sequence-data, 2) A Tokenized Mamba (T-Mamba) encoder, and 3) A Weighted MCS Fusion (WMF) module.
Experimental results from three public HSI datasets demonstrate that our method outperforms existing baselines and state-of-the-art approaches.
arXiv Detail & Related papers (2024-05-20T13:19:02Z) - PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition [21.761988930589727]
PlainMamba is a simple non-hierarchical state space model (SSM) designed for general visual recognition.
We adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images.
Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks.
arXiv Detail & Related papers (2024-03-26T13:35:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.