Related papers: MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs

URL: http://arxiv.org/abs/2408.11758v2
Date: Tue, 26 Nov 2024 07:32:19 GMT
Title: MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs
Authors: Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen
Abstract summary: MambaCSR is a framework based on Mamba for the challenging compressed image super-resolution (CSR) task. We propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, which is composed of two scanning strategies. Results on multiple benchmarks have shown the great performance of our MambaCSR in the compressed image super-resolution task.
Score: 14.42424591513825
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We present MambaCSR, a simple but effective framework based on Mamba for the challenging compressed image super-resolution (CSR) task. Particularly, the scanning strategies of Mamba are crucial for effective contextual knowledge modeling in the restoration process despite it relying on selective state space modeling for all tokens. In this work, we propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, which is composed of two scanning strategies: (i) hierarchical interleaved scanning is designed to comprehensively capture and utilize the most potential contextual information within an image by simultaneously taking advantage of the local window-based and sequential scanning methods; (ii) horizontal-to-vertical interleaved scanning is proposed to reduce the computational cost by leaving the redundancy between the scanning of different directions. To overcome the non-uniform compression artifacts, we also propose position-aligned cross-scale scanning to model multi-scale contextual information. Experimental results on multiple benchmarks have shown the great performance of our MambaCSR in the compressed image super-resolution task. The code will soon be available at https://github.com/renyulin-f/MambaCSR.
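Illustrative sketch (not the authors' code): the abstract names window-based, sequential, horizontal, and vertical scanning but does not specify how they are combined. The snippet below only shows how such scan orders can be expressed as permutations of flattened token indices on an H x W grid. The function names, the window size, and the per-block alternation in `block_schedule` are assumptions made for illustration.

```python
import numpy as np

def horizontal_scan(h, w):
    """Row-major order: left to right, then top to bottom."""
    return np.arange(h * w)

def vertical_scan(h, w):
    """Column-major order: top to bottom, then left to right."""
    return np.arange(h * w).reshape(h, w).T.reshape(-1)

def window_scan(h, w, win):
    """Visit non-overlapping win x win windows one at a time,
    row-major inside each window (assumes win divides h and w)."""
    assert h % win == 0 and w % win == 0
    grid = np.arange(h * w).reshape(h // win, win, w // win, win)
    return grid.transpose(0, 2, 1, 3).reshape(-1)

def block_schedule(h, w, win, num_blocks):
    """Hypothetical schedule: even blocks use a local window scan,
    odd blocks use a global scan that alternates between
    horizontal and vertical."""
    schedule = []
    for b in range(num_blocks):
        if b % 2 == 0:
            schedule.append(("window", window_scan(h, w, win)))
        elif (b // 2) % 2 == 0:
            schedule.append(("horizontal", horizontal_scan(h, w)))
        else:
            schedule.append(("vertical", vertical_scan(h, w)))
    return schedule

def unscan(order):
    """Inverse permutation that restores the original token layout."""
    return np.argsort(order)
```

Under these assumptions, a sequence model would process `tokens[order]` in each block and then apply `unscan(order)` to put its outputs back into the original row-major layout.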

Related papers

EAMamba: Efficient All-Around Vision State Space Model for Image Restoration [11.190025966582041]
This study introduces Efficient All-Around Mamba (EAMamba), an enhanced framework that incorporates a Multi-Head Selective Scan Module (MHSSM) with an all-around scanning mechanism. EAMamba achieves a significant 31-89% reduction in FLOPs while maintaining favorable performance compared to existing low-level Vision Mamba methods.
arXiv Detail & Related papers (2025-06-27T14:12:58Z)
DefMamba: Deformable Visual State Space Model [65.50381013020248]
We propose a novel visual foundation model called DefMamba. By combining a deformable scanning (DS) strategy, this model significantly improves its ability to learn image structures and detects changes in object details. Numerous experiments have shown that DefMamba achieves state-of-the-art performance in various visual tasks.
arXiv Detail & Related papers (2025-04-08T08:22:54Z)
MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration [24.66368406718623]
We propose a novel Mamba-based Image Restoration model (MaIR). MaIR consists of Nested S-shaped Scanning strategy (NSS) and Sequence Shuffle Attention block (SSA). Thanks to NSS and SSA, MaIR surpasses 40 baselines across 14 challenging datasets.
arXiv Detail & Related papers (2024-12-28T07:40:39Z)
XYScanNet: A State Space Model for Single Image Deblurring [6.9752432140704705]
Deep state-space models (SSMs) are emerging as a promising alternative to CNN and Transformer networks. We propose a novel slice-and-scan strategy that alternates scanning along intra-blur and inter-slices. We develop XYScanNet, an SSM architecture integrated with a lightweight feature fusion module for enhanced image deblurring.
arXiv Detail & Related papers (2024-12-13T18:33:18Z)
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [95.98801201266099]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps. We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR. Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z)
MambaIRv2: Attentive State Space Restoration [96.4452232356586]
We propose MambaIRv2, which equips Mamba with the non-causal modeling ability similar to ViTs to reach the attentive state space restoration model. Specifically, the proposed attentive state-space equation allows to attend beyond the scanned sequence and facilitate image unfolding with just one single scan.
arXiv Detail & Related papers (2024-11-22T12:45:12Z)
Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution [42.259283231048954]
State Space Models (SSM) have shown strong representation ability in modeling long-range dependency with linear complexity. We propose a novel Hierarchical Mamba network, namely, Hi-Mamba, for image super-resolution (SR).
arXiv Detail & Related papers (2024-10-14T04:15:04Z)
Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba [0.43512163406552]
State Space Models (SSMs) with Mamba have shown great promise for long-range dependency modeling with linear complexity. To effectively organize and construct visual features within the 2D image space through 1D selective scan, we propose a novel Multi-Head Scan (MHS) module. The resulting sub-embeddings, obtained from the multi-head scan process, are then integrated and ultimately projected back into the high-dimensional space.
arXiv Detail & Related papers (2024-06-10T03:24:43Z)
Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study [7.334290421966221]
We investigate the impact of mainstream scanning directions and their combinations on semantic segmentation of images. A simple, single scanning direction is deemed sufficient for semantic segmentation of high-resolution remotely sensed images.
arXiv Detail & Related papers (2024-05-14T10:36:56Z)
Super-Resolution on Rotationally Scanned Photoacoustic Microscopy Images Incorporating Scanning Prior [12.947842858489516]
Photoacoustic Microscopy (PAM) images integrating the advantages of optical contrast and acoustic resolution have been widely used in brain studies. There exists a trade-off between scanning speed and image resolution. Compared with traditional scanning, rotational scanning provides good opportunities for fast PAM imaging by optimizing the scanning mechanism. In this study, we propose a novel and well-performing super-resolution framework for rotational scanning-based PAM imaging.
arXiv Detail & Related papers (2023-12-12T12:41:35Z)
Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels. They are not widely adopted by general users due to their substantial storage requirements. We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video [111.08121952640766]
This paper presents a novel deep-learning based solution to the RS temporal super-resolution problem. By leveraging the multi-view geometry relationship of the RS imaging process, our framework successfully achieves high framerate GS generation. Our method can produce high-quality GS image sequences with rich details, outperforming the state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T16:47:12Z)
SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution [39.77987463287673]
We present an efficient paradigm to perform Simultaneously Image Colorization and Super-resolution (SCS). The proposed method consists of two parts: colorization branch for learning color information that employs the proposed plug-and-play Pyramid Valve Cross Attention (PVCAttn) module. Our SCSNet supports both automatic and referential modes, which is more flexible for practical application.
arXiv Detail & Related papers (2022-01-12T08:59:12Z)
Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision. In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks. For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC). Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z)
DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images. In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR. Unlike conventional methods which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.