MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs
- URL: http://arxiv.org/abs/2408.11758v1
- Date: Wed, 21 Aug 2024 16:30:45 GMT
- Title: MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs
- Authors: Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen
- Abstract summary: MambaCSR is a framework based on Mamba for the challenging compressed image super-resolution (CSR) task.
We propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, which is composed of two scanning strategies.
Results on multiple benchmarks demonstrate the strong performance of MambaCSR on the compressed image super-resolution task.
- Score: 14.42424591513825
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present MambaCSR, a simple but effective framework based on Mamba for the challenging compressed image super-resolution (CSR) task. In particular, the scanning strategies of Mamba are crucial for effective contextual knowledge modeling in the restoration process, even though it relies on selective state space modeling for all tokens. In this work, we propose an efficient dual-interleaved scanning paradigm (DIS) for CSR, which is composed of two scanning strategies: (i) hierarchical interleaved scanning is designed to comprehensively capture and utilize the most informative contextual information within an image by simultaneously taking advantage of local window-based and sequential scanning methods; (ii) horizontal-to-vertical interleaved scanning is proposed to reduce the computational cost by removing the redundancy between scans along different directions. To overcome non-uniform compression artifacts, we also propose position-aligned cross-scale scanning to model multi-scale contextual information. Experimental results on multiple benchmarks show the strong performance of our MambaCSR on the compressed image super-resolution task. The code will soon be available at https://github.com/renyulin-f/MambaCSR.
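The scanning paradigm described in the abstract is easiest to picture as a reordering of image tokens before they are fed to a 1D selective-scan (SSM) block. The sketch below is a minimal NumPy illustration of what raster, vertical, and local window-based token orders look like and how they might be alternated across blocks; the function names, the window size, and the 8x8 toy grid are illustrative assumptions of mine, not code from the MambaCSR repository, and the position-aligned cross-scale scanning is not sketched here.

```python
# Illustrative sketch only: simplified token-ordering utilities inspired by
# the abstract (raster, vertical, and local window-based scans). Names and
# shapes are hypothetical; this is NOT the authors' implementation.
import numpy as np


def horizontal_scan(h: int, w: int) -> np.ndarray:
    """Row-major (left-to-right, top-to-bottom) token order."""
    return np.arange(h * w)


def vertical_scan(h: int, w: int) -> np.ndarray:
    """Column-major (top-to-bottom, left-to-right) token order."""
    return np.arange(h * w).reshape(h, w).T.reshape(-1)


def window_scan(h: int, w: int, win: int = 4) -> np.ndarray:
    """Local window-based order: raster-scan tokens inside each win x win
    window, then move on to the next window (assumes h, w divisible by win)."""
    idx = np.arange(h * w).reshape(h, w)
    order = []
    for wy in range(0, h, win):
        for wx in range(0, w, win):
            order.append(idx[wy:wy + win, wx:wx + win].reshape(-1))
    return np.concatenate(order)


if __name__ == "__main__":
    h, w = 8, 8
    tokens = np.random.randn(h * w, 16)          # 64 tokens, 16-dim features
    # "Hierarchical interleaved" idea: alternate window-based and sequential
    # orders across consecutive SSM blocks instead of running both per block.
    seq_window = tokens[window_scan(h, w)]       # order for one block
    seq_raster = tokens[horizontal_scan(h, w)]   # order for the next block
    # "Horizontal-to-vertical interleaved" idea: alternate scan directions
    # across blocks rather than scanning every direction in every block.
    seq_vertical = tokens[vertical_scan(h, w)]
    print(seq_window.shape, seq_raster.shape, seq_vertical.shape)
```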
Related papers
- V2M: Visual 2-Dimensional Mamba for Image Representation Learning [68.51380287151927]
Mamba has garnered widespread attention due to its flexible design and efficient hardware performance in processing 1D sequences.
Recent studies have attempted to apply Mamba to the visual domain by flattening 2D images into patches and then regarding them as a 1D sequence.
We propose a Visual 2-Dimensional Mamba model as a complete solution, which directly processes image tokens in the 2D space.
arXiv Detail & Related papers (2024-10-14T11:11:06Z) - Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution [42.259283231048954]
State Space Models (SSM) have shown strong representation ability in modeling long-range dependency with linear complexity.
We propose a novel Hierarchical Mamba network, namely Hi-Mamba, for image super-resolution (SR).
arXiv Detail & Related papers (2024-10-14T04:15:04Z) - MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba [0.43512163406552]
State Space Models (SSMs) with Mamba have shown great promise for long-range dependency modeling with linear complexity.
To effectively organize and construct visual features within the 2D image space through 1D selective scan, we propose a novel Multi-Head Scan (MHS) module.
The resulting sub-embeddings, obtained from the multi-head scan process, are then integrated and ultimately projected back into the high-dimensional space.
arXiv Detail & Related papers (2024-06-10T03:24:43Z) - Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study [7.334290421966221]
We investigate the impact of mainstream scanning directions and their combinations on semantic segmentation of images.
A simple, single scanning direction is deemed sufficient for semantic segmentation of high-resolution remotely sensed images.
arXiv Detail & Related papers (2024-05-14T10:36:56Z) - Super-Resolution on Rotationally Scanned Photoacoustic Microscopy Images Incorporating Scanning Prior [12.947842858489516]
Photoacoustic Microscopy (PAM) images integrating the advantages of optical contrast and acoustic resolution have been widely used in brain studies.
There exists a trade-off between scanning speed and image resolution. Compared with traditional scanning, rotational scanning provides good opportunities for fast PAM imaging by optimizing the scanning mechanism.
In this study, we propose a novel and well-performing super-resolution framework for rotational scanning-based PAM imaging.
arXiv Detail & Related papers (2023-12-12T12:41:35Z) - Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z) - Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video [111.08121952640766]
This paper presents a novel deep-learning-based solution to the rolling shutter (RS) temporal super-resolution problem.
By leveraging the multi-view geometry of the RS imaging process, our framework achieves high-framerate global shutter (GS) generation.
Our method can produce high-quality GS image sequences with rich details, outperforming the state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T16:47:12Z) - SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution [39.77987463287673]
We present an efficient paradigm for performing image colorization and super-resolution simultaneously (SCS).
The proposed method consists of two parts, including a colorization branch that learns color information with the proposed plug-and-play Pyramid Valve Cross Attention (PVCAttn) module.
Our SCSNet supports both automatic and referential modes, which makes it more flexible for practical applications.
arXiv Detail & Related papers (2022-01-12T08:59:12Z) - Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
To solve these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC).
Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real-world image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods that stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)