Related papers: Cross-Scan Mamba with Masked Training for Robust Spectral Imaging

URL: http://arxiv.org/abs/2408.00629v2
Date: Fri, 06 Dec 2024 23:30:10 GMT
Title: Cross-Scan Mamba with Masked Training for Robust Spectral Imaging
Authors: Wenzhe Tian, Haijin Zeng, Yin-Ping Zhao, Yongyong Chen, Zhen Wang, Xuelong Li
Abstract summary: We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
Score: 51.557804095896174
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Snapshot Compressive Imaging (SCI) enables fast spectral imaging but requires effective decoding algorithms for hyperspectral image (HSI) reconstruction from compressed measurements. Current CNN-based methods are limited in modeling long-range dependencies, while Transformer-based models face high computational complexity. Although recent Mamba models outperform CNNs and Transformers in RGB tasks concerning computational efficiency or accuracy, they are not specifically optimized to fully leverage the local spatial and spectral correlations inherent in HSIs. To address this, we propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promotion. Besides, while current reconstruction algorithms perform increasingly well in simulation scenarios, they exhibit suboptimal performance on real data due to limited generalization capability. During the training process, the model may not capture the inherent features of the images but rather learn the parameters to mitigate specific noise and loss, which may lead to a decline in reconstruction quality when faced with real scenes. To overcome this challenge, we propose a masked training method to enhance the generalization ability of models. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
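For readers unfamiliar with the "compressed measurements" mentioned in the abstract, the following is a minimal sketch of a simplified coded-aperture SCI forward model as it is commonly described in the CASSI literature. It is an illustration only and is not taken from the CS-Mamba paper: the function name `cassi_forward`, the `step` parameter, and the toy array sizes are all assumptions. Each spectral band is modulated by a 2D mask, shifted along the width axis by a band-dependent offset, and summed into one 2D snapshot.

```python
import numpy as np

def cassi_forward(cube, mask, step=2):
    """Simulate one 2D snapshot from a 3D cube of shape (H, W, B).

    Hypothetical helper: `step` is the assumed dispersion shift
    (in pixels) between neighbouring spectral bands.
    """
    H, W, B = cube.shape
    measurement = np.zeros((H, W + step * (B - 1)), dtype=cube.dtype)
    for b in range(B):
        # Modulate band b with the mask, then place it b*step columns to the right.
        measurement[:, b * step : b * step + W] += cube[:, :, b] * mask
    return measurement

# Toy example: an 8x8 scene with 4 spectral bands and a random binary mask.
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4))
mask = (rng.random((8, 8)) > 0.5).astype(cube.dtype)
y = cassi_forward(cube, mask)
print(y.shape)  # (8, 14)
```

Reconstruction methods such as those listed on this page try to invert this many-to-one mapping, recovering the full cube from the single snapshot `y` and the known mask.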

Related papers

Laplace-Mamba: Laplace Frequency Prior-Guided Mamba-CNN Fusion Network for Image Dehazing [25.05616740190157]
We introduce Laplace-Mamba, a novel framework that integrates Laplace frequency prior with a hybrid Mamba-CNN architecture for efficient image dehazing. Our method outperforms state-of-the-art approaches in both restoration quality and efficiency.
arXiv Detail & Related papers (2025-07-01T07:15:26Z)
FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem. Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources. We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z)
MambaStyle: Efficient StyleGAN Inversion for Real Image Editing with State-Space Models [60.110274007388135]
MambaStyle is an efficient single-stage encoder-based approach for GAN inversion and editing. We show that MambaStyle achieves a superior balance among inversion accuracy, editing quality, and computational efficiency.
arXiv Detail & Related papers (2025-05-06T20:03:47Z)
MambaIC: State Space Models for High-Performance Learned Image Compression [53.991726013454695]
A high-performance image compression algorithm is crucial for real-time information transmission across numerous fields. Inspired by the effectiveness of state space models (SSMs) in capturing long-range dependencies, we leverage SSMs to address computational inefficiency in existing methods. We propose an enhanced image compression approach through refined context modeling, which we term MambaIC.
arXiv Detail & Related papers (2025-03-16T11:32:34Z)
Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging [40.80197280147993]
We propose a Mamba-inspired Joint Unfolding Network (MiJUN) to overcome the inherent nonlinear and ill-posed characteristics of HSI reconstruction. We introduce an accelerated unfolding network scheme, which reduces the reliance on initial optimization stages. We refine the scanning strategy with Mamba by integrating the tensor mode-$k$ unfolding into the Mamba network.
arXiv Detail & Related papers (2025-01-02T13:56:23Z)
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance [2.45239928345171]
We introduce MAL (Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance), a novel framework that enhances xLSTM's capabilities through innovative pretraining strategies. We propose a cluster-masked masking method that significantly improves local feature capture and optimizes image scanning efficiency. Our universal encoder-decoder pretraining approach integrates multiple tasks, including image autoregression, depth estimation, and image segmentation, thereby enhancing the model's adaptability and robustness across diverse visual tasks.
arXiv Detail & Related papers (2024-12-14T07:58:24Z)
Coarse-Fine Spectral-Aware Deformable Convolution For Hyperspectral Image Reconstruction [15.537910100051866]
We study the inverse problem of Coded Aperture Snapshot Spectral Imaging (CASSI). We propose the Coarse-Fine Spectral-Aware Deformable Convolution Network (CFSDCN). Our CFSDCN significantly outperforms previous state-of-the-art (SOTA) methods on both simulated and real HSI datasets.
arXiv Detail & Related papers (2024-06-18T15:15:12Z)
Scalable Visual State Space Model with Fractal Scanning [16.077348474371547]
State Space Models (SSMs) have emerged as efficient alternatives to Transformer models. We propose using fractal scanning curves for patch serialization. We validate our method in image classification, detection, and segmentation tasks.
arXiv Detail & Related papers (2024-05-23T12:12:11Z)
Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification [4.389334324926174]
This study introduces the innovative Mamba-in-Mamba (MiM) architecture for HSI classification, the first attempt of deploying State Space Model (SSM) in this task. MiM model includes 1) A novel centralized Mamba-Cross-Scan (MCS) mechanism for transforming images into sequence-data, 2) A Tokenized Mamba (T-Mamba) encoder, and 3) A Weighted MCS Fusion (WMF) module. Experimental results from three public HSI datasets demonstrate that our method outperforms existing baselines and state-of-the-art approaches.
arXiv Detail & Related papers (2024-05-20T13:19:02Z)
SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising [13.1240990099267]
We introduce a memory-efficient spatial-spectral UMamba (SSUMamba) for HSI denoising. Mamba is known for its remarkable long-range dependency modeling capabilities. SSUMamba achieves superior denoising results with lower memory consumption per batch compared to transformer-based methods.
arXiv Detail & Related papers (2024-05-02T20:44:26Z)
Physics-Inspired Degradation Models for Hyperspectral Image Fusion [61.743696362028246]
Most fusion methods solely focus on the fusion algorithm itself and overlook the degradation models. We propose physics-inspired degradation models (PIDM) to model the degradation of LR-HSI and HR-MSI. Our proposed PIDM can boost the fusion performance of existing fusion methods in practical scenarios.
arXiv Detail & Related papers (2024-02-04T09:07:28Z)
Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components. CNNs are used to augment the local texture information of coarse priors. DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
Unsupervised Hyperspectral and Multispectral Images Fusion Based on the Cycle Consistency [21.233354336608205]
We propose an unsupervised HSI and MSI fusion model based on the cycle consistency, called CycFusion. CycFusion learns the domain transformation between low spatial resolution HSI (LrHSI) and high spatial resolution MSI (HrMSI). Experiments conducted on several datasets show that our proposed model outperforms all compared unsupervised fusion methods.
arXiv Detail & Related papers (2023-07-07T06:47:15Z)
Exploring Effective Mask Sampling Modeling for Neural Image Compression [171.35596121939238]
Most existing neural image compression methods rely on side information from hyperprior or context models to eliminate spatial redundancy. Inspired by the mask sampling modeling in recent self-supervised learning methods for natural language processing and high-level vision, we propose a novel pretraining strategy for neural image compression. Our method achieves competitive performance with lower computational complexity compared to state-of-the-art image compression methods.
arXiv Detail & Related papers (2023-06-09T06:50:20Z)
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising [64.11157141177208]
We propose a spectral enhanced rectangle Transformer to model the spatial and spectral correlation in hyperspectral images. For the former, we exploit the rectangle self-attention horizontally and vertically to capture the non-local similarity in the spatial domain. For the latter, we design a spectral enhancement module that is capable of extracting global underlying low-rank property of spatial-spectral cubes to suppress noise.
arXiv Detail & Related papers (2023-04-03T09:42:13Z)
Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST). CST embeds HSI sparsity into deep learning for HSI reconstruction. In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
Calibrated Hyperspectral Image Reconstruction via Graph-based Self-Tuning Network [40.71031760929464]
Hyperspectral imaging (HSI) has attracted increasing research attention, especially for the ones based on a coded aperture snapshot spectral imaging (CASSI) system. Existing deep HSI reconstruction models are generally trained on paired data to retrieve original signals upon 2D compressed measurements given by a particular optical hardware mask in CASSI. This mask-specific training style will lead to a hardware miscalibration issue, which sets up barriers to deploying deep HSI models among different hardware and noisy environments. We propose a novel Graph-based Self-Tuning (GST) network to reason uncertainties adapting to varying spatial structures of masks among
arXiv Detail & Related papers (2021-12-31T09:39:13Z)
Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction [127.20208645280438]
Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement. Modeling the inter-spectra interactions is beneficial for HSI reconstruction. Mask-guided Spectral-wise Transformer (MST) proposes a novel framework for HSI reconstruction.
arXiv Detail & Related papers (2021-11-15T16:59:48Z)
Adaptive Gradient Balancing for Undersampled MRI Reconstruction and Image-to-Image Translation [60.663499381212425]
We enhance the image quality by using a Wasserstein Generative Adversarial Network combined with a novel Adaptive Gradient Balancing technique. In MRI, our method minimizes artifacts, while maintaining a high-quality reconstruction that produces sharper images than other techniques.
arXiv Detail & Related papers (2021-04-05T13:05:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.