MCGA: Mixture of Codebooks Hyperspectral Reconstruction via Grayscale-Aware Attention
- URL: http://arxiv.org/abs/2507.09885v1
- Date: Mon, 14 Jul 2025 03:46:06 GMT
- Title: MCGA: Mixture of Codebooks Hyperspectral Reconstruction via Grayscale-Aware Attention
- Authors: Zhanjiang Yang, Lijun Sun, Jiawei Dong, Xiaoxin An, Yang Liu, Meng Li,
- Abstract summary: We propose a two-stage approach, MCGA, which first learns spectral patterns before estimating the mapping.<n>In the first stage, a multi-scale VQ-VAE learns representations from heterogeneous HSI datasets, extracting a Mixture of Codebooks (MoC)<n>In the second stage, the RGB-to-HSI mapping is refined by querying features from the MoC to replace latent HSI representations.
- Score: 19.156831096843284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing hyperspectral images (HSI) from RGB images is a cost-effective solution for various vision-based applications. However, most existing learning-based hyperspectral reconstruction methods directly learn the RGB-to-HSI mapping using complex attention mechanisms, neglecting the inherent challenge of transitioning from low-dimensional to high-dimensional information. To address this limitation, we propose a two-stage approach, MCGA, which first learns spectral patterns before estimating the mapping. In the first stage, a multi-scale VQ-VAE learns representations from heterogeneous HSI datasets, extracting a Mixture of Codebooks (MoC). In the second stage, the RGB-to-HSI mapping is refined by querying features from the MoC to replace latent HSI representations, incorporating prior knowledge rather than forcing a direct high-dimensional transformation. To further enhance reconstruction quality, we introduce Grayscale-Aware Attention and Quantized Self-Attention, which adaptively adjust feature map intensities to meet hyperspectral reconstruction requirements. This physically motivated attention mechanism ensures lightweight and efficient HSI recovery. Moreover, we propose an entropy-based Test-Time Adaptation strategy to improve robustness in real-world scenarios. Extensive experiments demonstrate that our method, MCGA, achieves state-of-the-art performance. The code and models will be released at https://github.com/Fibonaccirabbit/MCGA
Related papers
- Compressive Imaging Reconstruction via Tensor Decomposed Multi-Resolution Grid Encoding [50.54887630778593]
Compressive imaging (CI) reconstruction aims to recover high-dimensional images from low-dimensional measurements compressed.<n>Existing unsupervised representations may struggle to achieve a desired balance between representation ability and efficiency.<n>We propose Decomposed multi-resolution Grid encoding (GridTD), an unsupervised continuous representation framework for CI reconstruction.
arXiv Detail & Related papers (2025-07-10T12:36:20Z) - Physical Degradation Model-Guided Interferometric Hyperspectral Reconstruction with Unfolding Transformer [10.761506243784744]
Interferometric Hyperspectral Imaging (IHI) is a critical technique for large-scale remote sensing tasks.<n>IHI is susceptible to complex errors arising from imaging steps, and its quality is limited by existing signal processing-based reconstruction algorithms.<n>We propose a novel IHI reconstruction pipeline to address two key challenges: the lack of training datasets and the difficulty in eliminating IHI-specific degradation components.
arXiv Detail & Related papers (2025-06-27T03:36:00Z) - Mixed-granularity Implicit Representation for Continuous Hyperspectral Compressive Reconstruction [16.975538181162616]
This study introduces a novel method using implicit neural representation for continuous hyperspectral image reconstruction.<n>By leveraging implicit neural representations, the MGIR framework enables reconstruction at any desired spatial-spectral resolution.
arXiv Detail & Related papers (2025-03-17T03:37:42Z) - Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images.<n>For the correlation of local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module.<n>For the continuity of global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z) - Super-Resolution for Remote Sensing Imagery via the Coupling of a Variational Model and Deep Learning [20.697932997351813]
gradient-guided multi-frame super-resolution (MFSR) framework for remote sensing imagery reconstruction.<n>We propose a novel gradient-guided multi-frame super-resolution (MFSR) framework for remote sensing imagery reconstruction.
arXiv Detail & Related papers (2024-12-13T04:19:48Z) - ESSAformer: Efficient Transformer for Hyperspectral Image
Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Degradation-Aware Unfolding Half-Shuffle Transformer for Spectral
Compressive Imaging [142.11622043078867]
We propose a principled Degradation-Aware Unfolding Framework (DAUF) that estimates parameters from the compressed image and physical mask, and then uses these parameters to control each iteration.
By plugging HST into DAUF, we establish the first Transformer-based deep unfolding method, Degradation-Aware Unfolding Half-Shuffle Transformer (DAUHST) for HSI reconstruction.
arXiv Detail & Related papers (2022-05-20T11:37:44Z) - HDNet: High-resolution Dual-domain Learning for Spectral Compressive
Imaging [138.04956118993934]
We propose a high-resolution dual-domain learning network (HDNet) for HSI reconstruction.
On the one hand, the proposed HR spatial-spectral attention module with its efficient feature fusion provides continuous and fine pixel-level features.
On the other hand, frequency domain learning (FDL) is introduced for HSI reconstruction to narrow the frequency domain discrepancy.
arXiv Detail & Related papers (2022-03-04T06:37:45Z) - Spectral Compressive Imaging Reconstruction Using Convolution and
Contextual Transformer [6.929652454131988]
We propose a hybrid network module, namely CCoT (Contextual Transformer) block, which can acquire the inductive bias ability of transformer simultaneously.
We integrate the proposed CCoT block into deep unfolding framework based on the generalized alternating projection algorithm, and further propose the GAP-CT network.
arXiv Detail & Related papers (2022-01-15T06:30:03Z) - Calibrated Hyperspectral Image Reconstruction via Graph-based
Self-Tuning Network [40.71031760929464]
Hyperspectral imaging (HSI) has attracted increasing research attention, especially for the ones based on a coded snapshot spectral imaging (CASSI) system.
Existing deep HSI reconstruction models are generally trained on paired data to retrieve original signals upon 2D compressed measurements given by a particular optical hardware mask in CASSI.
This mask-specific training style will lead to a hardware miscalibration issue, which sets up barriers to deploying deep HSI models among different hardware and noisy environments.
We propose a novel Graph-based Self-Tuning ( GST) network to reason uncertainties adapting to varying spatial structures of masks among
arXiv Detail & Related papers (2021-12-31T09:39:13Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral
Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.