RGB Pre-Training Enhanced Unobservable Feature Latent Diffusion Model for Spectral Reconstruction
- URL: http://arxiv.org/abs/2507.12967v1
- Date: Thu, 17 Jul 2025 10:07:32 GMT
- Title: RGB Pre-Training Enhanced Unobservable Feature Latent Diffusion Model for Spectral Reconstruction
- Authors: Keli Deng, Jie Nie, Yuntao Qian
- Abstract summary: We propose a two-stage pipeline consisting of spectral structure representation learning and spectral-spatial joint distribution learning. In the first stage, a spectral unobservable feature autoencoder (SpeUAE) is trained to extract and compress the unobservable feature into a 3D manifold aligned with RGB space. The ULDM is then acquired to model the distribution of the coded unobservable feature with guidance from the corresponding RGB images.
- Score: 16.54284634377436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spectral reconstruction (SR) is a crucial problem in image processing that requires reconstructing hyperspectral images (HSIs) from the corresponding RGB images. A key difficulty in SR is estimating the unobservable feature, which encapsulates significant spectral information not captured by RGB imaging sensors. The solution lies in effectively constructing the spectral-spatial joint distribution conditioned on the RGB image to complement the unobservable feature. Since HSIs share a similar spatial structure with the corresponding RGB images, it is rational to capitalize on the rich spatial knowledge in RGB pre-trained models for spectral-spatial joint distribution learning. To this end, we extend the RGB pre-trained latent diffusion model (RGB-LDM) to an unobservable feature LDM (ULDM) for SR. As the RGB-LDM and its corresponding spatial autoencoder (SpaAE) already excel in spatial knowledge, the ULDM can focus on modeling spectral structure. Moreover, separating the unobservable feature from the HSI reduces the redundant spectral information and empowers the ULDM to learn the joint distribution in a compact latent space. Specifically, we propose a two-stage pipeline consisting of spectral structure representation learning and spectral-spatial joint distribution learning to transform the RGB-LDM into the ULDM. In the first stage, a spectral unobservable feature autoencoder (SpeUAE) is trained to extract and compress the unobservable feature into a 3D manifold aligned with RGB space. In the second stage, the spectral and spatial structures are sequentially encoded by the SpeUAE and the SpaAE, respectively. The ULDM is then acquired to model the distribution of the coded unobservable feature with guidance from the corresponding RGB images. Experimental results on SR and downstream relighting tasks demonstrate that our proposed method achieves state-of-the-art performance.
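The abstract does not spell out how the unobservable feature is separated from the HSI. As a minimal sketch of the idea, assuming a linear camera model in which an RGB pixel is the product of a 3 x C spectral response matrix and a C-band spectrum, one common way to split a spectrum is to project it onto the row space of the response matrix (the part the sensor sees) and keep the remainder (the part it cannot see). The function name `split_spectrum`, the random response matrix `srf`, and the band count of 31 are all hypothetical; this is not the authors' SpeUAE, only an illustration of what "unobservable" can mean.

```python
import numpy as np

def split_spectrum(hsi, srf):
    """Split spectra into RGB-observable and RGB-unobservable parts.

    hsi: (N, C) array, one spectrum per row.
    srf: (3, C) array, a hypothetical camera spectral response.
    """
    # Orthogonal projector onto the row space of srf, shape (C, C).
    proj = np.linalg.pinv(srf) @ srf
    # proj is symmetric, so right-multiplying rows applies it per pixel.
    observable = hsi @ proj
    # Whatever is left lies in the null space of srf.
    unobservable = hsi - observable
    return observable, unobservable

rng = np.random.default_rng(0)
C = 31                          # assumed number of spectral bands
srf = rng.random((3, C))        # stand-in for a real camera response
hsi = rng.random((5, C))        # five synthetic spectra
obs, unobs = split_spectrum(hsi, srf)
rgb = hsi @ srf.T               # simulated RGB values, shape (5, 3)
```

Under this assumption the unobservable part contributes nothing to the RGB values, which is why it has to be inferred from a learned prior rather than read off the image.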
Related papers
- End-to-End RGB-IR Joint Image Compression With Channel-wise Cross-modality Entropy Model [39.52468600966148]
As the number of modalities increases, the required data storage and transmission costs also double. This work proposes a joint compression framework for RGB-IR image pairs.
arXiv Detail & Related papers (2025-06-27T02:04:21Z)
- CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis [75.25966323298003]
Spectral imaging offers promising applications across diverse domains, including medicine and urban scene understanding. Variability in channel dimensionality and captured wavelengths among spectral cameras impedes the development of AI-driven methodologies. We introduce CARL, a model for Camera-Agnostic Representation Learning across RGB, multispectral, and hyperspectral imaging modalities.
arXiv Detail & Related papers (2025-04-27T13:06:40Z)
- Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images. For the correlation of local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module. For the continuity of global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z)
- Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution [54.293362972473595]
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts.
Current approaches to address SR tasks are either dedicated to extracting RGB image features or assuming similar degradation patterns.
We propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.
arXiv Detail & Related papers (2024-11-19T14:24:03Z)
- EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution [15.459253235077375]
Single hyperspectral image super-resolution (single-HSI-SR) aims to improve the resolution of a single input low-resolution HSI. Due to the bottleneck of data scarcity, the development of single-HSI-SR lags far behind that of RGB natural images.
arXiv Detail & Related papers (2024-09-06T06:46:01Z)
- Residual Spatial Fusion Network for RGB-Thermal Semantic Segmentation [19.41334573257174]
Traditional methods mostly use RGB images, which are heavily affected by lighting conditions, e.g., darkness.
Recent studies show thermal images are robust to the night scenario as a compensating modality for segmentation.
This work proposes a Residual Spatial Fusion Network (RSFNet) for RGB-T semantic segmentation.
arXiv Detail & Related papers (2023-06-17T14:28:08Z)
- Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z)
- Continuous Spectral Reconstruction from RGB Images via Implicit Neural Representation [43.622087181097164]
Existing methods for spectral reconstruction usually learn a discrete mapping from RGB images to a number of spectral bands.
We propose Neural Spectral Reconstruction (NeSR) to lift this limitation, by introducing a novel continuous spectral representation.
NeSR extends the flexibility of spectral reconstruction by enabling an arbitrary number of spectral bands as the target output.
arXiv Detail & Related papers (2021-12-24T09:08:23Z)
- Tuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB [84.1657998542458]
It has been proven that the reconstruction accuracy relies heavily on the spectral response of the RGB camera in use.
This paper explores the filter-array based color imaging mechanism of existing RGB cameras, and proposes to design the IR-cut filter properly for improved spectral recovery.
arXiv Detail & Related papers (2021-03-26T19:42:21Z)
- Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery [79.69449412334188]
In this paper, we investigate how to adapt state-of-the-art residual learning based single gray/RGB image super-resolution approaches.
We introduce a spatial-spectral prior network (SSPN) to fully exploit the spatial information and the correlation between the spectra of the hyperspectral data.
Experimental results on some hyperspectral images demonstrate that the proposed SSPSR method enhances the details of the recovered high-resolution hyperspectral images.
arXiv Detail & Related papers (2020-05-18T14:25:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.