SpecSwin3D: Generating Hyperspectral Imagery from Multispectral Data via Transformer Networks
- URL: http://arxiv.org/abs/2509.06122v1
- Date: Sun, 07 Sep 2025 16:18:31 GMT
- Title: SpecSwin3D: Generating Hyperspectral Imagery from Multispectral Data via Transformer Networks
- Authors: Tang Sui, Songxi Yang, Qunying Huang,
- Abstract summary: We propose SpecSwin3D, a transformer-based model that generates hyperspectral imagery from multispectral inputs.<n>Specifically, SpecSwin3D takes five multispectral bands as input and reconstructs 224 hyperspectral bands at the same spatial resolution.<n>We show that our model achieves a PSNR of 35.82 dB, SAM of 2.40deg, and SSIM of 0.96, outperforming the baseline MHF-Net by +5.6 dB in PSNR and reducing ERGAS by more than half.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multispectral and hyperspectral imagery are widely used in agriculture, environmental monitoring, and urban planning due to their complementary spatial and spectral characteristics. A fundamental trade-off persists: multispectral imagery offers high spatial but limited spectral resolution, while hyperspectral imagery provides rich spectra at lower spatial resolution. Prior hyperspectral generation approaches (e.g., pan-sharpening variants, matrix factorization, CNNs) often struggle to jointly preserve spatial detail and spectral fidelity. In response, we propose SpecSwin3D, a transformer-based model that generates hyperspectral imagery from multispectral inputs while preserving both spatial and spectral quality. Specifically, SpecSwin3D takes five multispectral bands as input and reconstructs 224 hyperspectral bands at the same spatial resolution. In addition, we observe that reconstruction errors grow for hyperspectral bands spectrally distant from the input bands. To address this, we introduce a cascade training strategy that progressively expands the spectral range to stabilize learning and improve fidelity. Moreover, we design an optimized band sequence that strategically repeats and orders the five selected multispectral bands to better capture pairwise relations within a 3D shifted-window transformer framework. Quantitatively, our model achieves a PSNR of 35.82 dB, SAM of 2.40{\deg}, and SSIM of 0.96, outperforming the baseline MHF-Net by +5.6 dB in PSNR and reducing ERGAS by more than half. Beyond reconstruction, we further demonstrate the practical value of SpecSwin3D on two downstream tasks, including land use classification and burnt area segmentation.
Related papers
- Spatial-Spectral Binarized Neural Network for Panchromatic and Multi-spectral Images Fusion [25.15016853820625]
Deep learning models have achieved excellent performance, but they often come with high computational complexity.<n>In this paper, we explore the feasibility of applying the binary neural network (BNN) to pan-sharpening.<n>A series of S2B-Conv form a brand-new binary network for pan-sharpening, dubbed as S2BNet.
arXiv Detail & Related papers (2025-09-27T14:10:51Z) - Hybrid Deep Learning for Hyperspectral Single Image Super-Resolution [9.467214671383875]
We introduce Spectral-Spatial Unmixing Fusion (SSUF), a novel module that can be seamlessly integrated into standard 2D convolutional architectures.<n>The SSUF combines spectral unmixing with spectral--spatial feature extraction and guides a ResNet-based convolutional neural network for improved reconstruction.<n> Experiments on three public remote sensing hyperspectral datasets demonstrate that the proposed hybrid deep learning model achieves competitive performance.
arXiv Detail & Related papers (2025-09-26T08:28:07Z) - HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder [0.04332259966721321]
Hyperspectral imagery provides rich spectral detail but poses unique challenges because of its high dimensionality in both spatial and spectral domains.<n>We propose textitHyperspectralMAE, a Transformer-based model for hyperspectral data that employs a textitdual masking strategy.<n>HyperspectralMAE achieves state-of-the-art transfer-learning accuracy on Indian Pines, confirming that masked dual-dimensional pre-training yields robust spectral-spatial representations.
arXiv Detail & Related papers (2025-05-09T01:16:42Z) - Hyperspectral Vision Transformers for Greenhouse Gas Estimations from Space [6.559887989252469]
This study proposes a spectral transformer model that synthesizes hyperspectral data from multispectral inputs.<n>The model is pre-trained via a band-wise masked auto-encoder and subsequently fine-tuned on aligned multispectral-hyperspectral image pairs.<n>The resulting synthetic hyperspectral data retain the spatial and temporal benefits of multispectral imagery and improve GHG prediction accuracy.
arXiv Detail & Related papers (2025-04-23T16:19:42Z) - CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs [65.80187860906115]
We propose a novel approach to improve NeRF's performance with sparse inputs.
We first adopt a voxel-based ray sampling strategy to ensure that the sampled rays intersect with a certain voxel in 3D space.
We then randomly sample additional points within the voxel and apply a Transformer to infer the properties of other points on each ray, which are then incorporated into the volume rendering.
arXiv Detail & Related papers (2024-03-25T15:56:17Z) - Hyperspectral Neural Radiance Fields [11.485829401765521]
We propose a hyperspectral 3D reconstruction using Neural Radiance Fields (NeRFs)
NeRFs have seen widespread success in creating high quality volumetric 3D representations of scenes captured by a variety of camera models.
We show that our hyperspectral NeRF approach enables creating fast, accurate volumetric 3D hyperspectral scenes.
arXiv Detail & Related papers (2024-03-21T21:18:08Z) - Cross-Scope Spatial-Spectral Information Aggregation for Hyperspectral
Image Super-Resolution [47.12985199570964]
We propose a novel cross-scope spatial-spectral Transformer (CST) to investigate long-range spatial and spectral similarities for single hyperspectral image super-resolution.
Specifically, we devise cross-attention mechanisms in spatial and spectral dimensions to comprehensively model the long-range spatial-spectral characteristics.
Experiments over three hyperspectral datasets demonstrate that the proposed CST is superior to other state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2023-11-29T03:38:56Z) - SpectralGPT: Spectral Remote Sensing Foundation Model [60.023956954916414]
A universal RS foundation model, named SpectralGPT, is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT)
Compared to existing foundation models, SpectralGPT accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS big data.
Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS big data applications within the field of geoscience.
arXiv Detail & Related papers (2023-11-13T07:09:30Z) - S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction [59.39343894089959]
A snapshot compressive imager (CASSI) with Transformer reconstruction backend remarks high-fidelity sensing performance.<n> dominant spatial and spectral attention designs show limitations in hyperspectral modeling.<n>We propose a spatial-spectral (S2-) Transformer implemented by a paralleled attention design and a mask-aware learning strategy.
arXiv Detail & Related papers (2022-09-24T19:26:46Z) - Spectral Splitting and Aggregation Network for Hyperspectral Face
Super-Resolution [82.59267937569213]
High-resolution (HR) hyperspectral face image plays an important role in face related computer vision tasks under uncontrolled conditions.
In this paper, we investigate how to adapt the deep learning techniques to hyperspectral face image super-resolution.
We present a spectral splitting and aggregation network (SSANet) for HFSR with limited training samples.
arXiv Detail & Related papers (2021-08-31T02:13:00Z) - Hyperspectral Image Super-resolution via Deep Progressive Zero-centric
Residual Learning [62.52242684874278]
Cross-modality distribution of spatial and spectral information makes the problem challenging.
We propose a novel textitlightweight deep neural network-based framework, namely PZRes-Net.
Our framework learns a high resolution and textitzero-centric residual image, which contains high-frequency spatial details of the scene.
arXiv Detail & Related papers (2020-06-18T06:32:11Z) - Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral
Imagery [79.69449412334188]
In this paper, we investigate how to adapt state-of-the-art residual learning based single gray/RGB image super-resolution approaches.
We introduce a spatial-spectral prior network (SSPN) to fully exploit the spatial information and the correlation between the spectra of the hyperspectral data.
Experimental results on some hyperspectral images demonstrate that the proposed SSPSR method enhances the details of the recovered high-resolution hyperspectral images.
arXiv Detail & Related papers (2020-05-18T14:25:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.