SpectralFormer: Rethinking Hyperspectral Image Classification with
Transformers
- URL: http://arxiv.org/abs/2107.02988v1
- Date: Wed, 7 Jul 2021 02:59:21 GMT
- Title: SpectralFormer: Rethinking Hyperspectral Image Classification with
Transformers
- Authors: Danfeng Hong and Zhu Han and Jing Yao and Lianru Gao and Bing Zhang
and Antonio Plaza and Jocelyn Chanussot
- Abstract summary: Hyperspectral (HS) images are characterized by approximately contiguous spectral information.
CNNs have been proven to be a powerful feature extractor in HS image classification.
We propose a novel backbone network called ulSpectralFormer for HS image classification.
- Score: 91.09957836250209
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperspectral (HS) images are characterized by approximately contiguous
spectral information, enabling the fine identification of materials by
capturing subtle spectral discrepancies. Owing to their excellent locally
contextual modeling ability, convolutional neural networks (CNNs) have been
proven to be a powerful feature extractor in HS image classification. However,
CNNs fail to mine and represent the sequence attributes of spectral signatures
well due to the limitations of their inherent network backbone. To solve this
issue, we rethink HS image classification from a sequential perspective with
transformers, and propose a novel backbone network called \ul{SpectralFormer}.
Beyond band-wise representations in classic transformers, SpectralFormer is
capable of learning spectrally local sequence information from neighboring
bands of HS images, yielding group-wise spectral embeddings. More
significantly, to reduce the possibility of losing valuable information in the
layer-wise propagation process, we devise a cross-layer skip connection to
convey memory-like components from shallow to deep layers by adaptively
learning to fuse "soft" residuals across layers. It is worth noting that the
proposed SpectralFormer is a highly flexible backbone network, which can be
applicable to both pixel- and patch-wise inputs. We evaluate the classification
performance of the proposed SpectralFormer on three HS datasets by conducting
extensive experiments, showing the superiority over classic transformers and
achieving a significant improvement in comparison with state-of-the-art
backbone networks. The codes of this work will be available at
\url{https://sites.google.com/view/danfeng-hong} for the sake of
reproducibility.
Related papers
- Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken [15.426635239291729]
We introduce the Dual-stage Spectral Supertoken (DSTC), inspired by superpixel concepts.
DSTC employs spectrum-derivative-based pixel clustering to group pixels with similar spectral characteristics into spectral supertokens.
We also propose a class-proportion-based soft label, which adaptively assigns weights to different categories based on their prevalence.
arXiv Detail & Related papers (2024-07-10T01:58:30Z) - 3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification [12.729885732069926]
Vision Transformers (ViTs) have shown promising classification performance over Convolutional Neural Networks (CNNs)
ViTs excel with sequential data, but they cannot extract spectral-spatial information like CNNs.
We propose a 3D-Convolution guided Spectral-Spatial Transformer (3D-ConvSST) for HSI classification.
arXiv Detail & Related papers (2024-04-20T03:39:54Z) - DiffSpectralNet : Unveiling the Potential of Diffusion Models for
Hyperspectral Image Classification [6.521187080027966]
We propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques.
First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features.
The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification.
arXiv Detail & Related papers (2023-10-29T15:26:37Z) - Dynamic Spectrum Mixer for Visual Recognition [17.180863898764194]
We propose a content-adaptive yet computationally efficient structure, dubbed Dynamic Spectrum Mixer (DSM)
DSM represents token interactions in the frequency domain by employing the Cosine Transform.
It can learn long-term spatial dependencies with log-linear complexity.
arXiv Detail & Related papers (2023-09-13T04:51:15Z) - DCN-T: Dual Context Network with Transformer for Hyperspectral Image
Classification [109.09061514799413]
Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.
We propose a tri-spectral image generation pipeline that transforms HSI into high-quality tri-spectral images.
Our proposed method outperforms state-of-the-art methods for HSI classification.
arXiv Detail & Related papers (2023-04-19T18:32:52Z) - A heterogeneous group CNN for image super-resolution [127.2132400582117]
Convolutional neural networks (CNNs) have obtained remarkable performance via deep architectures.
We present a heterogeneous group SR CNN (HGSRCNN) via leveraging structure information of different types to obtain a high-quality image.
arXiv Detail & Related papers (2022-09-26T04:14:59Z) - On Improving the Performance of Glitch Classification for Gravitational
Wave Detection by using Generative Adversarial Networks [0.0]
We propose a framework to improve the classification performance by using Generative Adversarial Networks (GANs)
We show that the proposed method can provide an alternative to transfer learning for the classification of spectrograms using deep networks.
arXiv Detail & Related papers (2022-07-08T16:35:17Z) - Less is More: Pay Less Attention in Vision Transformers [61.05787583247392]
Less attention vIsion Transformer builds upon the fact that convolutions, fully-connected layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.
The proposed LIT achieves promising performance on image recognition tasks, including image classification, object detection and instance segmentation.
arXiv Detail & Related papers (2021-05-29T05:26:07Z) - Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed as Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z) - Hyperspectral Image Super-resolution via Deep Progressive Zero-centric
Residual Learning [62.52242684874278]
Cross-modality distribution of spatial and spectral information makes the problem challenging.
We propose a novel textitlightweight deep neural network-based framework, namely PZRes-Net.
Our framework learns a high resolution and textitzero-centric residual image, which contains high-frequency spatial details of the scene.
arXiv Detail & Related papers (2020-06-18T06:32:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.