3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification
- URL: http://arxiv.org/abs/2404.13252v1
- Date: Sat, 20 Apr 2024 03:39:54 GMT
- Title: 3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification
- Authors: Shyam Varahagiri, Aryaman Sinha, Shiv Ram Dubey, Satish Kumar Singh
- Abstract summary: Vision Transformers (ViTs) have shown promising classification performance over Convolutional Neural Networks (CNNs).
ViTs excel with sequential data, but they cannot extract spectral-spatial information like CNNs.
We propose a 3D-Convolution guided Spectral-Spatial Transformer (3D-ConvSST) for HSI classification.
- Score: 12.729885732069926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, Vision Transformers (ViTs) have shown promising classification performance over Convolutional Neural Networks (CNNs) due to their self-attention mechanism. Many researchers have incorporated ViTs for Hyperspectral Image (HSI) classification. HSIs are characterised by narrow contiguous spectral bands, providing rich spectral data. Although ViTs excel with sequential data, they cannot extract spectral-spatial information like CNNs. Furthermore, to have high classification performance, there should be a strong interaction between the HSI token and the class (CLS) token. To solve these issues, we propose a 3D-Convolution guided Spectral-Spatial Transformer (3D-ConvSST) for HSI classification that utilizes a 3D-Convolution Guided Residual Module (CGRM) in-between encoders to "fuse" the local spatial and spectral information and to enhance the feature propagation. Furthermore, we forego the class token and instead apply Global Average Pooling, which effectively encodes more discriminative and pertinent high-level features for classification. Extensive experiments have been conducted on three public HSI datasets to show the superiority of the proposed model over state-of-the-art traditional, convolutional, and Transformer models. The code is available at https://github.com/ShyamVarahagiri/3D-ConvSST.
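The abstract's choice to drop the class token in favour of Global Average Pooling can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above); the function names `cls_readout` and `gap_readout` and the toy token values are hypothetical, and tokens are plain Python lists rather than tensors.

```python
from statistics import fmean


def cls_readout(tokens):
    """Class-token readout: return the first token, assumed here to be the CLS token."""
    return list(tokens[0])


def gap_readout(tokens):
    """Global Average Pooling: average each feature dimension over all tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    # zip(*tokens) groups the values of each feature dimension across tokens.
    return [fmean(column) for column in zip(*tokens)]


# Toy encoder output: 3 tokens with 2 features each (made-up values).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = gap_readout(tokens)  # one feature vector summarising every token
```

With a class token, the classifier sees only one position of the encoder output; with pooling, every token contributes equally to the vector passed to the classification head.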
Related papers
- Superpixel Graph Contrastive Clustering with Semantic-Invariant
Augmentations for Hyperspectral Images [64.72242126879503]
Hyperspectral image (HSI) clustering is an important but challenging task.
We first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spatial and spectral features of HSI.
We then design a superpixel graph contrastive clustering model to learn discriminative superpixel representations.
arXiv Detail & Related papers (2024-03-04T07:40:55Z)
- Hybrid Spectral Denoising Transformer with Guided Attention [34.34075175179669]
We present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising.
Our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead.
arXiv Detail & Related papers (2023-03-16T02:24:31Z)
- RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in
Autonomous Driving [80.14669385741202]
Vision transformers (ViTs) have achieved state-of-the-art results in many image-based benchmarks.
ViTs are notoriously hard to train and require a lot of training data to learn powerful representations.
We show that our method, called RangeViT, outperforms existing projection-based methods on nuScenes and Semantic KITTI.
arXiv Detail & Related papers (2023-01-24T18:50:48Z)
- Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST).
CST embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
- Learning A 3D-CNN and Transformer Prior for Hyperspectral Image
Super-Resolution [80.93870349019332]
We propose a novel HSISR method that uses Transformer instead of CNN to learn the prior of HSIs.
Specifically, we first use the gradient algorithm to solve the HSISR model, and then use an unfolding network to simulate the iterative solution processes.
arXiv Detail & Related papers (2021-11-27T15:38:57Z)
- SpectralFormer: Rethinking Hyperspectral Image Classification with
Transformers [91.09957836250209]
Hyperspectral (HS) images are characterized by approximately contiguous spectral information.
CNNs have been proven to be a powerful feature extractor in HS image classification.
We propose a novel backbone network called SpectralFormer for HS image classification.
arXiv Detail & Related papers (2021-07-07T02:59:21Z)
- Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer
Learning [67.40866334083941]
We propose an end-to-end 3-D lightweight convolutional neural network (CNN) for limited samples-based HSI classification.
Compared with conventional 3-D-CNN models, the proposed 3-D-LWNet has a deeper network structure, fewer parameters, and lower computation cost.
Our model achieves competitive performance for HSI classification compared to several state-of-the-art methods.
arXiv Detail & Related papers (2020-12-07T03:44:35Z)
- Hyperspectral Image Classification with Spatial Consistence Using Fully
Convolutional Spatial Propagation Network [9.583523548244683]
Deep convolutional neural networks (CNNs) have shown impressive ability to represent hyperspectral images (HSIs).
We propose a novel end-to-end, pixels-to-pixels fully convolutional spatial propagation network (FCSPN) for HSI classification.
FCSPN consists of a 3D fully convolution network (3D-FCN) and a convolutional spatial propagation network (CSPN).
arXiv Detail & Related papers (2020-08-04T09:05:52Z)
- A Fast 3D CNN for Hyperspectral Image Classification [0.456877715768796]
Hyperspectral imaging (HSI) has been extensively utilized for a number of real-world applications.
A 2D Convolutional Neural Network (CNN) is a viable approach, but HSI classification (HSIC) depends heavily on both spectral and spatial information.
This work proposes a 3D CNN model that utilizes joint spatial-spectral feature maps to attain good performance.
arXiv Detail & Related papers (2020-04-29T12:57:36Z)
- Hyperspectral Classification Based on 3D Asymmetric Inception Network
with Data Fusion Transfer Learning [36.05574127972413]
We first deliver a 3D asymmetric inception network, AINet, to overcome the overfitting problem.
With the emphasis on spectral signatures over spatial contexts of HSI data, AINet can convey and classify the features effectively.
arXiv Detail & Related papers (2020-02-11T06:37:34Z)