Boosting Hyperspectral Image Classification with Gate-Shift-Fuse Mechanisms in a Novel CNN-Transformer Approach
- URL: http://arxiv.org/abs/2406.14120v3
- Date: Wed, 02 Oct 2024 19:00:18 GMT
- Title: Boosting Hyperspectral Image Classification with Gate-Shift-Fuse Mechanisms in a Novel CNN-Transformer Approach
- Authors: Mohamed Fadhlallah Guerri, Cosimo Distante, Paolo Spagnolo, Fares Bougourzi, Abdelmalik Taleb-Ahmed,
- Abstract summary: This paper introduces an HSI classification model that includes two convolutional blocks, a Gate-Shift-Fuse (GSF) block and a transformer block.
The GSF block is designed to strengthen the extraction of local and global spatial-spectral features.
An effective attention mechanism module is also proposed to enhance the extraction of information from HSI cubes.
- Score: 8.982950112225264
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: During the process of classifying Hyperspectral Image (HSI), every pixel sample is categorized under a land-cover type. CNN-based techniques for HSI classification have notably advanced the field by their adept feature representation capabilities. However, acquiring deep features remains a challenge for these CNN-based methods. In contrast, transformer models are adept at extracting high-level semantic features, offering a complementary strength. This paper's main contribution is the introduction of an HSI classification model that includes two convolutional blocks, a Gate-Shift-Fuse (GSF) block and a transformer block. This model leverages the strengths of CNNs in local feature extraction and transformers in long-range context modelling. The GSF block is designed to strengthen the extraction of local and global spatial-spectral features. An effective attention mechanism module is also proposed to enhance the extraction of information from HSI cubes. The proposed method is evaluated on four well-known datasets (the Indian Pines, Pavia University, WHU-WHU-Hi-LongKou and WHU-Hi-HanChuan), demonstrating that the proposed framework achieves superior results compared to other models.
Related papers
- FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system.
We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain.
To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB)
We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z) - Dual Selective Fusion Transformer Network for Hyperspectral Image Classification [34.7051033596479]
Transformer has achieved satisfactory results in the field of hyperspectral image (HSI) classification.
Existing Transformer models face two key challenges when dealing with HSI scenes characterized by diverse land cover types and rich spectral information.
We propose a novel Dual Selective Fusion Transformer Network (DSFormer) for HSI classification.
arXiv Detail & Related papers (2024-10-04T06:05:26Z) - Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification [27.04370747400184]
This paper introduces the Spatial-Spectral Morphological Mamba (MorpMamba) model in which, a token generation module first converts the hyperspectral image patch into spatial-spectral tokens.
These tokens are processed by morphological operations, which compute structural and shape information using depthwise separable convolutional operations.
Experiments on widely used HSI datasets demonstrate that the MorpMamba model outperforms (parametric efficiency) both CNN and Transformer models.
arXiv Detail & Related papers (2024-08-02T16:28:51Z) - Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement [51.557804095896174]
We introduce a State Space Model with Across-Scanning and Local Enhancement, named ASLE-SSM, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promoting.
Experimental results illustrate ASLE-SSM's superiority over existing state-of-the-art methods, with an inference speed 2.4 times faster than Transformer-based MST and saving 0.12 (M) of parameters.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification [3.821081081400729]
Current convolutional neural networks (CNNs) focus on local features in hyperspectral data.
Transformer framework excels at extracting global features from hyperspectral imagery.
This research introduces the Convolutional Meet Transformer Network (CMTNet)
arXiv Detail & Related papers (2024-06-20T07:56:51Z) - HSIMamba: Hyperpsectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification [16.742768644585684]
HSIMamba is a novel framework that uses bidirectional reversed convolutional neural network pathways to extract spectral features more efficiently.
Our approach combines the operational efficiency of CNNs with the dynamic feature extraction capability of attention mechanisms found in Transformers.
This approach improves classification accuracy beyond current benchmarks and addresses computational inefficiencies encountered with advanced models like Transformers.
arXiv Detail & Related papers (2024-03-30T07:27:36Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - Superpixel Graph Contrastive Clustering with Semantic-Invariant
Augmentations for Hyperspectral Images [64.72242126879503]
Hyperspectral images (HSI) clustering is an important but challenging task.
We first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spatial and spectral features of HSI.
We then design a superpixel graph contrastive clustering model to learn discriminative superpixel representations.
arXiv Detail & Related papers (2024-03-04T07:40:55Z) - DiffSpectralNet : Unveiling the Potential of Diffusion Models for
Hyperspectral Image Classification [6.521187080027966]
We propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques.
First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features.
The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification.
arXiv Detail & Related papers (2023-10-29T15:26:37Z) - A heterogeneous group CNN for image super-resolution [127.2132400582117]
Convolutional neural networks (CNNs) have obtained remarkable performance via deep architectures.
We present a heterogeneous group SR CNN (HGSRCNN) via leveraging structure information of different types to obtain a high-quality image.
arXiv Detail & Related papers (2022-09-26T04:14:59Z) - Multimodal Fusion Transformer for Remote Sensing Image Classification [35.57881383390397]
Vision transformers (ViTs) have been trending in image classification tasks due to their promising performance when compared to convolutional neural networks (CNNs)
To achieve satisfactory performance, close to that of CNNs, transformers need fewer parameters.
We introduce a new multimodal fusion transformer (MFT) network which comprises a multihead cross patch attention (mCrossPA) for HSI land-cover classification.
arXiv Detail & Related papers (2022-03-31T11:18:41Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST)
CST embedding HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z) - CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.