Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification
- URL: http://arxiv.org/abs/2405.01095v1
- Date: Thu, 2 May 2024 08:49:01 GMT
- Title: Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification
- Authors: Muhammad Ahmad, Manuel Mazzara, Salvatore Distefano
- Abstract summary: 3D Swin Transformer (3D-ST) excels in capturing intricate spatial relationships within images.
SST specializes in modeling long-range dependencies through self-attention mechanisms.
This paper introduces an attentional fusion of these two transformers to significantly enhance the classification performance of Hyperspectral Images (HSIs).
- Score: 2.1223532600703385
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D Swin Transformer (3D-ST), known for its hierarchical attention and window-based processing, excels in capturing intricate spatial relationships within images. Spatial-spectral Transformer (SST), meanwhile, specializes in modeling long-range dependencies through self-attention mechanisms. This paper therefore introduces a novel method: an attentional fusion of these two transformers to significantly enhance the classification performance of Hyperspectral Images (HSIs). What sets this approach apart is its emphasis on the integration of attentional mechanisms from both architectures. This integration not only refines the modeling of spatial and spectral information but also contributes to more precise and accurate classification results. Experiments and evaluation on benchmark HSI datasets underscore the importance of employing disjoint training, validation, and test samples. The results demonstrate the effectiveness of the fusion approach, showcasing its superiority over traditional methods and individual transformers. Incorporating disjoint samples enhances the robustness and reliability of the proposed methodology, emphasizing its potential for advancing hyperspectral image classification.
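The paper itself provides no implementation, but the core idea (two transformer branches, one spatial and one spectral, whose pooled features are combined by a learned attention weighting before classification) can be illustrated with a minimal PyTorch sketch. All class names, dimensions, and the gating design below are illustrative assumptions and do not reproduce the authors' 3D-ST or SST internals.

```python
# Minimal sketch of attentional fusion of two transformer feature streams for
# HSI patch classification. Names and sizes are assumptions for illustration.
import torch
import torch.nn as nn


class TokenEncoder(nn.Module):
    """Stand-in for one transformer branch (spatial or spectral stream)."""

    def __init__(self, in_dim: int, embed_dim: int = 64, depth: int = 2, heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(in_dim, embed_dim)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, in_dim) -> pooled feature (batch, embed_dim)
        x = self.encoder(self.proj(tokens))
        return x.mean(dim=1)


class AttentionalFusion(nn.Module):
    """Learns per-sample weights for combining the two branch features."""

    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * embed_dim, 2), nn.Softmax(dim=-1))
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, f_spatial: torch.Tensor, f_spectral: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([f_spatial, f_spectral], dim=-1))  # (batch, 2)
        fused = w[:, :1] * f_spatial + w[:, 1:] * f_spectral
        return self.classifier(fused)


if __name__ == "__main__":
    batch, bands, patch_pixels, num_classes = 8, 30, 25, 16
    # Spatial stream: one token per pixel of a 5x5 patch, each with `bands` features.
    spatial_tokens = torch.randn(batch, patch_pixels, bands)
    # Spectral stream: one token per band, each summarising the 25-pixel patch.
    spectral_tokens = torch.randn(batch, bands, patch_pixels)

    spatial_branch = TokenEncoder(in_dim=bands)
    spectral_branch = TokenEncoder(in_dim=patch_pixels)
    fusion = AttentionalFusion(embed_dim=64, num_classes=num_classes)

    logits = fusion(spatial_branch(spatial_tokens), spectral_branch(spectral_tokens))
    print(logits.shape)  # torch.Size([8, 16])
```

The softmax gate is one simple way to realize an "attentional" fusion; the paper's actual mechanism may differ. With disjoint training, validation, and test samples, the same pixels never appear in more than one split, which is the evaluation protocol the abstract emphasizes.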
Related papers
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial for enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Tuned on 4-class ProGAN data, our approach attains an average accuracy of 98% on unseen GANs and, surprisingly, generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z) - ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - DiffUCD: Unsupervised Hyperspectral Image Change Detection with Semantic Correlation Diffusion Model [46.68717345017946]
Hyperspectral image change detection (HSI-CD) has emerged as a crucial research area in remote sensing.
We propose a novel unsupervised HSI-CD method with a semantic correlation diffusion model (DiffUCD).
Our method achieves results comparable to those of fully supervised methods that require numerous samples.
arXiv Detail & Related papers (2023-05-21T09:21:41Z) - DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification [109.09061514799413]
Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.
We propose a tri-spectral image generation pipeline that transforms HSI into high-quality tri-spectral images.
Our proposed method outperforms state-of-the-art methods for HSI classification.
arXiv Detail & Related papers (2023-04-19T18:32:52Z) - Multi-manifold Attention for Vision Transformers [12.862540139118073]
Vision Transformers are very popular nowadays due to their state-of-the-art performance in several computer vision tasks.
A novel attention mechanism, called multi-manifold multihead attention, is proposed in this work to replace the vanilla self-attention of a Transformer.
arXiv Detail & Related papers (2022-07-18T12:53:53Z) - Hybrid Routing Transformer for Zero-Shot Learning [83.64532548391]
This paper presents a novel transformer encoder-decoder model, called hybrid routing transformer (HRT).
In the HRT encoder, we embed an active attention, constructed from both bottom-up and top-down dynamic routing pathways, to generate the attribute-aligned visual feature.
In the HRT decoder, we use static routing to calculate the correlation among the attribute-aligned visual features, the corresponding attribute semantics, and the class attribute vectors, generating the final class label predictions.
arXiv Detail & Related papers (2022-03-29T07:55:08Z) - Semantic-aligned Fusion Transformer for One-shot Object Detection [18.58772037047498]
One-shot object detection aims at detecting novel objects according to merely one given instance.
Current approaches explore various feature fusions to obtain directly transferable meta-knowledge.
We propose a simple but effective architecture named Semantic-aligned Fusion Transformer (SaFT) to resolve these issues.
arXiv Detail & Related papers (2022-03-17T05:38:47Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST), which embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selection. The selected patches are then fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)