CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
- URL: http://arxiv.org/abs/2511.12346v1
- Date: Sat, 15 Nov 2025 20:25:59 GMT
- Title: CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
- Authors: Asmit Bandyopadhyay, Anindita Das Bhattacharjee, Rakesh Das,
- Abstract summary: CLAReSNet is a hybrid architecture that integrates multi-scale convolutional extraction with transformer-style attention via an adaptive latent bottleneck. Experiments conducted on the Indian Pines and Salinas datasets show state-of-the-art performance, achieving overall accuracies of 99.71% and 99.96%.
- Score: 20.37811669228711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperspectral image (HSI) classification faces critical challenges, including high spectral dimensionality, complex spectral-spatial correlations, and limited training samples with severe class imbalance. While CNNs excel at local feature extraction and transformers capture long-range dependencies, their isolated application yields suboptimal results due to quadratic complexity and insufficient inductive biases. We propose CLAReSNet (Convolutional Latent Attention Residual Spectral Network), a hybrid architecture that integrates multi-scale convolutional extraction with transformer-style attention via an adaptive latent bottleneck. The model employs a multi-scale convolutional stem with deep residual blocks and an enhanced Convolutional Block Attention Module for hierarchical spatial features, followed by spectral encoder layers combining bidirectional RNNs (LSTM/GRU) with Multi-Scale Spectral Latent Attention (MSLA). MSLA reduces complexity from $\mathcal{O}(T^2D)$ to $\mathcal{O}(T\log(T)D)$ by adaptive latent token allocation (8-64 tokens) that scales logarithmically with the sequence length. Hierarchical cross-attention fusion dynamically aggregates multi-level representations for robust classification. Experiments conducted on the Indian Pines and Salinas datasets show state-of-the-art performance, achieving overall accuracies of 99.71% and 99.96%, significantly surpassing HybridSN, SSRN, and SpectralFormer. The learned embeddings exhibit superior inter-class separability and compact intra-class clustering, validating CLAReSNet's effectiveness under limited samples and severe class imbalance.
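The latent-bottleneck idea behind MSLA can be illustrated with a minimal numpy sketch: a small set of latent tokens first attends over all T spectral tokens (compress), then the tokens attend back over the latents (broadcast), so every attention map is T-by-L rather than T-by-T. The exact mapping from sequence length to latent count below (8 tokens per doubling, clamped to the paper's 8-64 range) is a guess for illustration, not the authors' formula, and learned projections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def num_latents(T, lo=8, hi=64):
    # Latent token count grows logarithmically with sequence length.
    # The 8-64 clamp is from the abstract; the "8 per doubling" slope
    # is an illustrative assumption.
    return int(np.clip(8 * np.ceil(np.log2(max(T, 2))), lo, hi))

def latent_attention(x, latents):
    # x: (T, D) spectral tokens; latents: (L, D) latent queries.
    # Both attention maps are (L, T) / (T, L), giving O(T*L*D) cost
    # instead of the O(T^2 * D) of full self-attention.
    d = x.shape[-1]
    z = softmax(latents @ x.T / np.sqrt(d)) @ x   # compress: (L, D)
    y = softmax(x @ z.T / np.sqrt(d)) @ z         # broadcast:  (T, D)
    return y
```

With L growing like log T, the overall cost matches the stated O(T log(T) D) scaling: e.g. 200 spectral bands would use the full 64 latents, while a 50-band sequence needs only 48.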
Related papers
- VP-Hype: A Hybrid Mamba-Transformer Framework with Visual-Textual Prompting for Hyperspectral Image Classification [8.232394238006167]
VP-Hype is a framework that rethinks HSI classification by unifying the linear-time efficiency of State-Space Models with the relational modeling of Transformers. Building on a robust 3D-CNN spectral front-end, VP-Hype replaces conventional attention blocks with a Hybrid Mamba-Transformer backbone. With a training sample distribution of only 2%, the model achieves an Overall Accuracy (OA) of 99.69% on the Salinas dataset and 99.45% on the Longkou dataset.
arXiv Detail & Related papers (2026-03-01T16:24:09Z) - HSSDCT: Factorized Spatial-Spectral Correlation for Hyperspectral Image Fusion [11.994592153994482]
Hyperspectral image (HSI) fusion aims to reconstruct a high-resolution HSI (HR-HSI) by combining the rich spectral information of a low-resolution HSI with the fine details of a high-resolution multispectral image (HR-MSI). Recent deep learning methods have achieved notable progress, but they still suffer from limited receptive fields, redundant spectral bands, and the quadratic complexity of self-attention. We propose the Hierarchical Spatial-Spectral Dense Correlation Network (HSSDCT) to overcome these challenges.
arXiv Detail & Related papers (2026-01-31T03:24:03Z) - Hyperspectral Image Classification using Spectral-Spatial Mixer Network [2.538209532048867]
This paper introduces SS-MixNet, a lightweight and effective deep learning model for hyperspectral image (HSI) classification. The architecture integrates 3D convolutional layers for local spectral-spatial feature extraction with two parallel mixer-style blocks. The model is evaluated on the QUH-Tangdaowan and QUH-Qingyun datasets using only 1% of labeled data.
arXiv Detail & Related papers (2025-11-19T18:48:52Z) - Spatial-Spectral Binarized Neural Network for Panchromatic and Multi-spectral Images Fusion [25.15016853820625]
Deep learning models have achieved excellent performance, but they often come with high computational complexity. In this paper, we explore the feasibility of applying binary neural networks (BNNs) to pan-sharpening. A series of S2B-Conv blocks form a brand-new binary network for pan-sharpening, dubbed S2BNet.
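The core trick in binary networks of this kind can be shown in a few lines: replace full-precision weights by their sign, scaled so the binary tensor still approximates the original. The XNOR-Net-style mean-absolute-value scaling below is a standard BNN ingredient used for illustration, not necessarily the exact S2B-Conv scheme.

```python
import numpy as np

def binarize_weights(w):
    # Sign binarization with a per-tensor scale alpha = mean(|w|),
    # which minimizes the L1 error between w and alpha * sign(w).
    # Multiplications against sign(w) reduce to adds/subtracts.
    alpha = np.abs(w).mean()
    return alpha * np.sign(w)
```

At inference the {-alpha, +alpha} weights let convolutions run as bit operations plus one scalar multiply, which is where BNNs recover their efficiency.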
arXiv Detail & Related papers (2025-09-27T14:10:51Z) - CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification [3.821081081400729]
Current convolutional neural networks (CNNs) focus on local features in hyperspectral data. The Transformer framework excels at extracting global features from hyperspectral imagery. This research introduces the Convolutional Meets Transformer Network (CMTNet).
arXiv Detail & Related papers (2024-06-20T07:56:51Z) - SpectralMamba: Efficient Mamba for Hyperspectral Image Classification [39.18999103115206]
Recurrent neural networks and Transformers have dominated most applications in hyperspectral (HS) imaging.
We propose SpectralMamba -- a novel, efficient deep learning framework for HS image classification that incorporates a state space model.
We show that SpectralMamba surprisingly creates promising win-wins from both performance and efficiency perspectives.
arXiv Detail & Related papers (2024-04-12T14:12:03Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising [64.11157141177208]
We propose a spectral enhanced rectangle Transformer to model the spatial and spectral correlation in hyperspectral images.
For the former, we exploit the rectangle self-attention horizontally and vertically to capture the non-local similarity in the spatial domain.
For the latter, we design a spectral enhancement module that is capable of extracting global underlying low-rank property of spatial-spectral cubes to suppress noise.
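The horizontal/vertical rectangle idea reduces to restricting self-attention to one spatial axis at a time: each row attends only within itself, then each column does the same. The numpy sketch below illustrates that restriction without learned projections or the paper's spectral enhancement module; it is an interface sketch, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axis_self_attention(feat, axis):
    # feat: (H, W, D) feature map. axis=1: each row attends within
    # itself (horizontal rectangle); axis=0: each column attends
    # within itself (vertical rectangle).
    if axis == 0:
        feat = feat.transpose(1, 0, 2)  # treat columns as rows
    H, W, D = feat.shape
    scores = feat @ feat.transpose(0, 2, 1) / np.sqrt(D)  # (H, W, W)
    out = softmax(scores) @ feat                          # (H, W, D)
    if axis == 0:
        out = out.transpose(1, 0, 2)
    return out
```

Applying the horizontal pass followed by the vertical pass lets information propagate across the full H x W plane while each attention map stays W-by-W or H-by-H instead of HW-by-HW.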
arXiv Detail & Related papers (2023-04-03T09:42:13Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, the coarse-to-fine sparse Transformer (CST), which embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
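The coarse stage boils down to scoring patches and keeping only the top-k for the expensive attention stage. The paper's SASM learns its scores; the sketch below uses spectral variance as a hand-crafted stand-in purely to illustrate the select-then-refine interface, and is not the CST screening mechanism itself.

```python
import numpy as np

def screen_patches(patches, k):
    # patches: (N, ...) array of image patches.
    # Score each patch (here: variance of its flattened values, a
    # hypothetical proxy for a learned SASM score) and keep the k
    # highest-scoring patches for the fine attention stage.
    scores = patches.reshape(len(patches), -1).var(axis=1)
    keep = np.argsort(scores)[::-1][:k]   # indices, best first
    return patches[keep], keep
```

Only the surviving k patches would then be fed to the fine-grained multi-head self-attention, so the quadratic cost is paid on a sparse subset rather than the whole image.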
arXiv Detail & Related papers (2022-03-09T16:17:47Z) - PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer [76.44375136492827]
Convolutional Neural Networks (CNNs) are often scale-sensitive.
We address this limitation by exploiting multi-scale features at a finer granularity.
The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes up a spectrum of dilation rates.
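Mixing a spectrum of dilation rates can be sketched as assigning a different rate to each filter (or channel group) of one layer, so a single convolution sees several receptive-field sizes at once. Both the rate set and the cyclic assignment below are illustrative assumptions, and the 1-D convolution stands in for PSConv's 2-D case.

```python
import numpy as np

def psconv_dilations(out_channels, rates=(1, 2, 4, 8)):
    # Cycle a spectrum of dilation rates across output channels so one
    # layer mixes receptive-field sizes (rate set/order are assumptions).
    return [rates[c % len(rates)] for c in range(out_channels)]

def dilated_conv1d(x, w, dilation):
    # 'valid' 1-D cross-correlation with the given dilation: the k taps
    # of w span (k - 1) * dilation + 1 input positions.
    k = len(w)
    span = (k - 1) * dilation + 1
    return np.array([x[i:i + span:dilation] @ w
                     for i in range(len(x) - span + 1)])
```

Each output channel would run `dilated_conv1d` with its own rate from `psconv_dilations`, giving small-rate filters fine detail and large-rate filters wider context within the same layer.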
arXiv Detail & Related papers (2020-07-13T05:14:11Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.