CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
- URL: http://arxiv.org/abs/2511.12346v1
- Date: Sat, 15 Nov 2025 20:25:59 GMT
- Title: CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
- Authors: Asmit Bandyopadhyay, Anindita Das Bhattacharjee, Rakesh Das,
- Abstract summary: CLAReSNet is a hybrid architecture that integrates multi-scale convolutional extraction with transformer-style attention via an adaptive latent bottleneck. Experiments conducted on the Indian Pines and Salinas datasets show state-of-the-art performance, achieving overall accuracies of 99.71% and 99.96%.
- Score: 20.37811669228711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperspectral image (HSI) classification faces critical challenges, including high spectral dimensionality, complex spectral-spatial correlations, and limited training samples with severe class imbalance. While CNNs excel at local feature extraction and transformers capture long-range dependencies, their isolated application yields suboptimal results due to quadratic complexity and insufficient inductive biases. We propose CLAReSNet (Convolutional Latent Attention Residual Spectral Network), a hybrid architecture that integrates multi-scale convolutional extraction with transformer-style attention via an adaptive latent bottleneck. The model employs a multi-scale convolutional stem with deep residual blocks and an enhanced Convolutional Block Attention Module for hierarchical spatial features, followed by spectral encoder layers combining bidirectional RNNs (LSTM/GRU) with Multi-Scale Spectral Latent Attention (MSLA). MSLA reduces complexity from $\mathcal{O}(T^2D)$ to $\mathcal{O}(T\log(T)D)$ by adaptive latent token allocation (8-64 tokens) that scales logarithmically with the sequence length. Hierarchical cross-attention fusion dynamically aggregates multi-level representations for robust classification. Experiments conducted on the Indian Pines and Salinas datasets show state-of-the-art performance, achieving overall accuracies of 99.71% and 99.96%, significantly surpassing HybridSN, SSRN, and SpectralFormer. The learned embeddings exhibit superior inter-class separability and compact intra-class clustering, validating CLAReSNet's effectiveness under limited samples and severe class imbalance.
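The latent-bottleneck idea behind MSLA can be illustrated with a minimal numpy sketch: a small set of latent tokens first attends over all T spectral tokens (compress), then the tokens attend back over the latents (broadcast), so every attention map is T-by-L rather than T-by-T. The exact mapping from sequence length to latent count below (8 tokens per doubling, clamped to the paper's 8-64 range) is a guess for illustration, not the authors' formula, and learned projections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def num_latents(T, lo=8, hi=64):
    # Latent token count grows logarithmically with sequence length.
    # The 8-64 clamp is from the abstract; the "8 per doubling" slope
    # is an illustrative assumption.
    return int(np.clip(8 * np.ceil(np.log2(max(T, 2))), lo, hi))

def latent_attention(x, latents):
    # x: (T, D) spectral tokens; latents: (L, D) latent queries.
    # Both attention maps are (L, T) / (T, L), giving O(T*L*D) cost
    # instead of the O(T^2 * D) of full self-attention.
    d = x.shape[-1]
    z = softmax(latents @ x.T / np.sqrt(d)) @ x   # compress: (L, D)
    y = softmax(x @ z.T / np.sqrt(d)) @ z         # broadcast:  (T, D)
    return y
```

With L growing like log T, the overall cost matches the stated O(T log(T) D) scaling: e.g. 200 spectral bands would use the full 64 latents, while a 50-band sequence needs only 48.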
Related papers
- VP-Hype: A Hybrid Mamba-Transformer Framework with Visual-Textual Prompting for Hyperspectral Image Classification [8.232394238006167]
VP-Hype is a framework that rethinks HSI classification by unifying the linear-time efficiency of State-Space Models with the relational modeling of Transformers. Building on a robust 3D-CNN spectral front-end, VP-Hype replaces conventional attention blocks with a Hybrid Mamba-Transformer backbone. With a training sample distribution of only 2%, the model achieves an Overall Accuracy (OA) of 99.69% on the Salinas dataset and 99.45% on the Longkou dataset.
arXiv Detail & Related papers (2026-03-01T16:24:09Z) - HSSDCT: Factorized Spatial-Spectral Correlation for Hyperspectral Image Fusion [11.994592153994482]
Hyperspectral image (HSI) fusion aims to reconstruct a high-resolution HSI (HR-HSI) by combining the rich spectral information of a low-resolution HSI with the fine details of a high-resolution multispectral image (HR-MSI). Recent deep learning methods have achieved notable progress, but they still suffer from limited receptive fields, redundant spectral bands, and the quadratic complexity of self-attention. We propose the Hierarchical Spatial-Spectral Dense Correlation Network (HSSDCT) to overcome these challenges.
arXiv Detail & Related papers (2026-01-31T03:24:03Z) - Hyperspectral Image Classification using Spectral-Spatial Mixer Network [2.538209532048867]
This paper introduces SS-MixNet, a lightweight and effective deep learning model for hyperspectral image (HSI) classification. The architecture integrates 3D convolutional layers for local spectral-spatial feature extraction with two parallel mixer-style blocks. The model is evaluated on the QUH-Tangdaowan and QUH-Qingyun datasets using only 1% of labeled data.
arXiv Detail & Related papers (2025-11-19T18:48:52Z) - Spatial-Spectral Binarized Neural Network for Panchromatic and Multi-spectral Images Fusion [25.15016853820625]
Deep learning models have achieved excellent performance, but they often come with high computational complexity. In this paper, we explore the feasibility of applying binary neural networks (BNNs) to pan-sharpening. A series of S2B-Conv blocks form a brand-new binary network for pan-sharpening, dubbed S2BNet.
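The core trick in binary networks of this kind can be shown in a few lines: replace full-precision weights by their sign, scaled so the binary tensor still approximates the original. The XNOR-Net-style mean-absolute-value scaling below is a standard BNN ingredient used for illustration, not necessarily the exact S2B-Conv scheme.

```python
import numpy as np

def binarize_weights(w):
    # Sign binarization with a per-tensor scale alpha = mean(|w|),
    # which minimizes the L1 error between w and alpha * sign(w).
    # Multiplications against sign(w) reduce to adds/subtracts.
    alpha = np.abs(w).mean()
    return alpha * np.sign(w)
```

At inference the {-alpha, +alpha} weights let convolutions run as bit operations plus one scalar multiply, which is where BNNs recover their efficiency.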
arXiv Detail & Related papers (2025-09-27T14:10:51Z) - CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification [3.821081081400729]
Current convolutional neural networks (CNNs) focus on local features in hyperspectral data. The Transformer framework excels at extracting global features from hyperspectral imagery. This research introduces the Convolutional Meets Transformer Network (CMTNet).
arXiv Detail & Related papers (2024-06-20T07:56:51Z) - SpectralMamba: Efficient Mamba for Hyperspectral Image Classification [39.18999103115206]
Recurrent neural networks and Transformers have dominated most applications in hyperspectral (HS) imaging.
We propose SpectralMamba -- a novel, efficient deep learning framework for HS image classification that incorporates a state space model.
We show that SpectralMamba surprisingly creates promising win-wins from both performance and efficiency perspectives.
arXiv Detail & Related papers (2024-04-12T14:12:03Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising [64.11157141177208]
We propose a spectral enhanced rectangle Transformer to model the spatial and spectral correlation in hyperspectral images.
For the former, we exploit the rectangle self-attention horizontally and vertically to capture the non-local similarity in the spatial domain.
For the latter, we design a spectral enhancement module that is capable of extracting global underlying low-rank property of spatial-spectral cubes to suppress noise.
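The horizontal/vertical rectangle idea reduces to restricting self-attention to one spatial axis at a time: each row attends only within itself, then each column does the same. The numpy sketch below illustrates that restriction without learned projections or the paper's spectral enhancement module; it is an interface sketch, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axis_self_attention(feat, axis):
    # feat: (H, W, D) feature map. axis=1: each row attends within
    # itself (horizontal rectangle); axis=0: each column attends
    # within itself (vertical rectangle).
    if axis == 0:
        feat = feat.transpose(1, 0, 2)  # treat columns as rows
    H, W, D = feat.shape
    scores = feat @ feat.transpose(0, 2, 1) / np.sqrt(D)  # (H, W, W)
    out = softmax(scores) @ feat                          # (H, W, D)
    if axis == 0:
        out = out.transpose(1, 0, 2)
    return out
```

Applying the horizontal pass followed by the vertical pass lets information propagate across the full H x W plane while each attention map stays W-by-W or H-by-H instead of HW-by-HW.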
arXiv Detail & Related papers (2023-04-03T09:42:13Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, the coarse-to-fine sparse Transformer (CST), which embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
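The coarse stage boils down to scoring patches and keeping only the top-k for the expensive attention stage. The paper's SASM learns its scores; the sketch below uses spectral variance as a hand-crafted stand-in purely to illustrate the select-then-refine interface, and is not the CST screening mechanism itself.

```python
import numpy as np

def screen_patches(patches, k):
    # patches: (N, ...) array of image patches.
    # Score each patch (here: variance of its flattened values, a
    # hypothetical proxy for a learned SASM score) and keep the k
    # highest-scoring patches for the fine attention stage.
    scores = patches.reshape(len(patches), -1).var(axis=1)
    keep = np.argsort(scores)[::-1][:k]   # indices, best first
    return patches[keep], keep
```

Only the surviving k patches would then be fed to the fine-grained multi-head self-attention, so the quadratic cost is paid on a sparse subset rather than the whole image.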
arXiv Detail & Related papers (2022-03-09T16:17:47Z) - PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer [76.44375136492827]
Convolutional Neural Networks (CNNs) are often scale-sensitive.
We address this limitation by exploiting multi-scale features at a finer granularity.
The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes up a spectrum of dilation rates.
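Mixing a spectrum of dilation rates can be sketched as assigning a different rate to each filter (or channel group) of one layer, so a single convolution sees several receptive-field sizes at once. Both the rate set and the cyclic assignment below are illustrative assumptions, and the 1-D convolution stands in for PSConv's 2-D case.

```python
import numpy as np

def psconv_dilations(out_channels, rates=(1, 2, 4, 8)):
    # Cycle a spectrum of dilation rates across output channels so one
    # layer mixes receptive-field sizes (rate set/order are assumptions).
    return [rates[c % len(rates)] for c in range(out_channels)]

def dilated_conv1d(x, w, dilation):
    # 'valid' 1-D cross-correlation with the given dilation: the k taps
    # of w span (k - 1) * dilation + 1 input positions.
    k = len(w)
    span = (k - 1) * dilation + 1
    return np.array([x[i:i + span:dilation] @ w
                     for i in range(len(x) - span + 1)])
```

Each output channel would run `dilated_conv1d` with its own rate from `psconv_dilations`, giving small-rate filters fine detail and large-rate filters wider context within the same layer.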
arXiv Detail & Related papers (2020-07-13T05:14:11Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.