Spectral Discrepancy and Cross-modal Semantic Consistency Learning for Object Detection in Hyperspectral Image
- URL: http://arxiv.org/abs/2512.18245v1
- Date: Sat, 20 Dec 2025 07:03:09 GMT
- Title: Spectral Discrepancy and Cross-modal Semantic Consistency Learning for Object Detection in Hyperspectral Image
- Authors: Xiao He, Chang Tang, Xinwang Liu, Wei Zhang, Zhimin Gao, Chuankun Li, Shaohua Qiu, Jiangfeng Xu
- Abstract summary: Hyperspectral images with high spectral resolution provide new insights into recognizing subtle differences in similar substances. However, object detection in hyperspectral images faces significant challenges from intra- and inter-class similarity due to spatial differences across hyperspectral bands. We propose a novel network termed Spectral Discrepancy and Cross-Modal semantic consistency learning (SDCM). Our proposed method achieves state-of-the-art performance compared with existing methods.
- Score: 40.38555448650773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperspectral images with high spectral resolution provide new insights into recognizing subtle differences in similar substances. However, object detection in hyperspectral images faces significant challenges in intra- and inter-class similarity due to spatial differences across hyperspectral bands and unavoidable interferences, e.g., sensor noise and illumination. To alleviate inter-band inconsistencies and redundancy, we propose a novel network termed Spectral Discrepancy and Cross-Modal semantic consistency learning (SDCM), which facilitates the extraction of consistent information across a wide range of hyperspectral bands while utilizing the spectral dimension to pinpoint regions of interest. Specifically, we leverage a semantic consistency learning (SCL) module that utilizes inter-band contextual cues to diminish the heterogeneity of information among bands, yielding highly coherent spectral-dimension representations. In addition, we incorporate a spectral gated generator (SGG) into the framework that filters out the redundant data inherent in hyperspectral information based on the importance of the bands. Then, we design the spectral discrepancy aware (SDA) module to enrich the semantic representation of high-level information by extracting pixel-level spectral features. Extensive experiments on two hyperspectral datasets demonstrate that our proposed method achieves state-of-the-art performance compared with existing methods.
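The spectral gated generator (SGG) is described above only at a high level. As a rough, hypothetical sketch (not the authors' implementation), gating bands by a per-band importance score could look like the following, using spatial variance as a simple stand-in for the learned importance:

```python
import numpy as np

def spectral_gate(cube, keep_ratio=0.5):
    """Illustrative band gating: score each band by its spatial variance
    and keep only the highest-scoring fraction. The real SGG learns its
    band-importance scores; variance here is just a hand-crafted proxy."""
    h, w, bands = cube.shape
    scores = cube.reshape(-1, bands).var(axis=0)  # per-band importance proxy
    k = max(1, int(bands * keep_ratio))
    keep = np.sort(np.argsort(scores)[-k:])       # indices of retained bands
    return cube[:, :, keep], keep

cube = np.random.rand(8, 8, 32)                   # toy 8x8 scene with 32 bands
gated, kept = spectral_gate(cube, keep_ratio=0.25)
```

With `keep_ratio=0.25`, the 32-band toy cube is reduced to its 8 highest-variance bands; the retained indices are returned so downstream modules know which bands survived.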
Related papers
- Hyperspectral Mamba for Hyperspectral Object Tracking [56.365517163296936]
A new hyperspectral object tracking network equipped with Mamba (HyMamba) is proposed. It unifies spectral, cross-depth, and temporal modeling through state space modules (SSMs). HyMamba achieves state-of-the-art performance on seven benchmark datasets.
arXiv Detail & Related papers (2025-09-10T03:47:43Z) - Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images [13.79887292039637]
We introduce point supervision into hyperspectral salient object detection (HSOD). We incorporate Spectral Saliency, derived from conventional HSOD methods, as a pivotal spectral representation within the framework. We also propose a novel pipeline, specifically designed for HSIs, to generate pseudo-labels, effectively mitigating the performance decline associated with the point-supervision strategy.
arXiv Detail & Related papers (2024-12-24T02:52:43Z) - SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking [21.664141982246598]
Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously. Existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction. In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking.
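The "spectral angle awareness" in SSF-Net builds on the classic spectral angle between signatures. For reference, the basic quantity can be computed as follows (a generic formula, not code from the paper):

```python
import numpy as np

def spectral_angle(a, b):
    """Angle (radians) between two spectral signatures.
    Invariant to per-pixel brightness scaling: 0 means identical shape."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards rounding error

s = np.array([0.2, 0.5, 0.9])
assert spectral_angle(s, 3 * s) < 1e-6  # scaling leaves the angle unchanged
```

Because the angle ignores overall magnitude, it is robust to illumination changes, which is why it is a popular similarity measure for hyperspectral matching tasks.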
arXiv Detail & Related papers (2024-03-09T09:37:13Z) - Improving Vision Anomaly Detection with the Guidance of Language
Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view.
We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue.
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z) - ESSAformer: Efficient Transformer for Hyperspectral Image
Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Object Detection in Hyperspectral Image via Unified Spectral-Spatial
Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images.
S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z) - Hyperspectral Images Classification and Dimensionality Reduction using
spectral interaction and SVM classifier [0.0]
The high dimensionality of the hyperspectral images (HSI) is one of the main challenges for the analysis of the collected data.
The existence of noisy, redundant and irrelevant bands increases the computational complexity.
We propose a novel filter approach based on the spectral interaction measure and the support vector machines for dimensionality reduction.
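The spectral interaction measure itself is not spelled out in this summary. As an illustrative stand-in for a filter-style band selector (not the paper's method), a greedy filter that drops bands which correlate too strongly with already-kept bands might look like:

```python
import numpy as np

def greedy_band_filter(X, max_corr=0.95):
    """Greedy redundancy filter over a (pixels x bands) matrix: keep a band
    only if its absolute correlation with every already-kept band stays
    below max_corr. A simple proxy for interaction-based band selection."""
    corr = np.abs(np.corrcoef(X, rowvar=False))   # bands x bands correlations
    kept = []
    for b in range(X.shape[1]):
        if all(corr[b, k] < max_corr for k in kept):
            kept.append(b)
    return kept

np.random.seed(0)
X = np.random.rand(50, 4)   # 50 pixels, 4 toy bands
X[:, 1] = X[:, 0]           # band 1 exactly duplicates band 0
kept = greedy_band_filter(X)  # band 1 is dropped as redundant
```

The surviving bands could then be fed to an SVM classifier, matching the filter-then-classify pipeline the paper describes.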
arXiv Detail & Related papers (2022-10-27T15:37:57Z) - Spatial-Spectral Manifold Embedding of Hyperspectral Data [43.479889860715275]
We propose a novel hyperspectral embedding approach by simultaneously considering spatial and spectral information.
spatial-spectral manifold embedding (SSME) models the spatial and spectral information jointly in a patch-based fashion.
SSME not only learns the spectral embedding by using the adjacency matrix obtained by similarity measurement between spectral signatures, but also models the spatial neighbours of a target pixel in hyperspectral scene.
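The adjacency matrix built from similarity between spectral signatures can be sketched with a Gaussian kernel, one common choice for such graphs (the paper's exact similarity measure may differ):

```python
import numpy as np

def spectral_adjacency(pixels, sigma=0.5):
    """Gaussian-kernel adjacency over spectral signatures: a minimal
    version of the similarity graph an embedding method builds before
    computing the spectral embedding."""
    # Squared Euclidean distance between every pair of signatures
    d2 = ((pixels[:, None, :] - pixels[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

X = np.random.rand(10, 32)   # 10 pixels, 32 spectral bands
A = spectral_adjacency(X)    # 10x10 symmetric adjacency, ones on the diagonal
```

SSME additionally folds in spatial neighbours of each pixel, which this spectral-only sketch omits.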
arXiv Detail & Related papers (2020-07-17T05:40:27Z) - Hyperspectral Image Super-resolution via Deep Progressive Zero-centric Residual Learning [62.52242684874278]
Cross-modality distribution of spatial and spectral information makes the problem challenging.
We propose a novel lightweight deep neural network-based framework, namely PZRes-Net.
Our framework learns a high-resolution and zero-centric residual image, which contains high-frequency spatial details of the scene.
arXiv Detail & Related papers (2020-06-18T06:32:11Z) - Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery [79.69449412334188]
In this paper, we investigate how to adapt state-of-the-art residual learning based single gray/RGB image super-resolution approaches.
We introduce a spatial-spectral prior network (SSPN) to fully exploit the spatial information and the correlation between the spectra of the hyperspectral data.
Experimental results on some hyperspectral images demonstrate that the proposed SSPSR method enhances the details of the recovered high-resolution hyperspectral images.
arXiv Detail & Related papers (2020-05-18T14:25:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.