Related papers: HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

URL: http://arxiv.org/abs/2406.11519v1
Date: Mon, 17 Jun 2024 13:22:58 GMT
Title: HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang
Abstract summary: HyperSIGMA is a vision transformer-based foundation model for HSI interpretation. It integrates spatial and spectral features using a specially designed spectral enhancement module. It shows significant advantages in scalability, robustness, cross-modal transferring capability, and real-world applicability.
Score: 88.13261547704444
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA, a vision transformer-based foundation model for HSI interpretation, scalable to over a billion parameters. To tackle the spectral and spatial redundancy challenges in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transferring capability, and real-world applicability.

Related papers

Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data [14.104497777255137]
We introduce Low-rank Efficient Spatial-Spectral Vision Transformer with three key innovations. We pretrain LESS ViT using a Hyperspectral Masked Autoencoder framework with integrated positional and channel masking strategies. Experimental results demonstrate that our proposed method achieves competitive performance against state-of-the-art multi-modal geospatial foundation models.
arXiv Detail & Related papers (2025-03-17T05:42:19Z)
Hybrid State-Space and GRU-based Graph Tokenization Mamba for Hyperspectral Image Classification [14.250184447492208]
Hyperspectral image (HSI) classification plays a pivotal role in domains such as environmental monitoring, agriculture, and urban planning. Traditional methods, including machine learning and convolutional neural networks (CNNs), often struggle to effectively capture intricate spectral-spatial features. This work proposes GraphMamba, a hybrid model that combines spectral-spatial token generation, graph-based token prioritization, and cross-attention mechanisms.
arXiv Detail & Related papers (2025-02-10T13:02:19Z)
HRVMamba: High-Resolution Visual State Space Model for Dense Prediction [60.80423207808076]
State Space Models (SSMs) with efficient hardware-aware designs have demonstrated significant potential in computer vision tasks. These models have been constrained by three key challenges: insufficient inductive bias, long-range forgetting, and low-resolution output representation. We introduce the Dynamic Visual State Space (DVSS) block, which employs deformable convolution to mitigate the long-range forgetting problem. We also introduce High-Resolution Visual State Space Model (HRVMamba) based on the DVSS block, which preserves high-resolution representations throughout the entire process.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
HSIGene: A Foundation Model For Hyperspectral Image Generation [46.745198868466545]
Hyperspectral image (HSI) plays a vital role in various fields such as agriculture and environmental monitoring. Due to the expensive acquisition cost, the number of hyperspectral images is limited, degenerating the performance of downstream tasks. We propose HSIGene, a novel HSI generation foundation model which is based on latent diffusion and supports multi-condition control. Experiments demonstrate that the proposed model is capable of generating a vast quantity of realistic HSIs for downstream tasks such as denoising and super-resolution.
arXiv Detail & Related papers (2024-09-19T05:17:44Z)
Unsupervised Hyperspectral and Multispectral Image Blind Fusion Based on Deep Tucker Decomposition Network with Spatial-Spectral Manifold Learning [15.86617273658407]
We propose an unsupervised blind fusion method for hyperspectral and multispectral images based on Tucker decomposition and spatial-spectral manifold learning (DTDNML). We show that this method enhances the accuracy and efficiency of hyperspectral and multispectral fusion on different remote sensing datasets.
arXiv Detail & Related papers (2024-09-15T08:58:26Z)
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation [74.65906322148997]
We introduce a new object detection method that integrates hypergraph computations to capture the complex high-order correlations among visual features. Hyper-YOLO significantly outperforms the advanced YOLOv8-N and YOLOv9-T with 12% $\text{AP}^{val}$ and 9% $\text{AP}^{val}$ improvements.
arXiv Detail & Related papers (2024-08-09T01:21:15Z)
Cross-Scope Spatial-Spectral Information Aggregation for Hyperspectral Image Super-Resolution [47.12985199570964]
We propose a novel cross-scope spatial-spectral Transformer (CST) to investigate long-range spatial and spectral similarities for single hyperspectral image super-resolution. Specifically, we devise cross-attention mechanisms in spatial and spectral dimensions to comprehensively model the long-range spatial-spectral characteristics. Experiments over three hyperspectral datasets demonstrate that the proposed CST is superior to other state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2023-11-29T03:38:56Z)
ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation. We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
Unsupervised Hyperspectral and Multispectral Images Fusion Based on the Cycle Consistency [21.233354336608205]
We propose an unsupervised HSI and MSI fusion model based on cycle consistency, called CycFusion. CycFusion learns the domain transformation between low spatial resolution HSI (LrHSI) and high spatial resolution MSI (HrMSI). Experiments conducted on several datasets show that our proposed model outperforms all compared unsupervised fusion methods.
arXiv Detail & Related papers (2023-07-07T06:47:15Z)
Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images. S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z)
Hyperspectral Image Segmentation based on Graph Processing over Multilayer Networks [51.15952040322895]
One important task of hyperspectral image (HSI) processing is the extraction of spectral-spatial features. We propose several approaches to HSI segmentation based on M-GSP feature extraction. Our experimental results demonstrate the strength of M-GSP in HSI processing and spectral-spatial information extraction.
arXiv Detail & Related papers (2021-11-29T23:28:18Z)
Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing [57.52865154829273]
Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience and remote sensing (RS). In the past decade, efforts have been made to process and analyze these hyperspectral (HS) products, mainly by means of seasoned experts. For this reason, it is urgent to develop more intelligent and automatic approaches for various HS RS applications.
arXiv Detail & Related papers (2021-03-02T03:32:10Z)
Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI. Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery [79.69449412334188]
In this paper, we investigate how to adapt state-of-the-art residual learning based single gray/RGB image super-resolution approaches. We introduce a spatial-spectral prior network (SSPN) to fully exploit the spatial information and the correlation between the spectra of the hyperspectral data. Experimental results on some hyperspectral images demonstrate that the proposed SSPSR method enhances the details of the recovered high-resolution hyperspectral images.
arXiv Detail & Related papers (2020-05-18T14:25:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.