Related papers: GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection

URL: http://arxiv.org/abs/2304.08687v1
Date: Tue, 18 Apr 2023 01:43:17 GMT
Title: GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection
Authors: Meiqi Hu, Chen Wu, Liangpei Zhang
Abstract summary: High spectral resolution imagery of the Earth's surface enables users to monitor changes over time at a fine-grained scale. Most current algorithms are still confined to describing local features and fail to incorporate a global perspective. We propose a Global Multi-head INteractive self-attention change Detection network (GlobalMind) to explore the implicit correlation between different surface objects and variant land cover transformations.
Score: 22.22495802857453
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: High spectral resolution imagery of the Earth's surface enables users to monitor changes over time at a fine-grained scale, playing an increasingly important role in agriculture, defense, and emergency response. However, most current algorithms are still confined to describing local features and fail to incorporate a global perspective, which limits their ability to capture interactions between global features and usually results in incomplete change regions. In this paper, we propose a Global Multi-head INteractive self-attention change Detection network (GlobalMind) to explore the implicit correlation between different surface objects and variant land cover transformations, acquiring a comprehensive understanding of the data and accurate change detection results. First, a simple but effective Global Axial Segmentation (GAS) strategy is designed to expand the self-attention computation along the row space or column space of hyperspectral images, allowing global connection with high efficiency. Second, with GAS, the global spatial multi-head interactive self-attention (Global-M) module is crafted to mine the abundant spatial-spectral features involving potential correlations between ground objects across the entire rich and complex hyperspectral space. Moreover, to acquire accurate and complete cross-temporal changes, we devise a global temporal interactive multi-head self-attention (GlobalD) module which incorporates the relevance and variation of bi-temporal spatial-spectral features, deriving the integrated potential changes of the same kind in the local and global range with the combination of GAS. We perform extensive experiments on five widely used hyperspectral datasets, and our method outperforms the state-of-the-art algorithms with high accuracy and efficiency.
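Illustration: the abstract describes GAS as running self-attention along the row space or column space of the image. The sketch below shows generic axial self-attention in NumPy to make that idea concrete. It is not the authors' implementation: the function names are hypothetical, and the learned query/key/value projections and multiple heads of a real network are replaced by identity projections and a single head. For an H x W feature map, attention over all pixels compares (H*W)^2 pairs, while attention restricted to rows and columns compares H*W*(H+W) pairs.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def axial_self_attention(feat, axis):
    """Single-head self-attention along one spatial axis of a (H, W, C) map.

    axis=1: each row is a sequence of W pixels (attention within rows).
    axis=0: each column is a sequence of H pixels (attention within columns).
    Assumption: identity Q/K/V projections, used here only for illustration.
    """
    if axis == 1:
        seqs = feat                       # (H, W, C): H sequences of length W
    elif axis == 0:
        seqs = feat.transpose(1, 0, 2)    # (W, H, C): W sequences of length H
    else:
        raise ValueError("axis must be 0 (columns) or 1 (rows)")

    channels = seqs.shape[-1]
    # Scaled dot-product scores between all positions of the same sequence.
    scores = seqs @ seqs.transpose(0, 2, 1) / np.sqrt(channels)  # (N, L, L)
    weights = softmax(scores, axis=-1)
    out = weights @ seqs                                         # (N, L, C)
    return out if axis == 1 else out.transpose(1, 0, 2)


# Toy feature map: 4 rows, 5 columns, 3 channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 3))
row_out = axial_self_attention(x, axis=1)
col_out = axial_self_attention(x, axis=0)
```

Applying the row pass and then the column pass lets every pixel receive information from every other pixel in two steps, which is one common way axial schemes approximate a global receptive field.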

Related papers

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration [0.9790236766474198]
Local-Global Attention is designed to better integrate both local and global contextual features. We have thoroughly evaluated the Local-Global Attention mechanism on several widely used object detection and classification datasets.
arXiv Detail & Related papers (2024-11-14T17:22:16Z)
Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images [24.06927394483275]
We propose a stronger multifaceted collaborative salient object detector in ORSIs, termed LBA-MCNet. The network focuses on accurately locating targets, balancing detailed features, and modeling image-level global context information.
arXiv Detail & Related papers (2024-10-31T14:50:48Z)
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model [88.13261547704444]
HyperSIGMA is a vision transformer-based foundation model that unifies HSI interpretation across tasks and scenes. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images.
arXiv Detail & Related papers (2024-06-17T13:22:58Z)
DRSI-Net: Dual-Residual Spatial Interaction Network for Multi-Person Pose Estimation [7.418828517897727]
We propose a dual-residual spatial interaction network (DRSI-Net) for MPPE with high accuracy and low complexity. Compared to other methods, DRSI-Net performs residual spatial information interactions on the neighbouring features. The proposed DRSI-Net outperforms other state-of-the-art methods in accuracy and complexity.
arXiv Detail & Related papers (2024-02-26T15:10:22Z)
Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition [54.334773598942775]
Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition (CD-FER). We propose an Adaptive Global-Local Representation Learning and Selection framework.
arXiv Detail & Related papers (2024-01-20T02:21:41Z)
Global Feature Pyramid Network [1.2473780585666772]
The visual feature pyramid has proven its effectiveness and efficiency in target detection tasks. Current methodologies tend to overly emphasize inter-layer feature interaction, neglecting the crucial aspect of intra-layer feature adjustment.
arXiv Detail & Related papers (2023-12-18T14:30:41Z)
Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [55.69322525367221]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation. To fully utilize the local and global temporal information within frames, we systematically modeled the short-term and long-term temporal discrepancies. Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z)
Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications. In this paper, we propose a Centralized Feature Pyramid for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z)
Video Salient Object Detection via Adaptive Local-Global Refinement [7.723369608197167]
Video salient object detection (VSOD) is an important task in many vision applications. We propose an adaptive local-global refinement framework for VSOD. We show that our weighting methodology can further exploit the feature correlations, thus driving the network to learn more discriminative feature representation.
arXiv Detail & Related papers (2021-04-29T14:14:11Z)
A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC). First, a spatial-temporal attention mechanism is presented to explore the most useful and important information. Second, we conduct a joint feature sequence based on the sequence and instant state information to make the generative trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features. We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN [117.80737222754306]
We present a novel universal object detector called Universal-RCNN. We first generate a global semantic pool by integrating all high-level semantic representation of all the categories. An Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset guided by a spatial-aware GCN.
arXiv Detail & Related papers (2020-02-18T07:57:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.