Transformer for Multitemporal Hyperspectral Image Unmixing
- URL: http://arxiv.org/abs/2407.10427v1
- Date: Mon, 15 Jul 2024 04:02:01 GMT
- Title: Transformer for Multitemporal Hyperspectral Image Unmixing
- Authors: Hang Li, Qiankun Dong, Xueshuo Xie, Xia Xu, Tao Li, Zhenwei Shi
- Abstract summary: We propose the Multitemporal Hyperspectral Image Unmixing Transformer (MUFormer), an end-to-end unsupervised deep learning model.
We introduce two key modules: the Global Awareness Module (GAM) and the Change Enhancement Module (CEM).
The synergy between these modules allows for capturing semantic information regarding endmember and abundance changes.
- Score: 17.365895881435563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multitemporal hyperspectral image unmixing (MTHU) holds significant importance in monitoring and analyzing the dynamic changes of the Earth's surface. However, compared to single-temporal unmixing, the multitemporal approach demands comprehensive consideration of information across different phases, rendering it a greater challenge. To address this challenge, we propose the Multitemporal Hyperspectral Image Unmixing Transformer (MUFormer), an end-to-end unsupervised deep learning model. To effectively perform multitemporal hyperspectral image unmixing, we introduce two key modules: the Global Awareness Module (GAM) and the Change Enhancement Module (CEM). The Global Awareness Module computes self-attention across all phases, facilitating global weight allocation. The Change Enhancement Module, in turn, dynamically learns local temporal changes by comparing endmember changes between adjacent phases. The synergy between these modules captures semantic information about endmember and abundance changes, thereby improving multitemporal hyperspectral image unmixing. We conducted experiments on one real dataset and two synthetic datasets, demonstrating that our model significantly improves the quality of multitemporal hyperspectral image unmixing.
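As a rough illustration of the two modules described above, the sketch below implements phase-wise self-attention and adjacent-phase differencing in PyTorch. The module names follow the paper, but every shape, layer choice, and the gating mechanism are assumptions; the abstract does not specify the actual architecture.

```python
import torch
import torch.nn as nn

class GlobalAwarenessModule(nn.Module):
    """Self-attention across all temporal phases (global weight allocation)."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):            # x: (batch, num_phases, dim)
        out, _ = self.attn(x, x, x)  # every phase attends to every other phase
        return x + out               # residual connection (assumed)

class ChangeEnhancementModule(nn.Module):
    """Emphasizes local temporal change between adjacent phases."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):            # x: (batch, num_phases, dim)
        diff = x[:, 1:] - x[:, :-1]                           # adjacent-phase changes
        diff = torch.cat([torch.zeros_like(x[:, :1]), diff], dim=1)
        gate = torch.sigmoid(self.proj(diff))                 # assumed gating
        return x * (1 + gate)        # amplify features where change is large

feats = torch.randn(2, 5, 64)        # 2 samples, 5 phases, 64-dim phase features
out = ChangeEnhancementModule(64)(GlobalAwarenessModule(64)(feats))
```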
Related papers
- Ada-MSHyper: Adaptive Multi-Scale Hypergraph Transformer for Time Series Forecasting [5.431115840202783]
We propose Adaptive Multi-Scale Hypergraph Transformer (Ada-MSHyper) for time series forecasting.
Ada-MSHyper achieves state-of-the-art performance, reducing prediction errors by an average of 4.56%, 10.38%, and 4.97% in MSE for long-range, short-range, and ultra-long-range time series forecasting, respectively.
arXiv Detail & Related papers (2024-10-31T14:51:09Z)
- Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging [111.8456671452411]
Multi-task learning (MTL) leverages a shared model to accomplish multiple tasks and facilitate knowledge transfer.
We propose a Weight-Ensembling Mixture of Experts (WEMoE) method for multi-task model merging.
We show that WEMoE and E-WEMoE outperform state-of-the-art (SOTA) model merging methods in terms of MTL performance, generalization, and robustness.
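The summary does not spell out the merging rule. One common formulation of weight-ensembling merging, sketched below purely as an assumption, treats each fine-tuned model as a task vector relative to the pretrained weights and mixes the task vectors with routing coefficients:

```python
import torch

def merge_weights(w_pretrained, task_weights, routing):
    """Hypothetical weight-ensembling merge: w = w0 + sum_i r_i * (w_i - w0).

    w_pretrained: shared base tensor; task_weights: fine-tuned tensors for
    each task; routing: per-task mixing coefficients (e.g. a router output).
    """
    merged = w_pretrained.clone()
    for r, w_task in zip(routing, task_weights):
        merged += r * (w_task - w_pretrained)   # add the routed task vector
    return merged

w0 = torch.zeros(4, 4)
w_tasks = [torch.ones(4, 4), 2 * torch.ones(4, 4)]
r = torch.softmax(torch.tensor([0.3, 0.7]), dim=0)  # assumed router output
print(merge_weights(w0, w_tasks, r))
```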
arXiv Detail & Related papers (2024-10-29T07:16:31Z)
- Integrated Dynamic Phenological Feature for Remote Sensing Image Land Cover Change Detection [5.109855690325439]
We introduce the InPhea model, which integrates phenological features into a remote sensing image CD framework.
A constrainer with four constraint modules and a multi-stage contrastive learning approach are employed to aid the model's understanding of phenological characteristics.
Experiments on the HRSCD, SECD, and PSCD-Wuhan datasets reveal that InPhea outperforms other models.
arXiv Detail & Related papers (2024-08-08T01:07:28Z)
- Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment [20.902935570581207]
We introduce a Multimodal Alignment and Reconstruction Network (MARNet) to enhance the model's resistance to visual noise.
MARNet includes a cross-modal diffusion reconstruction module for smoothly and stably blending information across different domains.
Experiments conducted on two benchmark datasets, Vireo-Food172 and Ingredient-101, demonstrate that MARNet effectively improves the quality of image information extracted by the model.
arXiv Detail & Related papers (2024-07-26T16:30:18Z)
- Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model [62.337749660637755]
We present cheap and automatic change data generators based on generative models.
Changen2 is a generative change foundation model that can be trained at scale via self-supervision.
The resulting model possesses inherent zero-shot change detection capabilities and excellent transferability.
arXiv Detail & Related papers (2024-06-26T01:03:39Z)
- Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
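The summary names the hybrid convolution-plus-attention pattern without detailing HCANet's layers; a generic sketch of that pattern runs a local convolutional branch and a global attention branch in parallel and fuses them (all layer choices here are assumptions):

```python
import torch
import torch.nn as nn

class HybridConvAttnBlock(nn.Module):
    """Generic conv + attention hybrid: local branch plus global branch."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # local features
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        local = self.conv(x)
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C) tokens for attention
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return x + local + glob                # fuse local and global branches

x = torch.randn(1, 8, 16, 16)
print(HybridConvAttnBlock(8)(x).shape)         # torch.Size([1, 8, 16, 16])
```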
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
- A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novel perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visually appealing fusion results compared with state-of-the-art multi-exposure image fusion approaches.
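As a toy illustration of operating in the frequency domain (not the paper's learned spatial and frequency branches), two exposures can be blended over their FFT coefficients and mapped back to the spatial domain:

```python
import torch

def frequency_fuse(under_exposed, over_exposed, alpha=0.5):
    """Toy frequency-domain fusion of two exposures (illustrative only).

    Blends the FFT coefficients of both inputs with a fixed weight, then
    inverts back to the spatial domain; MEF-SFI itself uses learned spatial
    and frequency branches rather than this fixed rule.
    """
    f1 = torch.fft.fft2(under_exposed)
    f2 = torch.fft.fft2(over_exposed)
    fused = alpha * f1 + (1 - alpha) * f2   # blend frequency coefficients
    return torch.fft.ifft2(fused).real     # back to the spatial domain

a = torch.rand(1, 3, 32, 32)
b = torch.rand(1, 3, 32, 32)
print(frequency_fuse(a, b).shape)           # torch.Size([1, 3, 32, 32])
```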
arXiv Detail & Related papers (2023-12-17T04:45:15Z)
- Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process [21.622442722863028]
We present a scalable multi-temporal remote sensing change data generator via generative modeling.
Our main idea is to simulate a change process over time.
To this end, we present the change generator (Changen), a GAN-based generative probabilistic change model (GPCM) enabling controllable object change data generation.
arXiv Detail & Related papers (2023-09-29T07:37:26Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 (detecting and grounding multi-modal manipulation) problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
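The wavelet decomposition step can be reproduced with PyWavelets: a single-level 2D DWT yields the LL, LH, HL, and HH sub-bands over which the frequency encoder would apply intra-band and inter-band self-attention (the attention layers are omitted in this sketch):

```python
import numpy as np
import pywt

img = np.random.rand(64, 64)                # stand-in for one image channel
ll, (lh, hl, hh) = pywt.dwt2(img, "haar")   # one-level 2D discrete wavelet transform
# ll: low-frequency approximation; lh/hl/hh: horizontal/vertical/diagonal
# detail sub-bands, where high-frequency forgery artifacts tend to concentrate
subbands = np.stack([ll, lh, hl, hh])        # (4, 32, 32), one token set per band
print(subbands.shape)
```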
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
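The stated prior, that imaging responses are equivariant to certain transformations, can be formalized as a consistency penalty: fusing transformed inputs should match transforming the fused output. A minimal sketch, with the fusion model and transformation set as placeholder assumptions:

```python
import torch

def equivariance_loss(fuse, x_ir, x_vis, transform):
    """Penalize ||fuse(T(x)) - T(fuse(x))|| for a transformation T."""
    t_then_fused = fuse(transform(x_ir), transform(x_vis))
    fused_then_t = transform(fuse(x_ir, x_vis))
    return torch.mean((t_then_fused - fused_then_t) ** 2)

fuse = lambda a, b: 0.5 * (a + b)                   # placeholder fusion model
rot90 = lambda t: torch.rot90(t, 1, dims=(-2, -1))  # example transformation
ir, vis = torch.rand(1, 1, 16, 16), torch.rand(1, 1, 16, 16)
print(equivariance_loss(fuse, ir, vis, rot90))      # ~0 for this linear toy
```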
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- TSI: Temporal Saliency Integration for Video Action Recognition [32.18535820790586]
We propose a Temporal Saliency Integration (TSI) block, which mainly contains a Salient Motion Excitation (SME) module and a Cross-scale Temporal Integration (CTI) module.
SME aims to highlight the motion-sensitive area through local-global motion modeling.
CTI is designed to perform multi-scale temporal modeling through a group of separate 1D convolutions, each operating at a different temporal scale.
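As summarized, the multi-scale temporal modeling in CTI amounts to running the temporal axis through parallel 1D convolutions of different kernel sizes; a minimal sketch (the channel grouping and fusion rule are assumptions):

```python
import torch
import torch.nn as nn

class MultiScaleTemporalConv(nn.Module):
    """Parallel 1D temporal convolutions at several scales, summed."""
    def __init__(self, channels, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes
        )

    def forward(self, x):                 # x: (batch, channels, frames)
        return sum(branch(x) for branch in self.branches)

clip = torch.randn(2, 16, 8)              # 8-frame feature sequence
print(MultiScaleTemporalConv(16)(clip).shape)   # torch.Size([2, 16, 8])
```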
arXiv Detail & Related papers (2021-06-02T11:43:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.