CMT: Cross Modulation Transformer with Hybrid Loss for Pansharpening
- URL: http://arxiv.org/abs/2404.01121v1
- Date: Mon, 1 Apr 2024 13:55:44 GMT
- Title: CMT: Cross Modulation Transformer with Hybrid Loss for Pansharpening
- Authors: Wen-Jie Shu, Hong-Xia Dou, Rui Wen, Xiao Wu, Liang-Jian Deng,
- Abstract summary: Pansharpening aims to enhance remote sensing image (RSI) quality by merging high-resolution panchromatic (PAN) with multispectral (MS) images.
Prior techniques struggled to optimally fuse PAN and MS images for enhanced spatial and spectral information.
We present the Cross Modulation Transformer (CMT), a pioneering method that modifies the attention mechanism.
- Score: 14.459280238141849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pansharpening aims to enhance remote sensing image (RSI) quality by merging high-resolution panchromatic (PAN) with multispectral (MS) images. However, prior techniques struggled to optimally fuse PAN and MS images for enhanced spatial and spectral information, due to a lack of a systematic framework capable of effectively coordinating their individual strengths. In response, we present the Cross Modulation Transformer (CMT), a pioneering method that modifies the attention mechanism. This approach utilizes a robust modulation technique from signal processing, integrating it into the attention mechanism's calculations. It dynamically tunes the weights of the carrier's value (V) matrix according to the modulator's features, thus resolving historical challenges and achieving a seamless integration of spatial and spectral attributes. Furthermore, considering that RSI exhibits large-scale features and edge details along with local textures, we crafted a hybrid loss function that combines Fourier and wavelet transforms to effectively capture these characteristics, thereby enhancing both spatial and spectral accuracy in pansharpening. Extensive experiments demonstrate our framework's superior performance over existing state-of-the-art methods. The code will be publicly available to encourage further research.
Related papers
- Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening [2.874893537471256]
Unfolding fusion methods integrate the powerful representation capabilities of deep learning with the robustness of model-based approaches.
In this paper, we propose a model-based deep unfolded method for satellite image fusion.
Experimental results on PRISMA, Quickbird, and WorldView2 datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2024-09-04T13:05:00Z) - Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement [51.557804095896174]
We introduce a State Space Model with Across-Scanning and Local Enhancement, named ASLE-SSM, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promoting.
Experimental results illustrate ASLE-SSM's superiority over existing state-of-the-art methods, with an inference speed 2.4 times faster than Transformer-based MST and saving 0.12 (M) of parameters.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves 98.33%, 96.65%, and 96.62% of percentage of correct classification (PCC) on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z) - ESSAformer: Efficient Transformer for Hyperspectral Image
Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based
Transformer Network for Remote Sensing Image Super-Resolution [13.894645293832044]
Transformer-based models have shown competitive performance in remote sensing image super-resolution (RSISR)
We propose a novel transformer architecture called Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network (SPIFFNet) for RSISR.
Our proposed model effectively enhances global cognition and understanding of the entire image, facilitating efficient integration of features cross-stages.
arXiv Detail & Related papers (2023-07-06T13:19:06Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for
Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - HyperTransformer: A Textural and Spectral Feature Fusion Transformer for
Pansharpening [60.89777029184023]
Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution.
Existing pansharpening approaches neglect using an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions.
We present a novel attention mechanism for pansharpening called HyperTransformer, in which features of LR-HSI and PAN are formulated as queries and keys in a transformer, respectively.
arXiv Detail & Related papers (2022-03-04T18:59:08Z) - Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image
Reconstruction [127.20208645280438]
Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement.
Modeling the inter-spectra interactions is beneficial for HSI reconstruction.
Mask-guided Spectral-wise Transformer (MST) proposes a novel framework for HSI reconstruction.
arXiv Detail & Related papers (2021-11-15T16:59:48Z) - LADMM-Net: An Unrolled Deep Network For Spectral Image Fusion From
Compressive Data [6.230751621285322]
Hyperspectral (HS) and multispectral (MS) image fusion aims at estimating a high-resolution spectral image from a low-spatial-resolution HS image and a low-spectral-resolution MS image.
In this work, a deep learning architecture under the algorithm unrolling approach is proposed for solving the fusion problem from HS and MS compressive measurements.
arXiv Detail & Related papers (2021-03-01T12:04:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.