Unsupervised Hyperspectral and Multispectral Image Fusion via Self-Supervised Modality Decoupling
- URL: http://arxiv.org/abs/2412.04802v3
- Date: Tue, 22 Apr 2025 04:20:45 GMT
- Title: Unsupervised Hyperspectral and Multispectral Image Fusion via Self-Supervised Modality Decoupling
- Authors: Songcheng Du, Yang Zou, Zixu Wang, Xingyuan Li, Ying Li, Changjing Shang, Qiang Shen
- Abstract summary: Hyperspectral and Multispectral Image Fusion (HMIF) aims to fuse low-resolution hyperspectral images (LR-HSIs) and high-resolution multispectral images (HR-MSIs). Existing HMIF methods typically apply direct fusion of the two modalities without effective supervision. We propose an end-to-end self-supervised Modality-Decoupled Spatial-Spectral Fusion (MossFuse) framework.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperspectral and Multispectral Image Fusion (HMIF) aims to fuse low-resolution hyperspectral images (LR-HSIs) and high-resolution multispectral images (HR-MSIs) to reconstruct images with both high spatial and high spectral resolution. Current methods typically apply direct fusion of the two modalities without effective supervision, leading to an incomplete perception of deep modality-complementary information and a limited understanding of inter-modality correlations. To address these issues, we propose a simple yet effective solution for unsupervised HMIF, revealing that modality decoupling is key to improving fusion performance. Specifically, we propose an end-to-end self-supervised Modality-Decoupled Spatial-Spectral Fusion (MossFuse) framework that decouples shared and complementary information across modalities and aggregates a concise representation of both LR-HSIs and HR-MSIs to reduce modality redundancy. We also introduce a subspace clustering loss as a clear guide for decoupling modality-shared features from modality-complementary ones. Systematic experiments over multiple datasets demonstrate that our simple and effective approach consistently outperforms existing HMIF methods while requiring considerably fewer parameters and reduced inference time. The source code is available at https://github.com/dusongcheng/MossFuse.
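The decoupling idea lends itself to a compact illustration: encode each modality, split the features into a shared part and a complementary part, then pull the shared parts together while decorrelating each complementary part from them. The sketch below is a minimal PyTorch illustration under assumed layer sizes and loss weights; it substitutes a simple align-and-decorrelate objective for the paper's subspace clustering loss and is not the actual MossFuse architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledEncoder(nn.Module):
    """Toy encoder that splits one modality into shared and complementary features."""
    def __init__(self, in_ch, feat_ch=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.to_shared = nn.Conv2d(feat_ch, feat_ch, 3, padding=1)
        self.to_comp = nn.Conv2d(feat_ch, feat_ch, 3, padding=1)

    def forward(self, x):
        f = self.stem(x)
        return self.to_shared(f), self.to_comp(f)

def decoupling_loss(shared_hsi, shared_msi, comp_hsi, comp_msi):
    # Pull the modality-shared features of the two inputs together...
    align = F.l1_loss(shared_hsi, shared_msi)
    # ...and keep each complementary feature decorrelated from the shared one,
    # so the complementary branch is forced to carry modality-specific detail.
    def corr(a, b):
        a = a.flatten(1) - a.flatten(1).mean(dim=1, keepdim=True)
        b = b.flatten(1) - b.flatten(1).mean(dim=1, keepdim=True)
        return (a * b).sum(dim=1) / (a.norm(dim=1) * b.norm(dim=1) + 1e-8)
    decorrelate = corr(comp_hsi, shared_hsi).abs().mean() + \
                  corr(comp_msi, shared_msi).abs().mean()
    return align + 0.1 * decorrelate  # 0.1 is an arbitrary illustrative weight

# Usage: LR-HSI upsampled to the MSI grid, then both encoded and decoupled.
lr_hsi = torch.rand(1, 31, 16, 16)   # 31 spectral bands, low spatial resolution
hr_msi = torch.rand(1, 3, 64, 64)    # 3 bands, high spatial resolution
hsi_up = F.interpolate(lr_hsi, size=hr_msi.shape[-2:], mode='bilinear', align_corners=False)
enc_hsi, enc_msi = DecoupledEncoder(31), DecoupledEncoder(3)
s_h, c_h = enc_hsi(hsi_up)
s_m, c_m = enc_msi(hr_msi)
print(decoupling_loss(s_h, s_m, c_h, c_m))
```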
Related papers
- FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.
We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF).
PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models.
FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
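Frequency decoupling of this kind is often realized with a Fourier-domain split. Below is a generic sketch (the radial cutoff and hard mask are arbitrary choices, not FreDF's actual module):

```python
import torch

def frequency_split(x, cutoff=0.1):
    """Split a (B, C, H, W) batch into low- and high-frequency components."""
    H, W = x.shape[-2:]
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    # Radial low-pass mask centered on the zero-frequency bin.
    yy, xx = torch.meshgrid(torch.arange(H, dtype=x.dtype),
                            torch.arange(W, dtype=x.dtype), indexing='ij')
    dist = ((yy - H // 2) ** 2 + (xx - W // 2) ** 2).sqrt()
    mask = (dist <= cutoff * min(H, W)).to(x.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1))).real
    high = x - low      # the residual carries edges and fine texture
    return low, high

structure, edges = frequency_split(torch.rand(1, 3, 64, 64))
```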
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - Unsupervised Hyperspectral and Multispectral Image Blind Fusion Based on Deep Tucker Decomposition Network with Spatial-Spectral Manifold Learning [15.86617273658407]
We propose an unsupervised blind fusion method for hyperspectral and multispectral images based on a deep Tucker decomposition network with spatial-spectral manifold learning (DTDNML).
We show that this method enhances the accuracy and efficiency of hyperspectral and multispectral fusion on different remote sensing datasets.
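Tucker decomposition represents the hyperspectral cube as a small core tensor multiplied by per-mode factor matrices. A truncated-HOSVD sketch of that factorization (ranks chosen arbitrarily; DTDNML's learned network and manifold terms are omitted):

```python
import torch

def unfold(t, mode):
    """Mode-n unfolding of a 3-way tensor."""
    return t.movedim(mode, 0).reshape(t.shape[mode], -1)

def tucker_hosvd(x, ranks):
    """Truncated HOSVD: x (H, W, C) ~ core x1 U0 x2 U1 x3 U2."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = torch.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(U[:, :r])           # leading r left singular vectors
    core = x
    for U in factors:                      # project each mode onto its factor;
        core = torch.tensordot(core, U, dims=([0], [0]))  # modes rotate left
    return core, factors

def tucker_reconstruct(core, factors):
    x = core
    for U in factors:                      # multiply factors back in, mode by mode
        x = torch.tensordot(x, U.t(), dims=([0], [0]))
    return x

hsi = torch.rand(64, 64, 31)               # H x W x spectral bands
core, factors = tucker_hosvd(hsi, ranks=(16, 16, 8))
approx = tucker_reconstruct(core, factors)
print((hsi - approx).norm() / hsi.norm())  # relative reconstruction error
```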
arXiv Detail & Related papers (2024-09-15T08:58:26Z) - Physics-Inspired Degradation Models for Hyperspectral Image Fusion [61.743696362028246]
Most fusion methods solely focus on the fusion algorithm itself and overlook the degradation models.
We propose physics-inspired degradation models (PIDM) to model the degradation of LR-HSI and HR-MSI.
Our proposed PIDM can boost the fusion performance of existing fusion methods in practical scenarios.
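The two degradations being modeled are the standard observation pair: the LR-HSI as a blurred and downsampled version of the latent HR-HSI, and the HR-MSI as its spectral projection. A linear sketch with an assumed Gaussian PSF and a made-up SRF (PIDM replaces such fixed operators with learned, physics-inspired ones):

```python
import torch
import torch.nn.functional as F

def gaussian_psf(size=7, sigma=2.0):
    ax = torch.arange(size).float() - size // 2
    k = torch.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def spatial_degrade(hr_hsi, psf, scale=4):
    """LR-HSI = blur each band with the PSF, then subsample by `scale`."""
    C = hr_hsi.shape[1]
    kernel = psf.expand(C, 1, -1, -1)            # same PSF for every band
    blurred = F.conv2d(hr_hsi, kernel, padding=psf.shape[-1] // 2, groups=C)
    return blurred[:, :, ::scale, ::scale]

def spectral_degrade(hr_hsi, srf):
    """HR-MSI = integrate spectral bands with the sensor response matrix."""
    return torch.einsum('bchw,mc->bmhw', hr_hsi, srf)

hr_hsi = torch.rand(1, 31, 64, 64)               # latent image behind both observations
srf = torch.softmax(torch.rand(3, 31), dim=1)    # made-up 31-band -> RGB response
lr_hsi = spatial_degrade(hr_hsi, gaussian_psf()) # (1, 31, 16, 16)
hr_msi = spectral_degrade(hr_hsi, srf)           # (1, 3, 64, 64)
```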
arXiv Detail & Related papers (2024-02-04T09:07:28Z) - Multi-Depth Branch Network for Efficient Image Super-Resolution [12.042706918188566]
A longstanding challenge in Super-Resolution (SR) is how to efficiently enhance high-frequency details in Low-Resolution (LR) images.
We propose an asymmetric SR architecture featuring a Multi-Depth Branch Module (MDBM).
MDBMs contain branches of different depths, designed to capture high- and low-frequency information simultaneously and efficiently.
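The branch-of-different-depths idea can be sketched as parallel convolutional paths whose outputs are fused: a shallow path passes smooth content cheaply, while a deeper path adds nonlinearity for fine detail. Depths and channel counts below are illustrative assumptions, not the paper's MDBM:

```python
import torch
import torch.nn as nn

class MultiDepthBranch(nn.Module):
    """Parallel conv branches of different depths, fused by summation."""
    def __init__(self, ch=64, depths=(1, 3)):
        super().__init__()
        self.branches = nn.ModuleList()
        for d in depths:
            layers = []
            for _ in range(d):
                layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
            self.branches.append(nn.Sequential(*layers[:-1]))  # drop trailing ReLU
        self.fuse = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        out = sum(branch(x) for branch in self.branches)
        return x + self.fuse(out)   # residual connection around the module

y = MultiDepthBranch()(torch.rand(1, 64, 32, 32))
```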
arXiv Detail & Related papers (2023-09-29T15:46:25Z) - Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for
Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels.
This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z) - ESSAformer: Efficient Transformer for Hyperspectral Image
Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Compound Attention and Neighbor Matching Network for Multi-contrast MRI
Super-resolution [7.197850827700436]
Multi-contrast super-resolution of MRI can achieve better results than single-image super-resolution.
We propose a novel network architecture with compound-attention and neighbor matching (CANM-Net) for multi-contrast MRI SR.
CANM-Net outperforms state-of-the-art approaches in both retrospective and prospective experiments.
arXiv Detail & Related papers (2023-07-05T09:44:02Z) - A Task-guided, Implicitly-searched and Meta-initialized Deep Model for
Image Fusion [69.10255211811007]
We present a Task-guided, Implicitly-searched and Meta-initialized (TIM) deep model to address the image fusion problem in a challenging real-world scenario.
Specifically, we propose a constrained strategy to incorporate information from downstream tasks to guide the unsupervised learning process of image fusion.
Within this framework, we then design an implicit search scheme to automatically discover compact architectures for our fusion model with high efficiency.
arXiv Detail & Related papers (2023-05-25T08:54:08Z) - An Interactively Reinforced Paradigm for Joint Infrared-Visible Image
Fusion and Saliency Object Detection [59.02821429555375]
This research focuses on the discovery and localization of hidden objects in the wild and serves unmanned systems.
Empirical analysis shows that infrared and visible image fusion (IVIF) makes hard-to-find objects apparent, while multimodal salient object detection (SOD) accurately delineates the precise spatial location of objects within the picture.
arXiv Detail & Related papers (2023-05-17T06:48:35Z) - Implicit Diffusion Models for Continuous Super-Resolution [65.45848137914592]
This paper introduces an Implicit Diffusion Model (IDM) for high-fidelity continuous image super-resolution.
IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework.
The scaling factor regulates the resolution and accordingly modulates the proportion of the LR information and generated features in the final output.
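One way to read that mechanism: the blend between upsampled LR features and generated features is conditioned on the requested scale, so larger magnifications rely more on the generative prior. A toy sketch of scale-conditioned blending (the weighting function is an assumption, not IDM's actual formulation):

```python
import torch

def scale_conditioned_blend(lr_feat, gen_feat, scale, max_scale=8.0):
    # Higher target scale -> less trustworthy LR detail -> lean on generated features.
    alpha = 1.0 - min(scale / max_scale, 1.0)   # weight on the LR branch
    return alpha * lr_feat + (1.0 - alpha) * gen_feat

fused = scale_conditioned_blend(torch.rand(1, 64, 32, 32),
                                torch.rand(1, 64, 32, 32), scale=4.0)
```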
arXiv Detail & Related papers (2023-03-29T07:02:20Z) - CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for
Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
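The correlation-driven part can be sketched as a decomposition objective: base (low-frequency) features from the two modalities should correlate strongly, while detail (high-frequency) features should not. The ratio below is an illustration in that spirit, not CDDFuse's exact loss:

```python
import torch

def pearson(a, b, eps=1e-8):
    """Per-sample correlation between two feature batches of shape (B, C, H, W)."""
    a = a.flatten(1) - a.flatten(1).mean(dim=1, keepdim=True)
    b = b.flatten(1) - b.flatten(1).mean(dim=1, keepdim=True)
    return (a * b).sum(dim=1) / (a.norm(dim=1) * b.norm(dim=1) + eps)

def decomposition_loss(base_a, base_b, detail_a, detail_b, eps=1e-8):
    # Minimizing this pushes detail correlation down and base correlation up.
    return pearson(detail_a, detail_b).abs().mean() / \
           (pearson(base_a, base_b).abs().mean() + eps)

f = lambda: torch.rand(2, 64, 32, 32)
print(decomposition_loss(f(), f(), f(), f()))
```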
arXiv Detail & Related papers (2022-11-26T02:40:28Z) - High-resolution Depth Maps Imaging via Attention-based Hierarchical
Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z) - Multi-Scale Cascading Network with Compact Feature Learning for
RGB-Infrared Person Re-Identification [35.55895776505113]
A Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from the part level to the global level.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z) - Coupled Convolutional Neural Network with Adaptive Response Function
Learning for Unsupervised Hyperspectral Super-Resolution [28.798775822331045]
Hyperspectral super-resolution refers to fusing HSI and MSI to generate an image with both high spatial and high spectral resolutions.
This work proposes HyCoNet, an unsupervised deep learning-based fusion method that solves HSI-MSI fusion without prior PSF and SRF information.
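Operating without a known PSF or SRF can be handled by making both operators trainable and constraining them to be valid (non-negative, normalized). A minimal sketch of that idea, with a softmax parameterization that is an assumption rather than HyCoNet's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableDegradation(nn.Module):
    """Trainable PSF and SRF, kept non-negative and sum-normalized via softmax."""
    def __init__(self, hsi_bands=31, msi_bands=3, ksize=7, scale=4):
        super().__init__()
        self.scale, self.ksize, self.bands = scale, ksize, hsi_bands
        self.psf_logits = nn.Parameter(torch.zeros(ksize * ksize))
        self.srf_logits = nn.Parameter(torch.zeros(msi_bands, hsi_bands))

    def forward(self, hr_hsi):
        psf = torch.softmax(self.psf_logits, dim=0).view(1, 1, self.ksize, self.ksize)
        srf = torch.softmax(self.srf_logits, dim=1)      # each MSI band sums to one
        kernel = psf.expand(self.bands, 1, -1, -1)
        lr_hsi = F.conv2d(hr_hsi, kernel, padding=self.ksize // 2,
                          groups=self.bands)[:, :, ::self.scale, ::self.scale]
        hr_msi = torch.einsum('bchw,mc->bmhw', hr_hsi, srf)
        return lr_hsi, hr_msi  # compare against the observed pair to train

deg = LearnableDegradation()
lr, msi = deg(torch.rand(1, 31, 64, 64))
```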
arXiv Detail & Related papers (2020-07-28T06:17:02Z)