PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening
- URL: http://arxiv.org/abs/2505.23367v2
- Date: Tue, 15 Jul 2025 06:54:06 GMT
- Title: PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening
- Authors: Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk, Jaehyup Lee, Munchurl Kim
- Abstract summary: We propose PAN-Crafter, a modality-consistent alignment framework. At its core, Modality-Adaptive Reconstruction (MARs) enables a single network to jointly reconstruct HRMS and PAN images. Experiments on multiple benchmark datasets demonstrate that our PAN-Crafter outperforms the most recent state-of-the-art method in all metrics.
- Score: 20.43260906326048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: PAN-sharpening aims to fuse high-resolution panchromatic (PAN) images with low-resolution multi-spectral (MS) images to generate high-resolution multi-spectral (HRMS) outputs. However, cross-modality misalignment -- caused by sensor placement, acquisition timing, and resolution disparity -- poses a fundamental challenge. Conventional deep learning methods assume perfect pixel-wise alignment and rely on per-pixel reconstruction losses, leading to spectral distortion, double edges, and blurring when misalignment is present. To address this, we propose PAN-Crafter, a modality-consistent alignment framework that explicitly mitigates the misalignment gap between PAN and MS modalities. At its core, Modality-Adaptive Reconstruction (MARs) enables a single network to jointly reconstruct HRMS and PAN images, leveraging PAN's high-frequency details as auxiliary self-supervision. Additionally, we introduce Cross-Modality Alignment-Aware Attention (CM3A), a novel mechanism that bidirectionally aligns MS texture to PAN structure and vice versa, enabling adaptive feature refinement across modalities. Extensive experiments on multiple benchmark datasets demonstrate that our PAN-Crafter outperforms the most recent state-of-the-art method in all metrics, even with 50.11$\times$ faster inference time and 0.63$\times$ the memory size. Furthermore, it demonstrates strong generalization performance on unseen satellite datasets, showing its robustness across different conditions.
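The bidirectional refinement the abstract attributes to CM3A can be pictured with a minimal single-head cross-attention sketch: MS features attend to PAN features and vice versa, and each modality is updated with the other as context. This is not the authors' implementation; the shapes, the residual update, and the single-head formulation are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, d_k):
    """Single-head attention: `query_feats` (N, d) attends to `context_feats` (M, d)."""
    scores = query_feats @ context_feats.T / np.sqrt(d_k)  # (N, M) affinity
    return softmax(scores, axis=-1) @ context_feats        # (N, d) context summary

rng = np.random.default_rng(0)
ms_feats = rng.standard_normal((64, 32))   # flattened MS feature tokens (assumed shape)
pan_feats = rng.standard_normal((64, 32))  # flattened PAN feature tokens (assumed shape)

# Bidirectional refinement: each modality is residually updated using the other.
ms_refined = ms_feats + cross_attention(ms_feats, pan_feats, d_k=32)
pan_refined = pan_feats + cross_attention(pan_feats, ms_feats, d_k=32)
```

In this reading, the MS branch borrows PAN structure while the PAN branch borrows MS texture, which is one plausible way a network could refine features adaptively across modalities.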
Related papers
- Rotation Equivariant Arbitrary-scale Image Super-Resolution [62.41329042683779]
Arbitrary-scale image super-resolution (ASISR) aims to recover high-resolution images at arbitrary scales from a low-resolution input.
In this study, we construct a rotation-equivariant ASISR method.
arXiv Detail & Related papers (2025-08-07T08:51:03Z)
- AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection [58.67129770371016]
We propose a novel IRSTD framework that reimagines the IRSTD paradigm by incorporating textual metadata for scene-aware optimization.
AuxDet consistently outperforms state-of-the-art methods, validating the critical role of auxiliary information in improving robustness and accuracy.
arXiv Detail & Related papers (2025-05-21T07:02:05Z)
- Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment.
This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain.
Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2025-03-11T11:13:10Z) - Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image [51.333064033152304]
Recently launched satellites can concurrently acquire HSIs and panchromatic (PAN) images.<n>Hipandas is a novel learning paradigm that reconstructs HRHS images from noisy low-resolution HSIs and high-resolution PAN images.
arXiv Detail & Related papers (2024-12-05T14:39:29Z) - Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening [2.874893537471256]
Unfolding fusion methods integrate the powerful representation capabilities of deep learning with the robustness of model-based approaches.
In this paper, we propose a model-based deep unfolded method for satellite image fusion.
Experimental results on the PRISMA, QuickBird, and WorldView-2 datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2024-09-04T13:05:00Z) - MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo [8.303396507129266]
MSP-MVS is a method introducing multi-granularity segmentation prior to edge-confined patch deformation.<n>We implement equidistribution and disassemble-clustering of correlative reliable pixels.<n>We also introduce disparity-sampling synergistic 3D optimization to help identify global-minimum matching costs.
arXiv Detail & Related papers (2024-07-27T19:00:44Z) - CMT: Cross Modulation Transformer with Hybrid Loss for Pansharpening [14.459280238141849]
Pansharpening aims to enhance remote sensing image (RSI) quality by merging high-resolution panchromatic (PAN) with multispectral (MS) images.
Prior techniques struggled to optimally fuse PAN and MS images for enhanced spatial and spectral information.
We present the Cross Modulation Transformer (CMT), a pioneering method that modifies the attention mechanism.
arXiv Detail & Related papers (2024-04-01T13:55:44Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for
Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - HyperTransformer: A Textural and Spectral Feature Fusion Transformer for
Pansharpening [60.89777029184023]
Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution.
Existing pansharpening approaches neglect to use an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions.
We present a novel attention mechanism for pansharpening called HyperTransformer, in which features of LR-HSI and PAN are formulated as queries and keys in a transformer, respectively.
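The query/key formulation described above can be sketched in a few lines: LR-HSI features form the queries, PAN features form the keys, and (as an assumption for illustration) PAN texture features serve as the values, so high-resolution texture is transferred to the spectral branch. This is not the paper's implementation; all dimensions and the value choice are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d = 32  # illustrative feature dimension
rng = np.random.default_rng(1)
q = rng.standard_normal((48, d))   # queries: flattened LR-HSI feature tokens
k = rng.standard_normal((48, d))   # keys: flattened PAN feature tokens
v = rng.standard_normal((48, d))   # values: PAN texture features (assumption)

attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (48, 48) cross-modality affinity
fused = attn @ v                               # PAN texture routed to HSI tokens
```

Each spectral token thus receives a convex combination of PAN texture features weighted by cross-modality similarity, which is the general mechanism the abstract describes.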
arXiv Detail & Related papers (2022-03-04T18:59:08Z) - SIPSA-Net: Shift-Invariant Pan Sharpening with Moving Object Alignment
for Satellite Imagery [36.24121979886052]
Pan-sharpening is the process of merging a high-resolution (HR) panchromatic (PAN) image with its corresponding low-resolution (LR) multi-spectral (MS) image to create a high-resolution multi-spectral (HR-MS), pan-sharpened image.
Due to differences in sensor location, characteristics, and acquisition time, PAN and MS image pairs often exhibit various amounts of misalignment.
We propose shift-invariant pan-sharpening with moving object alignment (SIPSA-Net), the first method to take such large misalignment of moving object regions into account for pan-sharpening.
arXiv Detail & Related papers (2021-05-06T02:27:50Z) - Fast and High-Quality Blind Multi-Spectral Image Pansharpening [48.68143888901669]
We propose a fast approach to blind pansharpening and achieve state-of-the-art image reconstruction quality.
To achieve fast blind pansharpening, we decouple the estimation of the blur kernel from that of the HRMS image.
Our algorithm outperforms state-of-the-art model-based counterparts in terms of both computational time and reconstruction quality.
arXiv Detail & Related papers (2021-03-17T23:12:14Z)