SALAD-Pan: Sensor-Agnostic Latent Adaptive Diffusion for Pan-Sharpening
- URL: http://arxiv.org/abs/2602.04473v1
- Date: Wed, 04 Feb 2026 12:01:07 GMT
- Title: SALAD-Pan: Sensor-Agnostic Latent Adaptive Diffusion for Pan-Sharpening
- Authors: Junjie Li, Congyang Ou, Haokui Zhang, Guoting Wei, Shengqin Jiang, Ying Li, Chunhua Shen
- Abstract summary: SALAD-Pan is a sensor-agnostic latent space diffusion method for efficient pansharpening. It trains a band-wise single-channel VAE to encode high-resolution multispectral images into compact latent representations. It achieves high-precision fusion in the diffusion process and robust zero-shot (cross-sensor) capability.
- Score: 50.44337053599724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, diffusion models have brought novel insights to pan-sharpening and notably boosted fusion precision. However, most existing models perform diffusion in the pixel space and train distinct models for different multispectral (MS) imagery, suffering from high latency and sensor-specific limitations. In this paper, we present SALAD-Pan, a sensor-agnostic latent space diffusion method for efficient pansharpening. Specifically, SALAD-Pan trains a band-wise single-channel VAE to encode high-resolution multispectral (HRMS) images into compact latent representations, supporting MS images with various channel counts and establishing a basis for acceleration. Then spectral physical properties, along with PAN and MS images, are injected into the diffusion backbone through unidirectional and bidirectional interactive control structures respectively, achieving high-precision fusion in the diffusion process. Finally, a lightweight cross-spectral attention module is added to the central layer of the diffusion model, reinforcing spectral connections to boost spectral consistency and further elevate fusion precision. Experimental results on GaoFen-2, QuickBird, and WorldView-3 demonstrate that SALAD-Pan outperforms state-of-the-art diffusion-based methods across all three datasets, attains a 2-3x inference speedup, and exhibits robust zero-shot (cross-sensor) capability.
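The band-wise encoding idea in the abstract can be illustrated with a minimal NumPy sketch. The paper's actual VAE architecture is not described here, so `encode_band` below is a hypothetical stand-in (plain average pooling) for the learned single-channel encoder; the point is only that applying one shared single-channel encoder per band supports MS images with any channel count:

```python
import numpy as np

def encode_band(band, factor=4):
    # Stand-in for the learned single-channel VAE encoder:
    # average pooling down to a compact latent grid.
    h, w = band.shape
    return band.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def encode_bandwise(ms_image, factor=4):
    # Encode each spectral band independently with the shared
    # single-channel encoder, then stack the per-band latents.
    return np.stack(
        [encode_band(ms_image[..., c], factor) for c in range(ms_image.shape[-1])],
        axis=-1,
    )

# The same encoder handles 4-band and 8-band imagery unchanged.
ms4 = np.random.rand(64, 64, 4)
ms8 = np.random.rand(64, 64, 8)
print(encode_bandwise(ms4).shape)  # (16, 16, 4)
print(encode_bandwise(ms8).shape)  # (16, 16, 8)
```

Because the encoder never sees more than one channel at a time, the same weights transfer across sensors with different band counts, which is the basis of the sensor-agnostic claim.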
Related papers
- Universal Pansharpening Foundation Model [67.10467574892282]
Pansharpening generates the high-resolution multi-spectral (MS) image by integrating spatial details from a texture-rich panchromatic (PAN) image and spectral attributes from a low-resolution MS image. We present FoundPS, a universal pansharpening foundation model for satellite-agnostic and scene-robust fusion.
arXiv Detail & Related papers (2026-03-04T08:30:15Z)
- DIFF-MF: A Difference-Driven Channel-Spatial State Space Model for Multi-Modal Image Fusion [51.07069814578009]
Multi-modal image fusion aims to integrate complementary information from multiple source images to produce high-quality fused images with enriched content. We propose DIFF-MF, a novel difference-driven channel-spatial state space model for multi-modal image fusion. Our method outperforms existing approaches in both visual quality and quantitative evaluation.
arXiv Detail & Related papers (2026-01-09T05:26:54Z)
- FOD-Diff: 3D Multi-Channel Patch Diffusion Model for Fiber Orientation Distribution [48.932538822216436]
Estimating FOD from single-shell low angular resolution dMRI (LAR-FOD) is limited by accuracy, whereas estimating FOD from multi-shell high angular resolution dMRI (HAR-FOD) requires a long scanning time. We propose a 3D multi-channel patch diffusion model to predict HAR-FOD from LAR-FOD. Our method achieves the best performance in HAR-FOD prediction and outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2025-12-18T01:51:05Z)
- CHORDS: Diffusion Sampling Accelerator with Multi-core Hierarchical ODE Solvers [72.23291099555459]
Diffusion-based generative models have become dominant generators of high-fidelity images and videos but remain limited by their computationally expensive inference procedures. This paper explores a general, training-free, and model-agnostic acceleration strategy via multi-core parallelism. CHORDS significantly accelerates sampling across diverse large-scale image and video diffusion models, yielding up to 2.1x speedup with four cores (a 50% improvement over baselines) and 2.9x speedup with eight cores, all without quality degradation.
arXiv Detail & Related papers (2025-07-21T05:48:47Z)
- Kernel Space Diffusion Model for Efficient Remote Sensing Pansharpening [8.756657890124766]
Kernel Space Diffusion Model (KSDiff) is a novel approach that leverages diffusion processes in a latent space to generate convolutional kernels enriched with global contextual information. Experiments on three widely used datasets, including WorldView-3, GaoFen-2, and QuickBird, demonstrate the superior performance of KSDiff both qualitatively and quantitatively.
arXiv Detail & Related papers (2025-05-25T06:25:31Z)
- SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening [14.293042131263924]
We introduce a spatial-spectral integrated diffusion model for the remote sensing pansharpening task, called SSDiff.
SSDiff considers the pansharpening process as the fusion process of spatial and spectral components from the perspective of subspace decomposition.
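The subspace-decomposition view mentioned above can be illustrated with a toy NumPy example. SSDiff's actual decomposition is learned inside a diffusion model and is not specified here; this sketch only shows the underlying idea of factoring a multispectral image into a spectral basis (the spectral component) times spatial coefficient maps (the spatial component):

```python
import numpy as np

# Model a C-band image, flattened to C x (H*W), as the product of a
# spectral basis (C x r) and r spatial coefficient maps (r x H*W).
C, H, W, r = 8, 32, 32, 3
rng = np.random.default_rng(0)
basis = rng.standard_normal((C, r))       # spectral component
coeffs = rng.standard_normal((r, H * W))  # spatial component
hrms = basis @ coeffs                     # rank-r multispectral image

# For this toy rank-r image, a truncated SVD recovers an equivalent
# spatial-spectral factorization exactly (up to float tolerance).
U, S, Vt = np.linalg.svd(hrms, full_matrices=False)
recon = U[:, :r] @ np.diag(S[:r]) @ Vt[:r]
print(np.allclose(hrms, recon))  # True
```

In pansharpening terms, the spatial component carries the PAN-like detail shared across bands, while the low-dimensional spectral basis carries the per-band response, which is why treating the two separately can help.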
arXiv Detail & Related papers (2024-04-17T16:30:56Z)
- DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation [34.42067276754897]
DifFUSER is a novel approach that leverages diffusion models for multi-modal fusion in 3D object detection and BEV map segmentation.
Benefiting from the inherent denoising property of diffusion, DifFUSER is able to refine or even synthesize sensor features in case of sensor malfunction.
arXiv Detail & Related papers (2024-04-06T13:25:29Z)
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models [82.8261101680427]
Smooth latent spaces ensure that a perturbation on an input latent corresponds to a steady change in the output image.
This property proves beneficial in downstream tasks, including image interpolation, inversion, and editing.
We propose Smooth Diffusion, a new category of diffusion models that can be simultaneously high-performing and smooth.
arXiv Detail & Related papers (2023-12-07T16:26:23Z)
- ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge [63.00793292863]
ToddlerDiffusion is a novel approach to decomposing the complex task of RGB image generation into simpler, interpretable stages.
Our method, termed ToddlerDiffusion, cascades modality-specific models, each responsible for generating an intermediate representation.
ToddlerDiffusion consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-11-24T15:20:01Z)
- DiffUCD: Unsupervised Hyperspectral Image Change Detection with Semantic Correlation Diffusion Model [46.68717345017946]
Hyperspectral image change detection (HSI-CD) has emerged as a crucial research area in remote sensing.
We propose DiffUCD, a novel unsupervised HSI-CD method based on a semantic correlation diffusion model.
Our method achieves results comparable to fully supervised methods that require numerous labeled samples.
arXiv Detail & Related papers (2023-05-21T09:21:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.