Dig2DIG: Dig into Diffusion Information Gains for Image Fusion
- URL: http://arxiv.org/abs/2503.18627v1
- Date: Mon, 24 Mar 2025 12:43:11 GMT
- Title: Dig2DIG: Dig into Diffusion Information Gains for Image Fusion
- Authors: Bing Cao, Baoshuo Cai, Changqing Zhang, Qinghua Hu
- Abstract summary: We introduce diffusion information gains (DIG) to quantify the information contribution of each modality at different denoising steps. Our method outperforms existing diffusion-based approaches in terms of both fusion quality and inference efficiency.
- Score: 46.504772732456196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image fusion integrates complementary information from multi-source images to generate more informative results. Recently, the diffusion model, which demonstrates unprecedented generative potential, has been explored in image fusion. However, these approaches typically incorporate predefined multimodal guidance into diffusion, failing to capture the dynamically changing significance of each modality, while lacking theoretical guarantees. To address this issue, we reveal a significant spatio-temporal imbalance in image denoising; specifically, the diffusion model produces dynamic information gains in different image regions with denoising steps. Based on this observation, we Dig into the Diffusion Information Gains (Dig2DIG) and theoretically derive a diffusion-based dynamic image fusion framework that provably reduces the upper bound of the generalization error. Accordingly, we introduce diffusion information gains (DIG) to quantify the information contribution of each modality at different denoising steps, thereby providing dynamic guidance during the fusion process. Extensive experiments on multiple fusion scenarios confirm that our method outperforms existing diffusion-based approaches in terms of both fusion quality and inference efficiency.
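The abstract suggests a simple per-step recipe: run one guided denoising update per modality, score each modality's per-pixel contribution, and blend. Below is a minimal sketch of that idea; the gain proxy and the names (`information_gain`, `fuse_step`, `step_fns`) are illustrative assumptions, not the paper's actual DIG definition.

```python
# Minimal sketch of DIG-style dynamic fusion (illustrative, not the
# authors' code): per-modality denoising updates are blended with
# spatially varying weights from a per-step information-gain proxy.
import torch

def information_gain(x_t, x_candidate):
    # Proxy for DIG: per-pixel magnitude of the update that one
    # modality's guidance contributes at this denoising step.
    return (x_candidate - x_t).abs().mean(dim=1, keepdim=True)   # (B,1,H,W)

def fuse_step(x_t, step_fns, eps=1e-8):
    # x_t: current noisy latent (B,C,H,W); step_fns: one guided
    # denoising callable per modality, each mapping x_t -> x_{t-1}.
    candidates = [f(x_t) for f in step_fns]
    gains = torch.stack([information_gain(x_t, c) for c in candidates])
    weights = gains / (gains.sum(dim=0, keepdim=True) + eps)     # sum to 1 per pixel
    return sum(w * c for w, c in zip(weights, candidates))       # dynamic fusion
```

Because the weights are recomputed at every step, a modality that contributes little at a given stage or region is automatically downweighted, which matches the spatio-temporal imbalance the abstract describes.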
Related papers
- Dream-IF: Dynamic Relative EnhAnceMent for Image Fusion [48.06078830638296]
We introduce the concept of dominant regions for image enhancement and present a Dynamic Relative EnhAnceMent framework for Image Fusion (Dream-IF).
This framework quantifies the relative dominance of each modality across different layers and leverages this information to facilitate reciprocal cross-modal enhancement.
We employ prompt-based encoding to capture degradation-specific details, which dynamically steer the restoration process and promote coordinated enhancement.
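As a rough illustration of the dominance idea described above, the toy sketch below computes a soft per-pixel dominance map from two modalities' feature energies and uses it for reciprocal enhancement; the energy-based formula and all names are assumptions, not the Dream-IF architecture.

```python
# Toy sketch of per-layer "relative dominance" and reciprocal
# cross-modal enhancement (illustrative, not the Dream-IF design).
import torch

def relative_dominance(feat_a, feat_b, eps=1e-8):
    # Soft per-pixel dominance of modality A over B at one layer,
    # measured by channel-averaged activation energy.
    e_a = feat_a.pow(2).mean(dim=1, keepdim=True)   # (B,1,H,W)
    e_b = feat_b.pow(2).mean(dim=1, keepdim=True)
    return e_a / (e_a + e_b + eps)                  # values in (0,1)

def cross_enhance(feat_a, feat_b):
    # Reciprocal enhancement: each modality is reinforced by the
    # other in the regions where the other dominates.
    d = relative_dominance(feat_a, feat_b)
    return feat_a + (1 - d) * feat_b, feat_b + d * feat_a
```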
arXiv Detail & Related papers (2025-03-13T07:08:35Z)
- SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation [29.49217721233131]
Diffusion generative models simulate a random walk in the data space along the denoising trajectory, allowing information to diffuse across image regions. However, the chaotic and disordered nature of this information diffusion often results in undesired interference between image regions, degrading detail preservation and contextual consistency. We reframe disordered diffusion as a powerful tool for text-vision-to-image generation (TV2I) tasks, achieving pixel-level condition fidelity while maintaining visual and semantic coherence throughout the image.
arXiv Detail & Related papers (2024-11-28T14:35:25Z)
- Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas [33.334956022229846]
We propose the Merge-Attend-Diffuse operator, which can be plugged into different types of pretrained diffusion models used in a joint diffusion setting.
Specifically, we merge the diffusion paths, reprogramming self- and cross-attention to operate on the aggregated latent space.
Our method maintains compatibility with the input prompt and visual quality of the generated images while increasing their semantic coherence.
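A stripped-down sketch of the path-merging step in a joint-diffusion panorama setting follows; it only averages overlapping window latents (the attention reprogramming that gives the Merge-Attend-Diffuse operator its name is omitted), and the window/stride interface is an assumption.

```python
# Simplified "merge the diffusion paths" step for panoramas: overlapping
# windows are denoised independently, then averaged back into one latent.
# MAD additionally reprograms self-/cross-attention; omitted here.
import torch

def merge_paths(panorama, window, stride, step_fn):
    # panorama: (B,C,H,W) latent; step_fn denoises one (B,C,H,window)
    # view. Assumes the stride tiles the full width.
    acc = torch.zeros_like(panorama)
    cnt = torch.zeros_like(panorama)
    for x0 in range(0, panorama.shape[-1] - window + 1, stride):
        acc[..., x0:x0 + window] += step_fn(panorama[..., x0:x0 + window])
        cnt[..., x0:x0 + window] += 1
    return acc / cnt.clamp(min=1)   # overlaps are averaged into a shared path
```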
arXiv Detail & Related papers (2024-08-28T09:22:32Z)
- A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novel perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visually appealing fusion results compared with state-of-the-art multi-exposure image fusion approaches.
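As a generic illustration of spatial-frequency integration (not the MEF-SFI network), the sketch below derives a global frequency cue per exposure via an FFT and uses it to weight a simple fusion; the weighting rule is an assumption.

```python
# Generic spatial-frequency illustration: weight each exposure by its
# global frequency energy (an assumption, not the MEF-SFI architecture).
import torch

def frequency_energy(img):
    # Log-amplitude spectrum, reduced to one scalar per sample.
    amp = torch.log1p(torch.fft.fft2(img, norm="ortho").abs())
    return amp.mean(dim=(1, 2, 3), keepdim=True)    # (B,1,1,1)

def fuse_exposures(imgs):
    # imgs: list of (B,C,H,W) exposures; more frequency energy
    # (more recoverable detail) earns a larger fusion weight.
    energy = torch.stack([frequency_energy(x) for x in imgs])
    w = energy / energy.sum(dim=0, keepdim=True)
    return sum(wi * x for wi, x in zip(w, imgs))
```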
arXiv Detail & Related papers (2023-12-17T04:45:15Z)
- Global Structure-Aware Diffusion Process for Low-Light Image Enhancement [64.69154776202694]
This paper studies a diffusion-based framework to address the low-light image enhancement problem.
We advocate for regularizing the framework's inherent ODE trajectory.
Experimental evaluations reveal that the proposed framework attains distinguished performance in low-light enhancement.
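The summary leaves the regularizer unspecified; one loose reading is a smoothness penalty on the sampling trajectory, sketched below with second differences as a curvature proxy. This is an illustrative stand-in, not the paper's actual term.

```python
# Loose stand-in for an ODE-trajectory regularizer: penalize the
# curvature (second differences) of intermediate sampling states.
import torch

def trajectory_curvature_penalty(states):
    # states: list of latents [x_T, ..., x_0] along the sampling path.
    penalty = states[0].new_zeros(())
    for prev, curr, nxt in zip(states[:-2], states[1:-1], states[2:]):
        penalty = penalty + (nxt - 2 * curr + prev).pow(2).mean()
    return penalty
```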
arXiv Detail & Related papers (2023-10-26T17:01:52Z)
- SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation [96.11061713135385]
This work presents a new score-decomposed diffusion model to explicitly optimize the tangled distributions during image generation.
We equalize the refinement parts of the score function and energy guidance, which permits multi-objective optimization on the manifold.
SDDM outperforms existing SBDM-based methods with much fewer diffusion steps on several I2I benchmarks.
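One way to read "equalize the refinement parts" is to balance the magnitudes of the score direction and the energy-guidance gradient before combining them; the sketch below does exactly that and is an interpretation, not SDDM's manifold formulation.

```python
# Interpretive sketch of "equalizing" score refinement and energy
# guidance: rescale the guidance gradient to the score's norm so
# neither objective dominates the combined update.
import torch

def equalized_update(score, energy_grad, eps=1e-8):
    # score, energy_grad: (B,C,H,W) update directions at the current state.
    s_norm = score.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
    g_norm = energy_grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
    balanced = energy_grad * s_norm / (g_norm + eps)   # match magnitudes
    return 0.5 * (score + balanced)                    # multi-objective step
```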
arXiv Detail & Related papers (2023-08-04T06:21:57Z)
- DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion [7.06521373423708]
The denoising diffusion model, as a generative model, has received considerable attention in the field of image generation.
We introduce the diffusion model to the image fusion field, treating the image fusion task as image-to-image translation.
Our method can inspire further work and offer insight into applying diffusion models more effectively to image fusion tasks.
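Treating fusion as image-to-image translation typically means conditioning the noise predictor on the source images; a minimal sketch of one such step follows, where `eps_model` and the concatenation-based conditioning are assumptions rather than DDRF's actual design.

```python
# Minimal conditional-diffusion sketch for fusion as image-to-image
# translation; `eps_model` is an assumed noise-prediction network.
import torch

@torch.no_grad()
def predict_x0(x_t, t, sources, eps_model, alpha_bar):
    # x_t: noisy fused image (B,C,H,W); sources: list of source
    # modalities; alpha_bar: cumulative noise schedule, shape (T,).
    cond = torch.cat([x_t, *sources], dim=1)   # condition by concatenation
    eps = eps_model(cond, t)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    return (x_t - (1 - a).sqrt() * eps) / a.sqrt()   # standard DDPM x0 estimate
```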
arXiv Detail & Related papers (2023-04-10T12:28:27Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
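The shared-latent claim suggests a two-stage recipe: invert an image into a latent noise sequence with one model, then replay that sequence through a second model trained on a related domain. The schematic below assumes hypothetical `invert`/`decode` methods; the paper's DPM-Encoder is more involved.

```python
# Schematic of the shared-latent idea (hypothetical invert/decode API,
# not the paper's DPM-Encoder implementation).
def zero_shot_translate(image, model_a, model_b):
    latents = model_a.invert(image)   # image -> (x_T, per-step noises)
    return model_b.decode(latents)    # replay the same latents in domain B
```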
arXiv Detail & Related papers (2022-10-11T15:53:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.