CoMoFusion: Fast and High-quality Fusion of Infrared and Visible Image with Consistency Model
- URL: http://arxiv.org/abs/2405.20764v3
- Date: Wed, 12 Jun 2024 03:16:40 GMT
- Title: CoMoFusion: Fast and High-quality Fusion of Infrared and Visible Image with Consistency Model
- Authors: Zhiming Meng, Hui Li, Zeyang Zhang, Zhongwei Shen, Yunlong Yu, Xiaoning Song, Xiaojun Wu
- Abstract summary: Current generative-model-based fusion methods often suffer from unstable training and slow inference speed.
CoMoFusion generates high-quality images and achieves fast inference.
In order to enhance the texture and salient information of fused images, a novel loss based on pixel value selection is also designed.
- Score: 20.02742423120295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models are widely used to model the distribution of fused images in the field of infrared and visible image fusion. However, current generative-model-based fusion methods often suffer from unstable training and slow inference speed. To tackle this problem, a novel fusion method based on the consistency model is proposed, termed CoMoFusion, which generates high-quality images and achieves fast inference. Specifically, the consistency model is used to construct multi-modal joint features in the latent space through the forward and reverse processes. Then, the infrared and visible features extracted by the trained consistency model are fed into a fusion module to generate the final fused image. To enhance the texture and salient information of fused images, a novel loss based on pixel value selection is also designed. Extensive experiments on public datasets show that the method achieves SOTA fusion performance compared with existing fusion methods.
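The abstract does not spell out the pixel-value-selection loss, so the following is only a rough sketch of one common form such a loss takes in infrared-visible fusion, not the authors' definition. It assumes the target at each pixel is the larger of the infrared and visible intensities, and that the loss is the mean absolute difference between the fused image and that target. The function names `pixel_selection_target` and `pixel_selection_loss` are hypothetical.

```python
def pixel_selection_target(ir, vis):
    """Pick, per pixel, the larger of the infrared and visible values.

    ASSUMPTION: element-wise max is used as the selection rule; the paper
    may use a different rule. Images are equally sized lists of rows.
    """
    return [
        [max(a, b) for a, b in zip(row_ir, row_vis)]
        for row_ir, row_vis in zip(ir, vis)
    ]


def pixel_selection_loss(fused, ir, vis):
    """Mean absolute error between the fused image and the selected target.

    ASSUMPTION: an L1 distance is used; this is illustrative only.
    """
    target = pixel_selection_target(ir, vis)
    total = 0.0
    count = 0
    for row_f, row_t in zip(fused, target):
        for f, t in zip(row_f, row_t):
            total += abs(f - t)
            count += 1
    return total / count


# Toy 2x2 images with intensities in [0, 1].
ir = [[0.2, 0.9], [0.5, 0.1]]
vis = [[0.4, 0.3], [0.5, 0.8]]
target = pixel_selection_target(ir, vis)
loss_for_perfect_fusion = pixel_selection_loss(target, ir, vis)
```

Under these assumptions, a fused image that keeps the brighter source at every pixel incurs zero loss, which is one way to encourage salient infrared targets and bright visible detail to survive fusion.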
Related papers
- FusionBench: A Comprehensive Benchmark of Deep Model Fusion [78.80920533793595]
Deep model fusion is a technique that unifies the predictions or parameters of several deep neural networks into a single model.
FusionBench is the first comprehensive benchmark dedicated to deep model fusion.
arXiv Detail & Related papers (2024-06-05T13:54:28Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion [7.06521373423708]
The denoising diffusion model, as a generative model, has received a lot of attention in the field of image generation.
We introduce the diffusion model to the image fusion field, treating the image fusion task as image-to-image translation.
Our method can inspire other works and gain insight into this field to better apply the diffusion model to image fusion tasks.
arXiv Detail & Related papers (2023-04-10T12:28:27Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion [59.19469551774703]
Infrared and visible image fusion aims to integrate comprehensive information from multiple sources to achieve superior performance on various practical tasks.
We propose a dynamic image fusion framework with a multi-modal gated mixture of local-to-global experts.
Our model consists of a Mixture of Local Experts (MoLE) and a Mixture of Global Experts (MoGE) guided by a multi-modal gate.
arXiv Detail & Related papers (2023-02-02T20:06:58Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Cross Attention-guided Dense Network for Images Fusion [6.722525091148737]
In this paper, we propose a novel cross attention-guided image fusion network.
It is a unified and unsupervised framework for multi-modal image fusion, multi-exposure image fusion, and multi-focus image fusion.
The results demonstrate that the proposed model outperforms the state-of-the-art quantitatively and qualitatively.
arXiv Detail & Related papers (2021-09-23T14:22:47Z)
- Bayesian Fusion for Infrared and Visible Images [26.64101343489016]
In this paper, a novel Bayesian fusion model is established for infrared and visible images.
We aim to make the fused image satisfy the human visual system.
Compared with previous methods, the novel model can generate better fused images with highlighted targets and rich texture details.
arXiv Detail & Related papers (2020-05-12T14:57:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.