Generative Multi-Focus Image Fusion
- URL: http://arxiv.org/abs/2512.21495v1
- Date: Thu, 25 Dec 2025 04:00:01 GMT
- Title: Generative Multi-Focus Image Fusion
- Authors: Xinzhe Xie, Buyu Guo, Bolin Li, Shuangyan He, Yanzhen Gu, Qingyan Jiang, Peiliang Li,
- Abstract summary: We propose a generative multi-focus image fusion framework, termed GMFF, which operates in two sequential stages.<n>In the first stage, deterministic fusion is implemented using StackMFF V4, the latest version of the StackMFF series.<n>The second stage, generative restoration, is realized through IFControlNet, which leverages the generative capabilities of latent diffusion models.
- Score: 1.8570182093040675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-focus image fusion aims to generate an all-in-focus image from a sequence of partially focused input images. Existing fusion algorithms generally assume that, for every spatial location in the scene, there is at least one input image in which that location is in focus. Furthermore, current fusion models often suffer from edge artifacts caused by uncertain focus estimation or hard-selection operations in complex real-world scenarios. To address these limitations, we propose a generative multi-focus image fusion framework, termed GMFF, which operates in two sequential stages. In the first stage, deterministic fusion is implemented using StackMFF V4, the latest version of the StackMFF series, and integrates the available focal plane information to produce an initial fused image. The second stage, generative restoration, is realized through IFControlNet, which leverages the generative capabilities of latent diffusion models to reconstruct content from missing focal planes, restore fine details, and eliminate edge artifacts. Each stage is independently developed and functions seamlessly in a cascaded manner. Extensive experiments demonstrate that GMFF achieves state-of-the-art fusion performance and exhibits significant potential for practical applications, particularly in scenarios involving complex multi-focal content. The implementation is publicly available at https://github.com/Xinzhe99/StackMFF-Series.
Related papers
- Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach [99.80480649258557]
DiTFuse is an instruction-driven framework that performs semantics-aware fusion within a single model.<n>Experiments on public IVIF, MFF, and MEF benchmarks confirm superior quantitative and qualitative performance, sharper textures, and better semantic retention.
arXiv Detail & Related papers (2025-12-08T05:04:54Z) - Gradient-based multi-focus image fusion with focus-aware saliency enhancement [18.335216974790754]
Multi-focus image fusion (MFIF) aims to yield an all-focused image from multiple partially focused inputs.<n>We propose a MFIF method based on significant boundary enhancement, which generates high-quality fused boundaries.<n>Our method consistently outperforms 12 state-of-the-art methods in both subjective and objective evaluations.
arXiv Detail & Related papers (2025-09-26T14:20:44Z) - Conditional Controllable Image Fusion [56.4120974322286]
conditional controllable fusion (CCF) framework for general image fusion tasks without specific training.
CCF employs specific fusion constraints for each individual in practice.
Experiments validate our effectiveness in general fusion tasks across diverse scenarios.
arXiv Detail & Related papers (2024-11-03T13:56:15Z) - Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond [74.96466744512992]
The essence of image fusion is to integrate complementary information from source images.
DeFusion++ produces versatile fused representations that can enhance the quality of image fusion and the effectiveness of downstream high-level vision tasks.
arXiv Detail & Related papers (2024-10-16T06:28:49Z) - A Dual Domain Multi-exposure Image Fusion Network based on the
Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novelty perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visual-appealing fusion results against state-of-the-art multi-exposure image fusion approaches.
arXiv Detail & Related papers (2023-12-17T04:45:15Z) - Bridging the Gap between Multi-focus and Multi-modal: A Focused
Integration Framework for Multi-modal Image Fusion [5.417493475406649]
Multi-modal image fusion (MMIF) integrates valuable information from different modality images into a fused one.
This paper proposes a MMIF framework for joint focused integration and modalities information extraction.
The proposed algorithm can surpass the state-of-the-art methods in visual perception and quantitative evaluation.
arXiv Detail & Related papers (2023-11-03T12:58:39Z) - Generation and Recombination for Multifocus Image Fusion with Free
Number of Inputs [17.32596568119519]
Multifocus image fusion is an effective way to overcome the limitation of optical lenses.
Previous methods assume that the focused areas of the two source images are complementary, making it impossible to achieve simultaneous fusion of multiple images.
In GRFusion, focus property detection of each source image can be implemented independently, enabling simultaneous fusion of multiple source images.
arXiv Detail & Related papers (2023-09-09T01:47:56Z) - Multi-modal Gated Mixture of Local-to-Global Experts for Dynamic Image
Fusion [59.19469551774703]
Infrared and visible image fusion aims to integrate comprehensive information from multiple sources to achieve superior performances on various practical tasks.
We propose a dynamic image fusion framework with a multi-modal gated mixture of local-to-global experts.
Our model consists of a Mixture of Local Experts (MoLE) and a Mixture of Global Experts (MoGE) guided by a multi-modal gate.
arXiv Detail & Related papers (2023-02-02T20:06:58Z) - Image Fusion Transformer [75.71025138448287]
In image fusion, images obtained from different sensors are fused to generate a single image with enhanced information.
In recent years, state-of-the-art methods have adopted Convolution Neural Networks (CNNs) to encode meaningful features for image fusion.
We propose a novel Image Fusion Transformer (IFT) where we develop a transformer-based multi-scale fusion strategy.
arXiv Detail & Related papers (2021-07-19T16:42:49Z) - UFA-FUSE: A novel deep supervised and hybrid model for multi-focus image
fusion [4.105749631623888]
Traditional and deep learning-based fusion methods generate the intermediate decision map through a series of post-processing procedures.
Inspired by the image reconstruction techniques based on deep learning, we propose a multi-focus image fusion network framework.
We show that the proposed approach for multi-focus image fusion achieves remarkable fusion performance compared to 19 state-of-the-art fusion methods.
arXiv Detail & Related papers (2021-01-12T14:33:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.