A Task-guided, Implicitly-searched and Meta-initialized Deep Model for
Image Fusion
- URL: http://arxiv.org/abs/2305.15862v1
- Date: Thu, 25 May 2023 08:54:08 GMT
- Title: A Task-guided, Implicitly-searched and Meta-initialized Deep Model for
Image Fusion
- Authors: Risheng Liu, Zhu Liu, Jinyuan Liu, Xin Fan, Zhongxuan Luo
- Abstract summary: We present a Task-guided, Implicit-searched and Meta-initialized (TIM) deep model to address the image fusion problem in a challenging real-world scenario.
Specifically, we propose a constrained strategy to incorporate information from downstream tasks to guide the unsupervised learning process of image fusion.
Within this framework, we then design an implicit search scheme to automatically discover compact architectures for our fusion model with high efficiency.
- Score: 69.10255211811007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image fusion plays a key role in a variety of multi-sensor-based vision
systems, especially for enhancing visual quality and/or extracting aggregated
features for perception. However, most existing methods just consider image
fusion as an individual task, thus ignoring its underlying relationship with
these downstream vision problems. Furthermore, designing proper fusion
architectures often requires huge engineering labor. Current fusion approaches
also lack mechanisms to improve their flexibility and generalization ability.
To mitigate these issues, we establish a Task-guided,
Implicit-searched and Meta-initialized (TIM) deep model to address the image
fusion problem in a challenging real-world scenario. Specifically, we first
propose a constrained strategy to incorporate information from downstream tasks
to guide the unsupervised learning process of image fusion. Within this
framework, we then design an implicit search scheme to automatically discover
compact architectures for our fusion model with high efficiency. In addition, a
pretext meta initialization technique is introduced to leverage divergence
fusion data to support fast adaptation for different kinds of image fusion
tasks. Qualitative and quantitative experimental results on different
categories of image fusion problems and related downstream tasks (e.g., visual
enhancement and semantic understanding) substantiate the flexibility and
effectiveness of our TIM. The source code will be available at
https://github.com/LiuZhu-CV/TIMFusion.
Related papers
- Rethinking Normalization Strategies and Convolutional Kernels for Multimodal Image Fusion [25.140475569677758]
Multimodal image fusion aims to integrate information from different modalities to obtain a comprehensive image.
Existing methods tend to prioritize natural image fusion and focus on complementary information and network training strategies.
This paper dissects the significant differences between the two tasks regarding fusion goals, statistical properties, and data distribution.
arXiv Detail & Related papers (2024-11-15T08:36:24Z)
- Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond [74.96466744512992]
The essence of image fusion is to integrate complementary information from source images.
DeFusion++ produces versatile fused representations that can enhance the quality of image fusion and the effectiveness of downstream high-level vision tasks.
arXiv Detail & Related papers (2024-10-16T06:28:49Z)
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, reaching state-of-the-art results.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and Beyond [50.556961575275345]
We build an image fusion module to fuse complementary characteristics and cascade dual task-related modules.
We develop an efficient first-order approximation to compute corresponding gradients and present dynamic weighted aggregation to balance the gradients for fusion learning.
arXiv Detail & Related papers (2023-05-11T10:55:34Z)
- LRRNet: A Novel Representation Learning Guided Fusion Network for Infrared and Visible Images [98.36300655482196]
We formulate the fusion task mathematically, and establish a connection between its optimal solution and the network architecture that can implement it.
In particular we adopt a learnable representation approach to the fusion task, in which the construction of the fusion network architecture is guided by the optimisation algorithm producing the learnable model.
Based on this novel network architecture, an end-to-end lightweight fusion network is constructed to fuse infrared and visible light images.
arXiv Detail & Related papers (2023-04-11T12:11:23Z)
- Unsupervised Image Fusion Method based on Feature Mutual Mapping [16.64607158983448]
We propose an unsupervised adaptive image fusion method to address the above issues.
We construct a global map to measure the connections of pixels between the input source images.
Our method achieves superior performance in both visual perception and objective evaluation.
arXiv Detail & Related papers (2022-01-25T07:50:14Z)
- TransFuse: A Unified Transformer-based Image Fusion Framework using Self-supervised Learning [5.849513679510834]
Image fusion is a technique to integrate information from multiple source images with complementary information to improve the richness of a single image.
Two-stage methods avoid the need for large amounts of task-specific training data by training an encoder-decoder network on large natural image datasets.
We propose a destruction-reconstruction based self-supervised training scheme to encourage the network to learn task-specific features.
arXiv Detail & Related papers (2022-01-19T07:30:44Z)