FuseVis: Interpreting neural networks for image fusion using per-pixel saliency visualization
- URL: http://arxiv.org/abs/2012.08932v1
- Date: Sun, 6 Dec 2020 10:03:02 GMT
- Title: FuseVis: Interpreting neural networks for image fusion using per-pixel saliency visualization
- Authors: Nishant Kumar, Stefan Gumhold
- Abstract summary: Unsupervised learning-based convolutional neural networks (CNNs) have been utilized for different types of image fusion tasks.
It is challenging to analyze the reliability of these CNNs for image fusion tasks since no ground truth is available.
We present a novel real-time visualization tool, named FuseVis, with which the end-user can compute per-pixel saliency maps.
- Score: 10.156766309614113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image fusion helps in merging two or more images to construct a more informative single fused image. Recently, unsupervised learning-based convolutional neural networks (CNNs) have been utilized for different types of image fusion tasks such as medical image fusion, infrared-visible image fusion for autonomous driving, as well as multi-focus and multi-exposure image fusion for satellite imagery. However, it is challenging to analyze the reliability of these CNNs for image fusion tasks since no ground truth is available. This has led to the use of a wide variety of model architectures and optimization functions that yield quite different fusion results. Additionally, due to the highly opaque nature of such neural networks, it is difficult to explain the internal mechanics behind their fusion results. To overcome these challenges, we present a novel real-time visualization tool, named FuseVis, with which the end-user can compute per-pixel saliency maps that examine the influence of the input image pixels on each pixel of the fused image. We trained several image fusion CNNs on medical image pairs and then, using our FuseVis tool, performed case studies on a specific clinical application by interpreting the saliency maps from each of the fusion methods. We specifically visualized the relative influence of each input image on the predictions of the fused image and showed that some of the evaluated image fusion methods are better suited for the specific clinical application. To the best of our knowledge, there is currently no approach for visual analysis of neural networks for image fusion. Therefore, this work opens up a new research direction to improve the interpretability of deep fusion networks. The FuseVis tool can also be adapted to other deep neural network-based image processing applications to make them interpretable.
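As a rough illustration of the kind of per-pixel saliency the abstract describes, the sketch below backpropagates a single fused-image pixel to both input images in PyTorch. It is a minimal sketch, not the authors' FuseVis implementation: the network handle `fusion_net`, the input shapes, and the function name are assumptions for illustration.

```python
import torch

# Minimal sketch (assumed two-input fusion network, single-channel images):
# for a chosen fused pixel (y, x), compute gradients of that pixel's value
# with respect to both inputs; the gradient magnitude at each input location
# indicates how strongly it influences the chosen fused pixel.
def per_pixel_saliency(fusion_net, img_a, img_b, y, x):
    img_a = img_a.clone().requires_grad_(True)   # e.g. MRI slice, shape (1, 1, H, W)
    img_b = img_b.clone().requires_grad_(True)   # e.g. PET slice, shape (1, 1, H, W)
    fused = fusion_net(img_a, img_b)             # fused image, shape (1, 1, H, W)
    fused[0, 0, y, x].backward()                 # d(fused pixel) / d(input pixels)
    sal_a = img_a.grad[0, 0].abs()               # per-pixel influence of input A
    sal_b = img_b.grad[0, 0].abs()               # per-pixel influence of input B
    return sal_a, sal_b
```

The relative influence of the two inputs on that pixel could then be compared, for example by normalizing each map against their sum; the paper's actual computation and real-time interface may differ from this sketch.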
Related papers
- FusionINN: Decomposable Image Fusion for Brain Tumor Monitoring [6.45135260209391]
We introduce FusionINN, a novel decomposable image fusion framework.
We are the first to investigate the decomposability of fused images.
Our approach offers faster and qualitatively better fusion results.
arXiv Detail & Related papers (2024-03-23T08:54:03Z)
- Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing [72.45257414889478]
We aim to reduce human workload by predicting connectivity between over-segmented neuron pieces.
We first construct a dataset, named FlyTracing, that contains millions of pairwise connections of segments spanning the whole fly brain.
We propose a novel connectivity-aware contrastive learning method to generate dense volumetric EM image embedding.
arXiv Detail & Related papers (2024-01-05T19:45:12Z)
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, reaching state-of-the-art results.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- Multi-modal Medical Neurological Image Fusion using Wavelet Pooled Edge Preserving Autoencoder [3.3828292731430545]
This paper presents an end-to-end unsupervised fusion model for multimodal medical images based on an edge-preserving dense autoencoder network.
In the proposed model, feature extraction is improved by using wavelet decomposition-based attention pooling of feature maps.
The proposed model is trained on a variety of medical image pairs which helps in capturing the intensity distributions of the source images.
arXiv Detail & Related papers (2023-10-18T11:59:35Z)
- Learning a Graph Neural Network with Cross Modality Interaction for Image Fusion [23.296468921842948]
Infrared and visible image fusion has gradually proved to be a vital branch of multi-modality imaging technologies.
We propose an interactive graph neural network (GNN)-based cross-modality fusion architecture, called IGNet.
Our IGNet generates visually appealing fused images while scoring on average 2.59% higher mAP@.5 in detection and 7.77% higher mIoU in segmentation.
arXiv Detail & Related papers (2023-08-07T02:25:06Z)
- A Task-guided, Implicitly-searched and Meta-initialized Deep Model for Image Fusion [69.10255211811007]
We present a Task-guided, Implicitly-searched and Meta-initialized (TIM) deep model to address the image fusion problem in a challenging real-world scenario.
Specifically, we propose a constrained strategy to incorporate information from downstream tasks to guide the unsupervised learning process of image fusion.
Within this framework, we then design an implicit search scheme to automatically discover compact architectures for our fusion model with high efficiency.
arXiv Detail & Related papers (2023-05-25T08:54:08Z)
- An Interactively Reinforced Paradigm for Joint Infrared-Visible Image Fusion and Saliency Object Detection [59.02821429555375]
This research focuses on the discovery and localization of hidden objects in the wild and serves unmanned systems.
Through empirical analysis, infrared and visible image fusion (IVIF) makes hard-to-find objects apparent.
Multimodal salient object detection (SOD) accurately delineates the precise spatial location of objects within the picture.
arXiv Detail & Related papers (2023-05-17T06:48:35Z)
- LRRNet: A Novel Representation Learning Guided Fusion Network for Infrared and Visible Images [98.36300655482196]
We formulate the fusion task mathematically, and establish a connection between its optimal solution and the network architecture that can implement it.
In particular, we adopt a learnable representation approach to the fusion task, in which the construction of the fusion network architecture is guided by the optimisation algorithm producing the learnable model.
Based on this novel network architecture, an end-to-end lightweight fusion network is constructed to fuse infrared and visible light images.
arXiv Detail & Related papers (2023-04-11T12:11:23Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- A Dual-branch Network for Infrared and Visible Image Fusion [20.15854042473049]
We propose a new method based on dense blocks and GANs.
We directly insert the input visible-light image into each layer of the entire network.
Our experiments show that the fused images obtained by our approach achieve good scores on multiple evaluation metrics.
arXiv Detail & Related papers (2021-01-24T04:18:32Z)