AdaFuse: Adaptive Medical Image Fusion Based on Spatial-Frequential
Cross Attention
- URL: http://arxiv.org/abs/2310.05462v2
- Date: Tue, 24 Oct 2023 01:02:12 GMT
- Title: AdaFuse: Adaptive Medical Image Fusion Based on Spatial-Frequential
Cross Attention
- Authors: Xianming Gu, Lihui Wang, Zeyu Deng, Ying Cao, Xingyu Huang and Yue-min
Zhu
- Abstract summary: We propose AdaFuse, in which multimodal image information is fused adaptively through a frequency-guided attention mechanism.
The proposed method outperforms state-of-the-art methods in terms of both visual quality and quantitative metrics.
- Score: 6.910879180358217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-modal medical image fusion is essential for precise clinical
diagnosis and surgical navigation, since it merges the complementary
information from multiple modalities into a single image. The quality of the
fused image depends on the extracted single-modality features as well as the
fusion rules for multi-modal information. Although existing deep learning-based
fusion methods can fully exploit the semantic features of each modality, they
cannot distinguish the effective low- and high-frequency information of each
modality and fuse them adaptively. To address this issue, we propose AdaFuse,
in which multi-modal image information is fused adaptively through a
frequency-guided attention mechanism based on the Fourier transform.
Specifically, we propose the
cross-attention fusion (CAF) block, which adaptively fuses features of two
modalities in the spatial and frequency domains by exchanging key and query
values, and then calculates the cross-attention scores between the spatial and
frequency features to further guide the spatial-frequential information fusion.
The CAF block enhances the high-frequency features of the different modalities
so that the details in the fused images can be retained. Moreover, we design a
novel loss function composed of structure loss and content loss to preserve
both low and high frequency information. Extensive comparison experiments on
several datasets demonstrate that the proposed method outperforms
state-of-the-art methods in terms of both visual quality and quantitative
metrics. The ablation experiments also validate the effectiveness of the
proposed loss and fusion strategy.
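The abstract describes the CAF block only at a high level, so the following is a minimal PyTorch sketch of how such spatial-frequency cross-attention fusion could look; it is not the authors' implementation. The module names (CrossAttention, CrossAttentionFusion), the choice of the FFT amplitude spectrum as the frequency-domain feature, the head count, and the final 1x1 projection are all illustrative assumptions; only the overall idea of exchanging queries/keys between modalities in the spatial and frequency domains, and then cross-attending between the two domains, comes from the abstract.

```python
# Minimal sketch (not the paper's code) of a cross-attention fusion (CAF) block:
# two modality feature maps are fused in the spatial and frequency domains by
# exchanging query/key roles, and a spatial-frequency cross-attention step
# guides the final fusion. All names and design details are assumptions.
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Scaled dot-product attention where queries come from one source and
    keys/values from another (used to exchange query/key roles)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, query_src: torch.Tensor, kv_src: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(query_src, kv_src, kv_src)
        return out


class CrossAttentionFusion(nn.Module):
    """Fuses two modality feature maps in spatial and frequency domains."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.spatial_attn = CrossAttention(channels, num_heads)
        self.freq_attn = CrossAttention(channels, num_heads)
        self.cross_domain_attn = CrossAttention(channels, num_heads)
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    @staticmethod
    def _to_tokens(x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, H*W, C) token sequence for attention.
        return x.flatten(2).transpose(1, 2)

    @staticmethod
    def _to_map(tokens: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # (B, H*W, C) -> (B, C, H, W)
        return tokens.transpose(1, 2).reshape(tokens.shape[0], -1, h, w)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        _, _, h, w = feat_a.shape

        # Frequency-domain representation: amplitude spectrum of a 2-D FFT
        # (one plausible choice; the abstract only says "based on Fourier transform").
        freq_a = torch.fft.fft2(feat_a, norm="ortho").abs()
        freq_b = torch.fft.fft2(feat_b, norm="ortho").abs()

        ta, tb = self._to_tokens(feat_a), self._to_tokens(feat_b)
        fa, fb = self._to_tokens(freq_a), self._to_tokens(freq_b)

        # Exchange query/key roles between the two modalities in each domain.
        spatial_fused = self.spatial_attn(ta, tb) + self.spatial_attn(tb, ta)
        freq_fused = self.freq_attn(fa, fb) + self.freq_attn(fb, fa)

        # Cross-attention between spatial and frequency features guides the fusion.
        guided = self.cross_domain_attn(spatial_fused, freq_fused)

        fused = torch.cat(
            [self._to_map(guided, h, w), self._to_map(freq_fused, h, w)], dim=1
        )
        return self.proj(fused)


if __name__ == "__main__":
    caf = CrossAttentionFusion(channels=32)
    mri, pet = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
    print(caf(mri, pet).shape)  # torch.Size([1, 32, 64, 64])
```

Keeping both domains in token form (B, H*W, C) lets the same attention module serve the spatial and frequency branches; an actual implementation would likely also exploit the phase spectrum and couple this block with the structure/content loss described above.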
Related papers
- DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion [10.713089596405053]
A two-phase discriminative autoencoder framework, DAE-Fuse, generates sharp and natural fused images.
Experiments on public infrared-visible, medical image fusion, and downstream object detection datasets demonstrate our method's superiority and generalizability.
arXiv Detail & Related papers (2024-09-16T08:37:09Z)
- Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model [2.507050016527729]
Tri-modal medical image fusion can provide a more comprehensive view of the disease's shape, location, and biological activity.
Due to the limitations of imaging equipment and considerations for patient safety, the quality of medical images is usually limited.
There is an urgent need for a technology that can both enhance image resolution and integrate multi-modal information.
arXiv Detail & Related papers (2024-04-26T12:13:41Z)
- A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novel perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visually appealing fusion results compared with state-of-the-art multi-exposure image fusion approaches.
arXiv Detail & Related papers (2023-12-17T04:45:15Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- TFormer: A throughout fusion transformer for multi-modal skin lesion diagnosis [6.899641625551976]
We introduce a pure transformer-based method, which we refer to as the "Throughout Fusion Transformer (TFormer)", for sufficient information integration in MSLD.
We then carefully design a stack of dual-branch hierarchical multi-modal transformer (HMT) blocks to fuse information across different image modalities in a stage-by-stage way.
Our TFormer outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2022-11-21T12:07:05Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers [8.139069987207494]
We present TransFusion, a Transformer-based architecture to merge divergent multi-view imaging information using convolutional layers and powerful attention mechanisms.
In particular, the Divergent Fusion Attention (DiFA) module is proposed for rich cross-view context modeling and semantic dependency mining.
arXiv Detail & Related papers (2022-03-21T04:02:54Z)
- Coupled Feature Learning for Multimodal Medical Image Fusion [42.23662451234756]
Multimodal image fusion aims to combine relevant information from images acquired with different sensors.
In this paper, we propose a novel multimodal image fusion method based on coupled dictionary learning.
arXiv Detail & Related papers (2021-02-17T09:13:28Z)
- Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into the modality-specific appearance code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this automatically generated content (including all information) and is not responsible for any consequences arising from its use.