Related papers: Fuse4Seg: Image-Level Fusion Based Multi-Modality Medical Image Segmentation

URL: http://arxiv.org/abs/2409.10328v2
Date: Tue, 17 Sep 2024 02:35:24 GMT
Title: Fuse4Seg: Image-Level Fusion Based Multi-Modality Medical Image Segmentation
Authors: Yuchen Guo, Weifeng Su
Abstract summary: We argue the current feature-level fusion strategy is prone to semantic inconsistencies and misalignments. We introduce a novel image-level fusion based multi-modality medical image segmentation method, Fuse4Seg. The resultant fused image is a coherent representation that accurately amalgamates information from all modalities.
Score: 13.497613339200184
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although multi-modality medical image segmentation holds significant potential for enhancing the diagnosis and understanding of complex diseases by integrating diverse imaging modalities, existing methods predominantly rely on feature-level fusion strategies. We argue the current feature-level fusion strategy is prone to semantic inconsistencies and misalignments across various imaging modalities because it merges features at intermediate layers in a neural network without evaluative control. To mitigate this, we introduce a novel image-level fusion based multi-modality medical image segmentation method, Fuse4Seg, which is a bi-level learning framework designed to model the intertwined dependencies between medical image segmentation and medical image fusion. The image-level fusion process is seamlessly employed to guide and enhance the segmentation results through a layered optimization approach. Besides, the knowledge gained from the segmentation module can effectively enhance the fusion module. This ensures that the resultant fused image is a coherent representation that accurately amalgamates information from all modalities. Moreover, we construct a BraTS-Fuse benchmark based on BraTS dataset, which includes 2040 paired original images, multi-modal fusion images, and ground truth. This benchmark not only serves image-level medical segmentation but is also the largest dataset for medical image fusion to date. Extensive experiments on several public datasets and our benchmark demonstrate the superiority of our approach over prior state-of-the-art (SOTA) methodologies.

Related papers

Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach [99.80480649258557]
DiTFuse is an instruction-driven framework that performs semantics-aware fusion within a single model. Experiments on public IVIF, MFF, and MEF benchmarks confirm superior quantitative and qualitative performance, sharper textures, and better semantic retention.
arXiv Detail & Related papers (2025-12-08T05:04:54Z)
Multimodal Medical Image Classification via Synergistic Learning Pre-training [20.818508328120974]
We propose a novel framework for multimodal semi-supervised medical image classification. By treating one modality as an augmented sample of another modality, we implement self-supervised learning pre-training. During the fine-tuning stage, we set different encoders to extract features from the original modalities.
arXiv Detail & Related papers (2025-09-22T08:21:19Z)
Multimodal Medical Endoscopic Image Analysis via Progressive Disentangle-aware Contrastive Learning [11.158864816564538]
We present an innovative multi-modality representation learning framework based on the 'Align-Disentangle-Fusion' mechanism. Our method consistently outperforms state-of-the-art approaches, achieving superior accuracy across diverse real clinical scenarios.
arXiv Detail & Related papers (2025-08-23T03:02:51Z)
SMFusion: Semantic-Preserving Fusion of Multimodal Medical Images for Enhanced Clinical Diagnosis [11.356721356096564]
We propose a novel semantic-guided medical image fusion approach that incorporates medical prior knowledge into the fusion process. We generate diagnostic reports from the fused images to assess the preservation of medical information. Experimental results on test datasets demonstrate that the proposed method achieves superior performance in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-05-18T06:15:00Z)
Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond [74.96466744512992]
The essence of image fusion is to integrate complementary information from source images. DeFusion++ produces versatile fused representations that can enhance the quality of image fusion and the effectiveness of downstream high-level vision tasks.
arXiv Detail & Related papers (2024-10-16T06:28:49Z)
From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images. Our method not only produces visually superior fusion results but also achieves a higher detection mAP over existing methods, achieving state-of-the-art results.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
A New Multimodal Medical Image Fusion based on Laplacian Autoencoder with Channel Attention [3.1531360678320897]
Deep learning models have achieved end-to-end image fusion with highly robust and accurate performance. Most DL-based fusion models perform down-sampling on the input images to minimize the number of learnable parameters and computations. We propose a new multimodal medical image fusion model based on integrated Laplacian-Gaussian concatenation with attention pooling.
arXiv Detail & Related papers (2023-10-18T11:29:53Z)
Three-Dimensional Medical Image Fusion with Deformable Cross-Attention [10.26573411162757]
Multimodal medical image fusion plays an instrumental role in several areas of medical image processing. Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image. In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations.
arXiv Detail & Related papers (2023-10-10T04:10:56Z)
Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation [66.15246197473897]
Multi-modality image fusion and segmentation play a vital role in autonomous driving and robotic operation. We propose a Multi-interactive Feature learning architecture for image fusion and Segmentation.
arXiv Detail & Related papers (2023-08-04T01:03:58Z)
Modality-Agnostic Learning for Medical Image Segmentation Using Multi-modality Self-distillation [1.815047691981538]
We propose a novel framework, Modality-Agnostic learning through Multi-modality Self-distillation (MAG-MS). MAG-MS distills knowledge from the fusion of multiple modalities and applies it to enhance representation learning for individual modalities. Our experiments on benchmark datasets demonstrate the high efficiency of MAG-MS and its superior segmentation performance.
arXiv Detail & Related papers (2023-06-06T14:48:50Z)
Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning. Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations. Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to perform diverse segmentation tasks on medical images. Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
Coupled Feature Learning for Multimodal Medical Image Fusion [42.23662451234756]
Multimodal image fusion aims to combine relevant information from images acquired with different sensors. In this paper, we propose a novel multimodal image fusion method based on coupled dictionary learning.
arXiv Detail & Related papers (2021-02-17T09:13:28Z)
A review: Deep learning for medical image segmentation using multi-modality fusion [4.4259821861544]
Multi-modality is widely used in medical imaging because it can provide multiple kinds of information about a target. Deep learning-based approaches have achieved state-of-the-art performance in image classification, segmentation, object detection and tracking tasks. In this paper, we give an overview of deep learning-based approaches for the multi-modal medical image segmentation task.
arXiv Detail & Related papers (2020-04-22T16:00:53Z)
Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities. Our network uses feature disentanglement to decompose the input modalities into the modality-specific appearance code. We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.