Joint-Individual Fusion Structure with Fusion Attention Module for
Multi-Modal Skin Cancer Classification
- URL: http://arxiv.org/abs/2312.04189v1
- Date: Thu, 7 Dec 2023 10:16:21 GMT
- Title: Joint-Individual Fusion Structure with Fusion Attention Module for
Multi-Modal Skin Cancer Classification
- Authors: Peng Tang, Xintong Yan, Yang Nan, Xiaobin Hu, Bjoern H. Menze,
Sebastian Krammer, Tobias Lasser
- Abstract summary: We propose a new fusion method that combines dermatological images and patient metadata for skin cancer classification.
First, we propose a joint-individual fusion (JIF) structure that learns the shared features of multi-modality data.
Second, we introduce a fusion attention (FA) module that enhances the most relevant image and metadata features.
- Score: 10.959827268372422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most convolutional neural network (CNN) based methods for skin cancer
classification obtain their results using only dermatological images. Although
good classification results have been shown, more accurate results can be
achieved by considering the patient's metadata, which is valuable clinical
information for dermatologists. Current methods use only a simple joint
fusion structure (FS) and fusion modules (FMs) for multi-modal
classification, so there is still room to increase accuracy by exploring
more advanced FS and FM designs. Therefore, in this paper, we design a new
fusion method that combines dermatological images (dermoscopy images or
clinical images) and patient metadata for skin cancer classification from the
perspectives of FS and FM. First, we propose a joint-individual fusion (JIF)
structure that learns the shared features of multi-modality data and preserves
specific features simultaneously. Second, we introduce a fusion attention (FA)
module that enhances the most relevant image and metadata features based on
both the self and mutual attention mechanism to support the decision-making
pipeline. We compare the proposed JIF-MMFA method with other state-of-the-art
fusion methods on three different public datasets. The results show that our
JIF-MMFA method improves the classification results for all tested CNN
backbones and performs better than the other fusion methods on the three public
datasets, demonstrating our method's effectiveness and robustness.
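The abstract is not accompanied by code here; the PyTorch-style sketch below is one plausible, simplified reading of the described design, with a joint head over the fused features, individual heads per modality (the JIF structure), and a fusion attention gate that combines self and mutual attention (the FA module). All class names, dimensions, and the stand-in encoders are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class FusionAttention(nn.Module):
    """Fusion attention sketch: re-weights each modality's features with a
    self-attention gate (from its own features) and a mutual gate (from the
    other modality's features)."""

    def __init__(self, img_dim: int, meta_dim: int):
        super().__init__()
        self.img_self = nn.Sequential(nn.Linear(img_dim, img_dim), nn.Sigmoid())
        self.meta_self = nn.Sequential(nn.Linear(meta_dim, meta_dim), nn.Sigmoid())
        self.img_from_meta = nn.Sequential(nn.Linear(meta_dim, img_dim), nn.Sigmoid())
        self.meta_from_img = nn.Sequential(nn.Linear(img_dim, meta_dim), nn.Sigmoid())

    def forward(self, img_feat, meta_feat):
        img_out = img_feat * self.img_self(img_feat) * self.img_from_meta(meta_feat)
        meta_out = meta_feat * self.meta_self(meta_feat) * self.meta_from_img(img_feat)
        return img_out, meta_out


class JIFClassifier(nn.Module):
    """Joint-individual fusion sketch: one joint head on the fused features plus
    one individual head per modality, so shared and modality-specific features
    are learned simultaneously."""

    def __init__(self, img_dim=512, meta_dim=64, num_classes=8):
        super().__init__()
        # Stand-ins for a CNN image backbone and a metadata encoder.
        self.image_encoder = nn.Sequential(nn.LazyLinear(img_dim), nn.ReLU())
        self.meta_encoder = nn.Sequential(nn.LazyLinear(meta_dim), nn.ReLU())
        self.fusion_attention = FusionAttention(img_dim, meta_dim)
        self.joint_head = nn.Linear(img_dim + meta_dim, num_classes)
        self.img_head = nn.Linear(img_dim, num_classes)
        self.meta_head = nn.Linear(meta_dim, num_classes)

    def forward(self, image, metadata):
        img_feat = self.image_encoder(image.flatten(1))
        meta_feat = self.meta_encoder(metadata)
        img_feat, meta_feat = self.fusion_attention(img_feat, meta_feat)
        joint_logits = self.joint_head(torch.cat([img_feat, meta_feat], dim=1))
        # Individual branches preserve modality-specific information; during
        # training each output could receive its own classification loss.
        return joint_logits, self.img_head(img_feat), self.meta_head(meta_feat)


if __name__ == "__main__":
    model = JIFClassifier()
    images = torch.randn(4, 3, 224, 224)   # dermoscopy or clinical images
    metadata = torch.randn(4, 20)           # encoded patient metadata
    joint, img_only, meta_only = model(images, metadata)
    print(joint.shape, img_only.shape, meta_only.shape)
```

In the paper the image branch is a full CNN backbone and the attention design likely differs in detail; the sketch only shows how joint and individual heads can coexist behind a gated self/mutual attention fusion.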
Related papers
- Fuse4Seg: Image-Level Fusion Based Multi-Modality Medical Image Segmentation [13.497613339200184]
We argue the current feature-level fusion strategy is prone to semantic inconsistencies and misalignments.
We introduce a novel image-level fusion based multi-modality medical image segmentation method, Fuse4Seg.
The resultant fused image is a coherent representation that accurately amalgamates information from all modalities.
arXiv Detail & Related papers (2024-09-16T14:39:04Z) - Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z) - PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction [4.659998272408215]
Early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates.
We suggest a multimodal fusion methodology, termed PE-MVCNet, which capitalizes on Computed Tomography Pulmonary Angiography imaging and EMR data.
Our proposed model outperforms existing methodologies, confirming that our multimodal fusion model surpasses models that use a single data modality.
arXiv Detail & Related papers (2024-02-27T03:53:27Z) - Three-Dimensional Medical Image Fusion with Deformable Cross-Attention [10.26573411162757]
Multimodal medical image fusion plays an instrumental role in several areas of medical image processing.
Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image.
In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations.
arXiv Detail & Related papers (2023-10-10T04:10:56Z) - MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection [54.52102265418295]
We propose a novel and effective Multi-Level Fusion network, named MLF-DET, for high-performance cross-modal 3D object DETection.
For the feature-level fusion, we present the Multi-scale Voxel Image fusion (MVI) module, which densely aligns multi-scale voxel features with image features.
For the decision-level fusion, we propose the lightweight Feature-cued Confidence Rectification (FCR) module, which exploits image semantics to rectify the confidence of detection candidates.
arXiv Detail & Related papers (2023-07-18T11:26:02Z) - Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z) - Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z) - CIFF-Net: Contextual Image Feature Fusion for Melanoma Diagnosis [0.4129225533930966]
Melanoma is considered the deadliest variant of skin cancer, causing around 75% of all skin cancer deaths.
To diagnose Melanoma, clinicians assess and compare multiple skin lesions of the same patient concurrently.
This concurrent multi-image comparative method has not been explored by existing deep learning-based schemes.
arXiv Detail & Related papers (2023-03-07T06:16:10Z) - Multimodal Information Fusion for Glaucoma and DR Classification [1.5616442980374279]
Multimodal information is frequently available in medical tasks. By combining information from multiple sources, clinicians are able to make more accurate judgments.
Our paper investigates three multimodal information fusion strategies based on deep learning to solve retinal analysis tasks.
arXiv Detail & Related papers (2022-09-02T12:19:03Z) - A Multi-Stage Attentive Transfer Learning Framework for Improving
COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z) - SAG-GAN: Semi-Supervised Attention-Guided GANs for Data Augmentation on
Medical Images [47.35184075381965]
We present a data augmentation method for generating synthetic medical images using cycle-consistency Generative Adversarial Networks (GANs).
The proposed GANs-based model can generate a tumor image from a normal image, and in turn, it can also generate a normal image from a tumor image.
We train one classification model using real images with classic data augmentation methods and another using synthetic images; a generic sketch of the underlying cycle-consistency idea is given below.
arXiv Detail & Related papers (2020-11-15T14:01:24Z)
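The SAG-GAN entry above relies on cycle-consistency between the normal and tumor image domains. The snippet below is a minimal, generic sketch of that cycle-consistency loss only; the toy one-layer generators and all names are assumptions and omit SAG-GAN's adversarial and attention-guided components.

```python
import torch
import torch.nn as nn

# Toy generators mapping between the "normal" and "tumor" image domains.
# A real model would use conv encoder-decoder generators; one conv layer keeps this runnable.
G_normal_to_tumor = nn.Sequential(nn.Conv2d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())
G_tumor_to_normal = nn.Sequential(nn.Conv2d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())

l1 = nn.L1Loss()


def cycle_consistency_loss(normal_batch, tumor_batch, lam=10.0):
    """Translate each batch to the other domain and back; the round trip should
    reproduce the original image, which keeps the translations content-preserving."""
    rec_normal = G_tumor_to_normal(G_normal_to_tumor(normal_batch))
    rec_tumor = G_normal_to_tumor(G_tumor_to_normal(tumor_batch))
    return lam * (l1(rec_normal, normal_batch) + l1(rec_tumor, tumor_batch))


# Stand-in data: two 64x64 single-channel images per domain.
normal = torch.rand(2, 1, 64, 64)
tumor = torch.rand(2, 1, 64, 64)
loss = cycle_consistency_loss(normal, tumor)
loss.backward()
print(float(loss))
```

In a full CycleGAN-style setup this term is added to adversarial losses from per-domain discriminators, and the resulting synthetic images are then used to augment the training set for the downstream classifier.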