Application of Multimodal Fusion Deep Learning Model in Disease Recognition
- URL: http://arxiv.org/abs/2406.18546v1
- Date: Wed, 22 May 2024 23:09:49 GMT
- Title: Application of Multimodal Fusion Deep Learning Model in Disease Recognition
- Authors: Xiaoyi Liu, Hongjie Qiu, Muqing Li, Zhou Yu, Yutian Yang, Yafeng Yan,
- Abstract summary: This paper introduces an innovative multimodal fusion deep learning approach to overcome the drawbacks of traditional single-modal recognition techniques.
During the feature extraction stage, cutting-edge deep learning models are applied to distill advanced features from image-based, temporal, and structured data sources.
The findings demonstrate significant advantages of the multimodal fusion model across multiple evaluation metrics.
- Score: 14.655086303102575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces an innovative multimodal fusion deep learning approach to overcome the drawbacks of traditional single-modal recognition techniques, namely incomplete information and limited diagnostic accuracy. During the feature extraction stage, cutting-edge deep learning models including convolutional neural networks (CNN), recurrent neural networks (RNN), and transformers are applied to distill advanced features from image-based, temporal, and structured data sources. The fusion strategy component seeks to determine the optimal fusion mode tailored to the specific disease recognition task. In the experimental section, the performance of the proposed multimodal fusion model is compared against existing single-modal recognition methods. The findings demonstrate significant advantages of the multimodal fusion model across multiple evaluation metrics.
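A minimal PyTorch sketch of the pipeline the abstract describes: a CNN branch for image data, an RNN branch for temporal data, and a transformer branch for structured data, fused by concatenation before classification. All layer sizes, branch designs, and the concatenation-based fusion are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MultimodalFusionNet(nn.Module):
    def __init__(self, n_classes: int, d: int = 128):
        super().__init__()
        # Image branch: a small CNN ending in global average pooling.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, d),
        )
        # Temporal branch: a GRU over (batch, time, features) sequences.
        self.rnn = nn.GRU(input_size=16, hidden_size=d, batch_first=True)
        # Structured branch: a transformer encoder over per-feature "tokens".
        self.embed = nn.Linear(1, d)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        # Fusion head: concatenate the three modality vectors, then classify.
        self.head = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, n_classes))

    def forward(self, image, sequence, table):
        f_img = self.cnn(image)                        # (B, d)
        _, h = self.rnn(sequence)                      # h: (1, B, d)
        f_seq = h.squeeze(0)                           # (B, d)
        tokens = self.embed(table.unsqueeze(-1))       # (B, F, d)
        f_tab = self.transformer(tokens).mean(dim=1)   # (B, d)
        return self.head(torch.cat([f_img, f_seq, f_tab], dim=1))

model = MultimodalFusionNet(n_classes=2)
logits = model(torch.randn(4, 3, 64, 64),   # images
               torch.randn(4, 20, 16),      # time series
               torch.randn(4, 10))          # structured features
```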
Related papers
- Joint Self-Supervised and Supervised Contrastive Learning for Multimodal MRI Data: Towards Predicting Abnormal Neurodevelopment [5.771221868064265]
We present a novel joint self-supervised and supervised contrastive learning method to learn the robust latent feature representation from multimodal MRI data.
Our method has the capability to facilitate computer-aided diagnosis within clinical practice, harnessing the power of multimodal data.
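A hedged sketch of how a joint objective of this kind can be assembled: a self-supervised NT-Xent term over two augmented views plus a supervised contrastive term that treats same-label samples as positives. The loss weighting, temperature, and exact formulation are assumptions, not the paper's published objective.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Self-supervised contrastive loss between two augmented views."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)           # (2B, d)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))                      # exclude self-pairs
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
    return F.cross_entropy(sim, targets)

def sup_con(z, labels, tau=0.5):
    """Supervised contrastive loss: same-label samples are positives."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos.fill_diagonal_(False)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

def joint_loss(z1, z2, labels, lambda_sup=1.0):
    # Weighted sum of the self-supervised and supervised terms (assumption).
    return nt_xent(z1, z2) + lambda_sup * sup_con(torch.cat([z1, z2]), labels.repeat(2))
```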
arXiv Detail & Related papers (2023-12-22T21:05:51Z)
- Multi-modal Medical Neurological Image Fusion using Wavelet Pooled Edge Preserving Autoencoder [3.3828292731430545]
This paper presents an end-to-end unsupervised fusion model for multimodal medical images based on an edge-preserving dense autoencoder network.
In the proposed model, feature extraction is improved by using wavelet decomposition-based attention pooling of feature maps.
The proposed model is trained on a variety of medical image pairs which helps in capturing the intensity distributions of the source images.
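A minimal sketch of wavelet-decomposition-based pooling of feature maps: a single-level Haar transform replaces strided pooling and the four sub-bands are reweighted by a learned attention. The attention parameterization is an illustrative assumption.

```python
import torch
import torch.nn as nn

class HaarWaveletPool(nn.Module):
    """Downsample by a 2x2 Haar transform and reweight the four sub-bands."""
    def __init__(self, channels: int):
        super().__init__()
        # Learned per-channel attention over the LL/LH/HL/HH sub-bands (assumption).
        self.attn = nn.Parameter(torch.ones(4, 1, channels, 1, 1))

    def forward(self, x):
        a = x[:, :, 0::2, 0::2]
        b = x[:, :, 0::2, 1::2]
        c = x[:, :, 1::2, 0::2]
        d = x[:, :, 1::2, 1::2]
        ll = (a + b + c + d) / 2                     # approximation band
        lh = (a - b + c - d) / 2                     # horizontal detail
        hl = (a + b - c - d) / 2                     # vertical detail
        hh = (a - b - c + d) / 2                     # diagonal detail
        bands = torch.stack([ll, lh, hl, hh])        # (4, B, C, H/2, W/2)
        w = torch.softmax(self.attn, dim=0)          # normalize across bands
        return (bands * w).sum(dim=0)                # attention-pooled map

pooled = HaarWaveletPool(64)(torch.randn(2, 64, 32, 32))  # -> (2, 64, 16, 16)
```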
arXiv Detail & Related papers (2023-10-18T11:59:35Z)
- Convolutional neural network based on sparse graph attention mechanism for MRI super-resolution [0.34410212782758043]
Medical image super-resolution (SR) reconstruction using deep learning techniques can enhance lesion analysis and assist doctors in improving diagnostic efficiency and accuracy.
Existing deep learning-based SR methods rely on convolutional neural networks (CNNs), which inherently limit the expressive capabilities of these models.
We propose a network that utilizes multiple convolution operator (MCO) feature extraction modules to extract image features.
arXiv Detail & Related papers (2023-05-29T06:14:22Z)
- Generalizing Multimodal Variational Methods to Sets [35.69942798534849]
This paper presents a novel variational method on sets called the Set Multimodal VAE (SMVAE) for learning a multimodal latent space.
By modeling the joint-modality posterior distribution directly, the proposed SMVAE learns to exchange information between multiple modalities and compensate for the drawbacks caused by factorization.
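One common way to model a joint-modality posterior directly is a product of Gaussian experts; the sketch below illustrates that pattern, though the SMVAE's actual set-based aggregation may differ.

```python
import torch

def product_of_experts(mus, logvars):
    """Combine per-modality Gaussian posteriors q(z|x_m) into one joint Gaussian."""
    precisions = torch.stack([(-lv).exp() for lv in logvars])   # 1/sigma^2 per expert
    joint_var = 1.0 / precisions.sum(dim=0)
    joint_mu = joint_var * (torch.stack(mus) * precisions).sum(dim=0)
    return joint_mu, joint_var.log()

# Two modalities, each encoded to a 16-dim Gaussian posterior:
mu, logvar = product_of_experts(
    [torch.randn(8, 16), torch.randn(8, 16)],
    [torch.zeros(8, 16), torch.zeros(8, 16)],
)
z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)  # reparameterized sample
```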
arXiv Detail & Related papers (2022-12-19T23:50:19Z)
- Evidence fusion with contextual discounting for multi-modality medical image segmentation [22.77837744216949]
The framework is composed of an encoder-decoder feature extraction module, an evidential segmentation module that computes a belief function at each voxel for each modality, and a multi-modality evidence fusion module.
The method was evaluated on the BraTS 2021 database of 1251 patients with brain tumors.
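A small worked example of the evidence-fusion idea behind this framework: per-modality mass functions over {tumor, background} are discounted by a reliability factor and combined with Dempster's rule. The masses and discount rates here are made-up numbers; the paper learns contextual discounting end to end.

```python
FRAME = frozenset({"T", "B"})  # frame of discernment: tumor / background

def discount(m, alpha):
    """Shafer discounting: keep alpha of the evidence, move the rest to ignorance."""
    out = {A: alpha * v for A, v in m.items() if A != FRAME}
    out[FRAME] = 1.0 - sum(out.values())
    return out

def dempster(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    combined, conflict = {}, 0.0
    for A, v1 in m1.items():
        for B_, v2 in m2.items():
            inter = A & B_
            if inter:
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:
                conflict += v1 * v2
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

# Two modalities' beliefs at one voxel, each discounted by its reliability:
m_t1 = discount({frozenset({"T"}): 0.7, frozenset({"B"}): 0.2, FRAME: 0.1}, alpha=0.9)
m_t2 = discount({frozenset({"T"}): 0.4, frozenset({"B"}): 0.5, FRAME: 0.1}, alpha=0.6)
fused = dempster(m_t1, m_t2)   # fused voxel-level beliefs
```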
arXiv Detail & Related papers (2022-06-23T14:36:50Z)
- MMLatch: Bottom-up Top-down Fusion for Multimodal Sentiment Analysis [84.7287684402508]
Current deep learning approaches for multimodal fusion rely on bottom-up fusion of high and mid-level latent modality representations.
Models of human perception highlight the importance of top-down fusion, where high-level representations affect the way sensory inputs are perceived.
We propose a neural architecture that captures top-down cross-modal interactions, using a feedback mechanism in the forward pass during network training.
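A sketch of one plausible top-down feedback step: a high-level crossmodal summary produces sigmoid gates that mask the low-level features of another modality on the forward pass. The gating form and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TopDownGate(nn.Module):
    def __init__(self, d_high: int, d_low: int):
        super().__init__()
        self.proj = nn.Linear(d_high, d_low)

    def forward(self, low, high):
        gate = torch.sigmoid(self.proj(high)).unsqueeze(1)  # (B, 1, d_low)
        return low * gate                                    # mask low-level features

# Text features gated by a fused high-level audio+vision summary:
gate = TopDownGate(d_high=64, d_low=32)
text = torch.randn(4, 10, 32)                  # (batch, time, features)
fused_summary = torch.randn(4, 64)             # high-level crossmodal state
text_top_down = gate(text, fused_summary)
```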
arXiv Detail & Related papers (2022-01-24T17:48:04Z)
- Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
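A minimal sketch of injecting stochasticity variationally into the latent space of an autoencoder-based collaborative-filtering model; the dimensions and single-layer encoder/decoder are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VariationalCF(nn.Module):
    def __init__(self, n_items: int, d: int = 64):
        super().__init__()
        self.enc = nn.Linear(n_items, 2 * d)     # outputs mu and logvar
        self.dec = nn.Linear(d, n_items)

    def forward(self, ratings):
        mu, logvar = self.enc(ratings).chunk(2, dim=1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)  # reparameterization
        return self.dec(z), mu, logvar

model = VariationalCF(n_items=1000)
recon, mu, logvar = model(torch.rand(8, 1000))
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(1).mean()
```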
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
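A compact sketch of the two-stage idea: a frozen stage-one (penalty-based, beta-VAE-style) encoder-decoder yields disentangled factors and a coarse reconstruction, and a second model is trained to refine it. The refiner design below is an assumption for illustration, not the paper's exact stage-two model.

```python
import torch
import torch.nn as nn

d_in, d_z = 784, 10

# Stage 1 (assumed already trained with a beta-weighted KL penalty):
encoder = nn.Linear(d_in, d_z)
decoder = nn.Linear(d_z, d_in)
for p in list(encoder.parameters()) + list(decoder.parameters()):
    p.requires_grad = False                     # freeze the disentangled stage

# Stage 2: a second generative model refines the blurry stage-1 output.
refiner = nn.Sequential(nn.Linear(d_in + d_z, 256), nn.ReLU(), nn.Linear(256, d_in))

x = torch.rand(16, d_in)
z = encoder(x)                                  # disentangled factors
coarse = decoder(z)                             # low-quality reconstruction
fine = coarse + refiner(torch.cat([coarse, z], dim=1))  # improved output
loss = nn.functional.mse_loss(fine, x)
```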
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
We propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis.
In our Hi-Net, a modality-specific network is utilized to learn representations for each individual modality.
A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
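A compact sketch of the pattern: one encoder per source modality, with a synthesis network that combines their latent codes to produce the target modality. Shapes and the concatenation fusion are illustrative; the actual Hi-Net fuses hierarchical features at several depths.

```python
import torch
import torch.nn as nn

# Modality-specific encoders (one per source MR sequence):
enc_t1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
enc_t2 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
# Synthesis network over the fused latent codes:
synth = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))

t1, t2 = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
latent = torch.cat([enc_t1(t1), enc_t2(t2)], dim=1)  # fuse modality codes
target_hat = synth(latent)                            # synthesized target modality
```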
arXiv Detail & Related papers (2020-02-11T08:26:42Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
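A sketch of what such heavy parameter reuse can look like: every convolution is shared between CT and MRI while normalization layers stay modality-specific. The per-modality BatchNorm split is an assumption about how the sharing is made workable, not necessarily the paper's exact scheme.

```python
import torch
import torch.nn as nn

class SharedConvBlock(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)   # kernels shared across modalities
        self.norm = nn.ModuleDict({"ct": nn.BatchNorm2d(c_out),
                                   "mri": nn.BatchNorm2d(c_out)})

    def forward(self, x, modality: str):
        return torch.relu(self.norm[modality](self.conv(x)))

block = SharedConvBlock(1, 16)
ct_feat = block(torch.randn(2, 1, 64, 64), "ct")
mri_feat = block(torch.randn(2, 1, 64, 64), "mri")
```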
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.