Fusion of medical imaging and electronic health records with attention
and multi-head mechanisms
- URL: http://arxiv.org/abs/2112.11710v1
- Date: Wed, 22 Dec 2021 07:39:26 GMT
- Title: Fusion of medical imaging and electronic health records with attention
and multi-head mechanisms
- Authors: Cheng Jiang, Yihao Chen, Jianbo Chang, Ming Feng, Renzhi Wang, Jianhua
Yao
- Abstract summary: We propose a multi-modal attention module that uses EHR data to guide the selection of important regions during the image feature extraction process.
We also propose incorporating a multi-head mechanism into the gated multimodal unit (GMU), enabling it to fuse image and EHR features in parallel across different subspaces.
Experiments on predicting the Glasgow Outcome Scale (GOS) of intracerebral hemorrhage patients and on classifying Alzheimer's disease showed that the proposed method can automatically focus on task-related areas.
- Score: 4.433829714749366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Doctors often make diagnostic decisions based on a patient's image scans,
such as magnetic resonance imaging (MRI), and the patient's electronic health
records (EHR), such as age, gender, and blood pressure. Although many automatic
methods have been proposed for either image or text analysis in computer vision
or natural language research, far fewer studies have addressed the fusion of
medical image and EHR data for medical problems. Among existing early or
intermediate fusion methods, concatenation of features from both modalities is
still the mainstream approach. To better exploit image and EHR data, we propose
a multi-modal attention module that uses EHR data to guide the selection of
important regions during the image feature extraction process performed by a
conventional CNN. Moreover, we propose to incorporate a multi-head mechanism
into the gated multimodal unit (GMU), enabling it to fuse image and EHR
features in parallel across different subspaces. With the help of these two
modules, existing CNN architectures can be enhanced using both modalities.
Experiments on predicting the Glasgow Outcome Scale (GOS) of intracerebral
hemorrhage patients and on classifying Alzheimer's disease showed that the
proposed method can automatically focus on task-related areas and achieve
better results by making better use of image and EHR features.
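The two modules described above (EHR-guided attention during image feature extraction, and a multi-head GMU that fuses the modalities in parallel subspaces) can be sketched in plain Python. This is a minimal toy illustration with random weights, assuming channel-wise attention over a pooled image feature vector and one scalar gate per head; the paper's actual layer shapes, spatial attention maps, and training procedure are not reproduced here.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, x):
    # W: list of rows, x: vector -> W @ x
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_mat(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class EHRAttention:
    """EHR-conditioned channel attention (toy version): EHR features produce
    per-channel weights in (0, 1) that rescale the image feature vector."""
    def __init__(self, ehr_dim, img_dim, rng):
        self.Wa = rand_mat(rng, img_dim, ehr_dim)

    def forward(self, x_img, x_ehr):
        att = [sigmoid(a) for a in matvec(self.Wa, x_ehr)]
        return [a * v for a, v in zip(att, x_img)]

class MultiHeadGMU:
    """Multi-head gated multimodal unit (toy version): each head projects
    both modalities into its own subspace and mixes them with a learned
    scalar gate; head outputs are concatenated."""
    def __init__(self, img_dim, ehr_dim, head_dim, n_heads, rng):
        self.heads = [
            {
                "Wv": rand_mat(rng, head_dim, img_dim),  # image projection
                "We": rand_mat(rng, head_dim, ehr_dim),  # EHR projection
                "wz": [rng.uniform(-0.1, 0.1) for _ in range(2 * head_dim)],
            }
            for _ in range(n_heads)
        ]

    def forward(self, x_img, x_ehr):
        fused = []
        for h in self.heads:
            hv = [math.tanh(v) for v in matvec(h["Wv"], x_img)]
            he = [math.tanh(v) for v in matvec(h["We"], x_ehr)]
            # one gate per head, computed from both projected modalities
            z = sigmoid(sum(w * v for w, v in zip(h["wz"], hv + he)))
            fused.extend(z * a + (1.0 - z) * b for a, b in zip(hv, he))
        return fused

rng = random.Random(0)
attn = EHRAttention(ehr_dim=4, img_dim=8, rng=rng)
gmu = MultiHeadGMU(img_dim=8, ehr_dim=4, head_dim=3, n_heads=4, rng=rng)

x_img = [0.5] * 8             # pooled CNN image features (hypothetical)
x_ehr = [0.6, 1.0, 0.3, 0.8]  # normalized EHR values (hypothetical)
out = gmu.forward(attn.forward(x_img, x_ehr), x_ehr)
print(len(out))  # 4 heads x 3 dims per head = 12
```

Because each head's gate is a convex combination of tanh-squashed projections, every fused value stays in (-1, 1); a real implementation would learn all weights end-to-end with the downstream classifier.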
Related papers
- Private, Efficient and Scalable Kernel Learning for Medical Image Analysis [1.7999333451993955]
OKRA (Orthonormal K-fRAmes) is a novel randomized encoding-based approach for kernel-based machine learning.
It significantly enhances scalability and speed compared to current state-of-the-art solutions.
arXiv Detail & Related papers (2024-10-21T10:03:03Z) - Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z) - HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling [4.44283662576491]
We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements.
We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods.
arXiv Detail & Related papers (2024-03-20T05:50:04Z) - Radiology Report Generation Using Transformers Conditioned with
Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z) - Three-Dimensional Medical Image Fusion with Deformable Cross-Attention [10.26573411162757]
Multimodal medical image fusion plays an instrumental role in several areas of medical image processing.
Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image.
In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations.
arXiv Detail & Related papers (2023-10-10T04:10:56Z) - Multi-modal Graph Neural Network for Early Diagnosis of Alzheimer's
Disease from sMRI and PET Scans [11.420077093805382]
We propose to use graph neural networks (GNN) that are designed to deal with problems in non-Euclidean domains.
In this study, we demonstrate how brain networks can be created from sMRI or PET images.
We then present a multi-modal GNN framework where each modality has its own branch of GNN and a technique is proposed to combine the multi-modal data.
arXiv Detail & Related papers (2023-07-31T02:04:05Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Case Studies on X-Ray Imaging, MRI and Nuclear Imaging [0.0]
We will focus on how AI-based approaches, particularly the use of Convolutional Neural Networks (CNN), can assist in disease detection through medical imaging technology.
CNN is a commonly used approach for image analysis due to its ability to extract features from raw input images.
arXiv Detail & Related papers (2023-06-03T09:05:35Z) - AlignTransformer: Hierarchical Alignment of Visual Regions and Disease
Tags for Medical Report Generation [50.21065317817769]
We propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that the AlignTransformer can achieve results competitive with state-of-the-art methods on the two datasets.
arXiv Detail & Related papers (2022-03-18T13:43:53Z) - Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z) - Universal Model for Multi-Domain Medical Image Retrieval [88.67940265012638]
Medical Image Retrieval (MIR) helps doctors quickly find similar patients' data.
MIR is becoming increasingly helpful due to the wide use of digital imaging modalities.
However, the popularity of various digital imaging modalities in hospitals also poses several challenges to MIR.
arXiv Detail & Related papers (2020-07-14T23:22:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.