Integrated multimodal artificial intelligence framework for healthcare
applications
- URL: http://arxiv.org/abs/2202.12998v4
- Date: Mon, 26 Sep 2022 19:00:17 GMT
- Title: Integrated multimodal artificial intelligence framework for healthcare
applications
- Authors: Luis R. Soenksen, Yu Ma, Cynthia Zeng, Leonard D.J. Boussioux,
Kimberly Villalobos Carballo, Liangyuan Na, Holly M. Wiberg, Michael L. Li,
Ignacio Fuentes, Dimitris Bertsimas
- Abstract summary: We propose and evaluate a unified Holistic AI in Medicine framework to facilitate the generation and testing of AI systems that leverage multimodal inputs.
Our approach uses generalizable data pre-processing and machine learning modeling stages that can be readily adapted for research and deployment in healthcare environments.
We show that this framework can consistently and robustly produce models that outperform similar single-source approaches across various healthcare demonstrations.
- Score: 3.6222901399459215
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Artificial intelligence (AI) systems hold great promise to improve healthcare
over the next decades. Specifically, AI systems leveraging multiple data
sources and input modalities are poised to become a viable method to deliver
more accurate results and deployable pipelines across a wide range of
applications. In this work, we propose and evaluate a unified Holistic AI in
Medicine (HAIM) framework to facilitate the generation and testing of AI
systems that leverage multimodal inputs. Our approach uses generalizable data
pre-processing and machine learning modeling stages that can be readily adapted
for research and deployment in healthcare environments. We evaluate our HAIM
framework by training and characterizing 14,324 independent models based on
HAIM-MIMIC-MM, a multimodal clinical database (N=34,537 samples) containing
7,279 unique hospitalizations and 6,485 patients, spanning all possible input
combinations of 4 data modalities (i.e., tabular, time-series, text, and
images), 11 unique data sources and 12 predictive tasks. We show that this
framework can consistently and robustly produce models that outperform similar
single-source approaches across various healthcare demonstrations (by 6-33%),
including 10 distinct chest pathology diagnoses, along with length-of-stay and
48-hour mortality predictions. We also quantify the contribution of each
modality and data source using Shapley values, which demonstrates the
heterogeneity in data modality importance and the necessity of multimodal
inputs across different healthcare-relevant tasks. The generalizable properties
and flexibility of our Holistic AI in Medicine (HAIM) framework could offer a
promising pathway for future multimodal predictive systems in clinical and
operational healthcare settings.
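The HAIM pipeline pairs pre-trained, per-modality embedding extractors with a shared downstream classifier: each data source is embedded into a fixed-length vector, the vectors are concatenated, and a gradient-boosted model (XGBoost in the paper) is trained on the fused features. A minimal sketch of that late-fusion pattern follows; the embedding sizes and synthetic data are illustrative stand-ins:

```python
import numpy as np
from xgboost import XGBClassifier

def fuse_embeddings(tabular, time_series, text, image):
    """Concatenate fixed-length per-modality embeddings into one feature vector."""
    return np.concatenate([tabular, time_series, text, image])

# Synthetic stand-ins for pre-computed embeddings (in HAIM these come from
# pre-trained encoders for each modality and data source).
rng = np.random.default_rng(0)
N = 500
X = np.stack([
    fuse_embeddings(
        rng.normal(size=16),    # tabular features
        rng.normal(size=32),    # time-series embedding
        rng.normal(size=768),   # clinical-note embedding (BERT-style encoder)
        rng.normal(size=1024),  # chest X-ray embedding (CNN backbone)
    )
    for _ in range(N)
])
y = rng.integers(0, 2, size=N)   # e.g., a 48-hour mortality label

model = XGBClassifier(n_estimators=200, eval_metric="logloss")
model.fit(X, y)
```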
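The modality-attribution step treats each data modality as a "player" and computes its Shapley value: the average marginal change in model performance from adding that modality, taken over all subsets of the others. With only four modalities this is exactly computable (2^4 subsets). A hedged sketch, where `score` is assumed to return the performance (e.g., AUROC) of a model retrained on a given modality subset:

```python
from itertools import combinations
from math import factorial

MODALITIES = ("tabular", "time_series", "text", "image")

def shapley_values(score):
    """Exact Shapley value per modality. `score(subset)` is assumed to return
    the performance (e.g., AUROC) of a model trained on that frozenset of
    modalities; score(frozenset()) should be the no-input baseline (0.5)."""
    n = len(MODALITIES)
    values = {}
    for m in MODALITIES:
        others = [x for x in MODALITIES if x != m]
        total = 0.0
        for k in range(n):                      # subset sizes 0 .. n-1
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (score(s | {m}) - score(s))
        values[m] = total
    return values

# Usage: shapley_values(lambda s: auroc_by_subset[s]) with one trained model
# per modality subset (2**4 = 16 models for four modalities).
```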
Related papers
- EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling [22.94521527609479]
EMERGE is a Retrieval-Augmented Generation driven framework aimed at enhancing multimodal EHR predictive modeling.
Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models.
The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses.
arXiv Detail & Related papers (2024-05-27T10:53:15Z)
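EMERGE's extraction stage prompts an LLM over clinical notes (and serialized time-series), then asks for a task-relevant summary of the patient's status. A rough sketch of that two-step flow; the prompt wording and the `complete` callable are illustrative stand-ins, not EMERGE's actual prompts or API:

```python
import json

def extract_entities(clinical_note: str, complete) -> list:
    """Prompt an LLM for clinical entities in free text. `complete` is any
    callable mapping a prompt string to the model's text response (a
    stand-in for whatever chat-completion client is available)."""
    prompt = (
        "Extract the medical entities (diagnoses, symptoms, medications, "
        "procedures) mentioned in the following clinical note. "
        "Respond with a JSON list of strings only.\n\n"
        f"Note:\n{clinical_note}"
    )
    return json.loads(complete(prompt))

def summarize_patient(entities: list, task: str, complete) -> str:
    """Turn extracted entities into a task-relevant health-status summary."""
    prompt = (
        f"Given these clinical entities: {', '.join(entities)}\n"
        f"Write a brief summary of the patient's health status that is "
        f"relevant to predicting {task}."
    )
    return complete(prompt)
```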
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
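CheXprompt scores report factuality by having GPT-4 enumerate errors in a candidate report relative to a reference. A hedged sketch of that evaluation pattern; the prompt and JSON schema below are illustrative, not the paper's template:

```python
import json

def factuality_score(reference: str, candidate: str, complete) -> dict:
    """Ask a strong LLM to enumerate factual errors in a candidate radiology
    report relative to a reference. `complete` maps a prompt string to the
    model's text response; the prompt and schema are illustrative."""
    prompt = (
        "Compare the candidate radiology report against the reference. "
        "Count factual errors in the candidate (false findings, omissions, "
        "wrong locations or severities). Respond with JSON only: "
        '{"num_errors": <int>, "errors": [<string>, ...]}\n\n'
        f"Reference report:\n{reference}\n\nCandidate report:\n{candidate}"
    )
    return json.loads(complete(prompt))
```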
- XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
- HEALNet -- Hybrid Multi-Modal Fusion for Heterogeneous Biomedical Data [12.109041184519281]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multi-modal fusion architecture.
We conduct multi-modal survival analysis on Whole Slide Images and multi-omic data on four cancer cohorts of The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance, substantially improving over both uni-modal and recent multi-modal baselines.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
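HEALNet fuses modalities early, via attention over a shared latent representation, rather than by concatenating late embeddings, which also lets it tolerate missing modalities at inference. A simplified PyTorch sketch in that spirit (not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Simplified early-fusion-by-attention block: a shared latent array
    cross-attends to each modality in turn and skips missing ones
    (in the spirit of HEALNet, not its exact architecture)."""

    def __init__(self, dim=128, n_latents=16, n_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, dim))
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # e.g., a survival-risk score

    def forward(self, modalities):
        # modalities: list of (batch, tokens_i, dim) tensors; None if missing
        b = next(m for m in modalities if m is not None).shape[0]
        z = self.latents.unsqueeze(0).expand(b, -1, -1)
        for m in modalities:
            if m is None:
                continue  # tolerate a missing modality
            out, _ = self.attn(query=z, key=m, value=m)
            z = z + out  # residual update of the shared latents
        return self.head(z.mean(dim=1))

# Usage: slide-patch and omics embeddings projected to a common dimension
model = AttentionFusion(dim=128)
wsi = torch.randn(2, 50, 128)    # 50 slide-patch embeddings
omics = torch.randn(2, 10, 128)  # 10 omics feature tokens
risk = model([wsi, omics])       # shape (2, 1)
```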
- Building Flexible, Scalable, and Machine Learning-ready Multimodal Oncology Datasets [17.774341783844026]
This work proposes the Multimodal Integration of Oncology Data System (MINDS).
MINDS is a flexible, scalable, and cost-effective metadata framework for efficiently fusing disparate data from public sources.
By harmonizing multimodal data, MINDS aims to empower researchers with greater analytical capability.
arXiv Detail & Related papers (2023-09-30T15:44:39Z)
- BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks [68.39821375903591]
Generalist AI holds the potential to address the limitations of specialist models, owing to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
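The incomplete-data approach above has a generator synthesize the representation of a missing modality from the observed ones, with a discriminator keeping imputations close to the real distribution. A toy PyTorch sketch of that idea; the dimensions and architectures are illustrative, not the paper's:

```python
import torch
import torch.nn as nn

img_dim, gen_dim = 256, 128  # illustrative embedding sizes

# Generator imputes a missing genetics embedding from an imaging embedding;
# discriminator separates real from imputed genetics embeddings.
G = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU(), nn.Linear(256, gen_dim))
D = nn.Sequential(nn.Linear(gen_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(imaging, genetics):
    """One adversarial step on paired samples where both modalities exist."""
    fake = G(imaging)

    # Discriminator: real genetics embeddings vs imputed ones
    d_loss = bce(D(genetics), torch.ones(len(genetics), 1)) + \
             bce(D(fake.detach()), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator
    g_loss = bce(D(fake), torch.ones(len(fake), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

imaging = torch.randn(8, img_dim)
genetics = torch.randn(8, gen_dim)
train_step(imaging, genetics)
```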
- Patchwork Learning: A Paradigm Towards Integrative Analysis across Diverse Biomedical Data Sources [40.32772510980854]
"patchwork learning" (PL) is a paradigm that integrates information from disparate datasets composed of different data modalities.
PL allows the simultaneous utilization of complementary data sources while preserving data privacy.
We present the concept of patchwork learning and its current implementations in healthcare, exploring the potential opportunities and applicable data sources.
arXiv Detail & Related papers (2023-05-10T14:50:33Z)
- Artificial Intelligence-Based Methods for Fusion of Electronic Health Records and Imaging Data [0.9749560288448113]
We focus on synthesizing and analyzing the literature that uses AI techniques to fuse multimodal medical data for different clinical applications.
We present a comprehensive analysis of the various fusion strategies, the diseases and clinical outcomes for which multimodal fusion was used, and the available multimodal medical datasets.
arXiv Detail & Related papers (2022-10-23T07:13:37Z)
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed in modality-specific information systems, so the same underlying information is often available under different modalities.
This offers a unique opportunity to obtain and use, at training time, multiple views of the same information that might not always be available at test time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
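CMIM's training objective is mutual-information based; a simpler, related way to get the same robustness property is modality dropout, where a modality is randomly hidden during training so the fused representation remains predictive without it. A hedged sketch of that substitute technique (not CMIM's actual objective):

```python
import torch
import torch.nn as nn

class DropoutFusion(nn.Module):
    """Train-time modality dropout: randomly hide a modality so the fused
    representation stays predictive when that modality is absent at test
    time (a simpler cousin of CMIM's mutual-information objective)."""

    def __init__(self, dims=(256, 768), hidden=128, p_drop=0.3):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, hidden) for d in dims)
        self.p_drop = p_drop
        self.classifier = nn.Linear(hidden, 2)

    def forward(self, inputs):  # list of per-modality tensors (batch, dim_i)
        fused = 0.0
        for x, proj in zip(inputs, self.proj):
            h = proj(x)
            if self.training and torch.rand(()) < self.p_drop:
                h = torch.zeros_like(h)  # this modality is "missing" this step
            fused = fused + h
        return self.classifier(fused)

model = DropoutFusion()
image_emb = torch.randn(4, 256)   # e.g., imaging-system embedding
report_emb = torch.randn(4, 768)  # e.g., report-system embedding
logits = model([image_emb, report_emb])  # robust to either being zeroed
```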