OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models
in Medicine
- URL: http://arxiv.org/abs/2402.18028v2
- Date: Mon, 4 Mar 2024 02:22:58 GMT
- Title: OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models
in Medicine
- Authors: Xiaosong Wang and Xiaofan Zhang and Guotai Wang and Junjun He and
Zhongyu Li and Wentao Zhu and Yi Guo and Qi Dou and Xiaoxiao Li and Dequan
Wang and Liang Hong and Qicheng Lao and Tong Ruan and Yukun Zhou and Yixue Li
and Jie Zhao and Kang Li and Xin Sun and Lifeng Zhu and Shaoting Zhang
- Abstract summary: We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions from pioneering attempts at prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
- Score: 55.29668193415034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emerging trend of advancing generalist artificial intelligence, such as
GPT-4 and Gemini, has reshaped the landscape of research (academia and
industry) in machine learning and many other research areas. However,
domain-specific applications of such foundation models (e.g., in medicine)
remain largely unexplored or are at a very early stage. Adapting them requires
a dedicated set of transfer learning and model adaptation techniques that
further expand these models and inject them with domain knowledge and data. The
development of such technologies could be greatly accelerated if the bundle of
data, algorithms, and pre-trained foundation models were gathered together and
open-sourced in an organized manner. In this work, we present OpenMEDLab, an
open-source platform for multi-modality foundation models. It encapsulates not
only solutions from pioneering attempts at prompting and fine-tuning large
language and vision models for frontline clinical and bioinformatic
applications, but also approaches for building domain-specific foundation
models with large-scale multi-modal medical data. Importantly, it opens access
to a group of pre-trained foundation models for various medical image
modalities, clinical text, protein engineering, etc. Inspiring and competitive
results are also demonstrated for each collected approach and model on a
variety of benchmarks for downstream tasks. We welcome researchers in the field
of medical artificial intelligence to continuously contribute cutting-edge
methods and models to OpenMEDLab, which can be accessed via
https://github.com/openmedlab.
Related papers
- The Era of Foundation Models in Medical Imaging is Approaching: A Scoping Review of the Clinical Value of Large-Scale Generative AI Applications in Radiology [0.0]
Social problems stemming from the shortage of radiologists are intensifying, and artificial intelligence is being highlighted as a potential solution.
Recently emerging large-scale generative AI has expanded from large language models (LLMs) to multi-modal models.
This scoping review systematically organizes existing literature on the clinical value of large-scale generative AI applications.
arXiv Detail & Related papers (2024-09-03T00:48:50Z)
- FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models [54.09244105445476]
This study introduces a novel knowledge injection approach, FedKIM, to scale the medical foundation model within a federated learning framework.
FedKIM leverages lightweight local models to extract healthcare knowledge from private data and integrates this knowledge into a centralized foundation model.
Our experiments across twelve tasks in seven modalities demonstrate the effectiveness of FedKIM in various settings.
arXiv Detail & Related papers (2024-08-17T15:42:29Z)
- Automated Ensemble Multimodal Machine Learning for Healthcare [52.500923923797835]
We introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning.
AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies.
arXiv Detail & Related papers (2024-07-25T17:46:38Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
This work trains open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture; a generic early-fusion sketch is included after this list.
We conduct multimodal survival analysis on whole slide images and multi-omic data across four cancer datasets from The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
- Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision [6.2847894163744105]
Foundation models are large-scale, pre-trained deep-learning models adapted to a wide range of downstream tasks.
These models facilitate contextual reasoning, generalization, and prompt capabilities at test time.
Capitalizing on advances in computer vision, the medical imaging community has also shown growing interest in these models.
arXiv Detail & Related papers (2023-10-28T12:08:12Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using multimodal imaging and genetic data from the Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
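As referenced in the HEALNet entry above, the sketch below illustrates the general idea of early multimodal fusion with attention: project each modality (e.g., whole-slide-image features and multi-omic features) into a shared embedding space, let a transformer layer attend across the modality tokens, and predict a single risk score. It is a hedged, generic illustration, not the HEALNet implementation; all class names, dimensions, and inputs are placeholders.

```python
import torch
import torch.nn as nn

class EarlyFusionAttention(nn.Module):
    """Generic early fusion: per-modality projections + cross-modality attention."""

    def __init__(self, image_dim: int, omics_dim: int, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Modality-specific projections into a shared embedding space.
        self.image_proj = nn.Linear(image_dim, d_model)
        self.omics_proj = nn.Linear(omics_dim, d_model)
        self.attn = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)  # e.g., a survival risk score

    def forward(self, image_feats: torch.Tensor, omics_feats: torch.Tensor) -> torch.Tensor:
        # Fuse early: stack one token per modality and attend across them.
        tokens = torch.stack(
            [self.image_proj(image_feats), self.omics_proj(omics_feats)], dim=1
        )  # shape: (batch, 2 modalities, d_model)
        fused = self.attn(tokens).mean(dim=1)  # pool over modality tokens
        return self.head(fused)

# Smoke test with random features standing in for WSI and multi-omic inputs.
model = EarlyFusionAttention(image_dim=512, omics_dim=200)
risk = model(torch.randn(8, 512), torch.randn(8, 200))  # -> shape (8, 1)
```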
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.