CliMB: An AI-enabled Partner for Clinical Predictive Modeling
- URL: http://arxiv.org/abs/2410.03736v2
- Date: Mon, 25 Nov 2024 16:21:05 GMT
- Title: CliMB: An AI-enabled Partner for Clinical Predictive Modeling
- Authors: Evgeny Saveliev, Tim Schubert, Thomas Pouplin, Vasilis Kosmoliaptsis, Mihaela van der Schaar
- Abstract summary: CliMB is a no-code AI-enabled partner designed to empower clinician scientists to create predictive models using natural language.
CliMB guides clinician scientists through the entire medical data science pipeline.
CliMB consistently demonstrated superior performance in key areas such as planning, error prevention, code execution, and model performance.
- Score: 42.32743590150279
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite its significant promise and continuous technical advances, real-world applications of artificial intelligence (AI) remain limited. We attribute this to the "domain expert-AI-conundrum": while domain experts, such as clinician scientists, should be able to build predictive models such as risk scores, they face substantial barriers in accessing state-of-the-art (SOTA) tools. While automated machine learning (AutoML) has been proposed as a partner in clinical predictive modeling, many additional requirements need to be fulfilled to make machine learning accessible for clinician scientists. To address this gap, we introduce CliMB, a no-code AI-enabled partner designed to empower clinician scientists to create predictive models using natural language. CliMB guides clinician scientists through the entire medical data science pipeline, thus empowering them to create predictive models from real-world data in just one conversation. CliMB also creates structured reports and interpretable visuals. In evaluations involving clinician scientists and systematic comparisons against a baseline GPT-4, CliMB consistently demonstrated superior performance in key areas such as planning, error prevention, code execution, and model performance. Moreover, in blinded assessments involving 45 clinicians from diverse specialties and career stages, more than 80% preferred CliMB over GPT-4. Overall, by providing a no-code interface with clear guidance and access to SOTA methods in the fields of data-centric AI, AutoML, and interpretable ML, CliMB empowers clinician scientists to build robust predictive models. The proof-of-concept version of CliMB is available as open-source software on GitHub: https://github.com/vanderschaarlab/climb.
Related papers
- MedGPT-oss: Training a General-Purpose Vision-Language Model for Biomedicine [38.06252990946545]
We introduce MEDGPT-OSS, an open-weight, 20B-parameter vision-language model designed to facilitate open research in clinical AI.
Rather than relying on architectural complexity, MEDGPT-OSS pairs the GPT-oss language backbone with a visual front-end via an optimized, three-stage training curriculum.
It successfully outperforms larger open medical models on out-of-distribution multimodal reasoning and complex text-only clinical tasks.
arXiv Detail & Related papers (2026-03-01T00:06:43Z) - Patient Digital Twins for Chronic Care: Technical Hurdles, Lessons Learned, and the Road Ahead [0.0]
Chronic diseases constitute the principal burden of morbidity, mortality and healthcare costs worldwide.
Patient Medical Digital Twins (PMDTs) offer a paradigm shift: holistic, continuously updated digital counterparts of patients that integrate clinical, genomic, lifestyle, and quality-of-life data.
arXiv Detail & Related papers (2026-02-11T13:07:00Z) - A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms [0.03262230127283451]
We introduce a model-driven engineering (MDE) framework designed specifically for healthcare AI.
The framework relies on formal metamodels, domain-specific languages, and automated transformations to move from high-level specifications to running software.
We evaluate this approach in a multi-center cancer immunotherapy study.
arXiv Detail & Related papers (2025-10-10T12:00:12Z) - A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI [70.06771291117965]
We introduce Biomedica, an open-source dataset derived from the PubMed Central Open Access subset.
Biomedica contains over 6 million scientific articles and 24 million image-text pairs.
We provide scalable streaming and search APIs through a web server, facilitating seamless integration with AI systems.
arXiv Detail & Related papers (2025-03-26T05:56:46Z) - Launching Insights: A Pilot Study on Leveraging Real-World Observational Data from the Mayo Clinic Platform to Advance Clinical Research [15.04629464273677]
The Mayo Clinic Platform (MCP) was established to address challenges by providing a scalable ecosystem to support clinical research and AI development.
We conducted four research projects leveraging MCP's data infrastructure and analytical capabilities to demonstrate its potential in facilitating real-world evidence generation and AI-driven clinical insights.
arXiv Detail & Related papers (2025-03-21T16:06:21Z) - From large language models to multimodal AI: A scoping review on the potential of generative AI in medicine [40.23383597339471]
Multimodal AI is capable of integrating diverse data modalities, including imaging, text, and structured data, within a single model.
This scoping review explores the evolution of multimodal AI, highlighting its methods, applications, datasets, and evaluation in clinical settings.
Our findings underscore a shift from unimodal to multimodal approaches, driving innovations in diagnostic support, medical report generation, drug discovery, and conversational AI.
arXiv Detail & Related papers (2025-02-13T11:57:51Z) - Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models.
Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z) - TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z) - Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology [0.6397820821509177]
We introduce an alternative approach to multimodal medical AI that utilizes the generalist capabilities of a large language model (LLM) as a central reasoning engine.
This engine autonomously coordinates and deploys a set of specialized medical AI tools.
We show that the system is highly capable of employing appropriate tools (97%), drawing correct conclusions (93.6%), and providing complete (94%) and helpful (89.2%) recommendations for individual patient cases.
arXiv Detail & Related papers (2024-04-06T15:50:19Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We explore training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z) - Deployment of a Robust and Explainable Mortality Prediction Model: The COVID-19 Pandemic and Beyond [0.59374762912328]
This study investigated the performance, explainability, and robustness of deployed artificial intelligence (AI) models in predicting mortality during the COVID-19 pandemic and beyond.
arXiv Detail & Related papers (2023-11-28T18:15:53Z) - AutoPrognosis 2.0: Democratizing Diagnostic and Prognostic Modeling in Healthcare with Automated Machine Learning [72.2614468437919]
We present a machine learning framework, AutoPrognosis 2.0, to develop diagnostic and prognostic models.
We provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank.
Our risk score has been implemented as a web-based decision support tool and can be publicly accessed by patients and clinicians worldwide.
arXiv Detail & Related papers (2022-10-21T16:31:46Z) - FedStack: Personalized activity monitoring using stacked federated learning [12.792461572028449]
Federated learning is a relatively new AI technique designed to enhance data privacy.
Traditional federated learning requires identical architectural models to be trained across the local clients and global servers.
This work offers a protected privacy system for hospitalized in-patients in a decentralized approach.
arXiv Detail & Related papers (2022-09-27T00:12:44Z) - Privacy-Preserving Technology to Help Millions of People: Federated Prediction Model for Stroke Prevention [25.276264953982253]
Our scientists and engineers propose a privacy-preserving scheme to predict the risk of stroke and deploy our federated prediction model on cloud servers.
Our model trains over all the healthcare data from hospitals in a certain city without actual data sharing among them.
Especially for small hospitals with few confirmed stroke cases, our federated model boosts model performance by 10% to 20% in several machine learning metrics.
arXiv Detail & Related papers (2020-06-15T08:51:23Z)