SIBILA: A novel interpretable ensemble of general-purpose machine
learning models applied to medical contexts
- URL: http://arxiv.org/abs/2205.06234v2
- Date: Thu, 27 Apr 2023 15:25:13 GMT
- Title: SIBILA: A novel interpretable ensemble of general-purpose machine
learning models applied to medical contexts
- Authors: Antonio Jesús Banegas-Luna, Horacio Pérez-Sánchez
- Abstract summary: SIBILA is an ensemble of machine learning and deep learning models.
It applies a range of interpretability algorithms to identify the most relevant input features.
It has been applied to two medical case studies to demonstrate its predictive ability in classification problems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalized medicine remains a major challenge for scientists. The rapid
growth of machine learning and deep learning has made them a feasible
alternative for predicting the most appropriate therapy for individual patients.
However, the need to develop a custom model for every dataset, the lack of
interpretability of their results and their high computational requirements make many
practitioners reluctant to use these methods. Aiming to save time and to shed light on the way
models work internally, SIBILA has been developed. SIBILA is an ensemble of
machine learning and deep learning models that applies a range of
interpretability algorithms to identify the most relevant input features. Since
the interpretability algorithms may not agree with one another, a
consensus stage has been implemented to estimate the global attribution of
each variable to the predictions. SIBILA is containerized so that it can be run on any
high-performance computing platform. Although conceived as a command-line
tool, it is also available to all users free of charge as a web server at
https://bio-hpc.ucam.edu/sibila. Thus, even users with few technological skills
can take advantage of it. SIBILA has been applied to two medical case studies
to demonstrate its predictive ability in classification problems. Although it is a
general-purpose tool, it has been developed with the aim of becoming a powerful
decision-making tool for clinicians, and it can also be used in many other
domains. Thus, two other non-medical examples are supplied as supplementary
material to show that SIBILA also performs well in the presence of noise and in regression
problems.
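To make the consensus stage concrete, the sketch below (illustrative only and not taken from SIBILA's code base; the function name, normalization choice and example numbers are assumptions) shows one way to merge per-feature attributions produced by several interpretability methods into a single global ranking:

# Minimal sketch of a consensus attribution stage (hypothetical, not SIBILA's implementation).
# Each interpretability method yields one importance score per input feature; the scores are
# rescaled to [0, 1] so that no single method dominates, then averaged into a global consensus.
import numpy as np

def consensus_attribution(attributions: dict) -> np.ndarray:
    normalized = []
    for scores in attributions.values():
        scores = np.abs(np.asarray(scores, dtype=float))   # sign-agnostic importance
        span = scores.max() - scores.min()
        normalized.append((scores - scores.min()) / span if span > 0 else scores)
    return np.mean(normalized, axis=0)                      # one consensus score per feature

# Hypothetical usage: three methods scoring four input features
consensus = consensus_attribution({
    "shap":        [0.40, 0.10, 0.30, 0.05],
    "lime":        [0.35, 0.20, 0.25, 0.02],
    "permutation": [0.50, 0.05, 0.40, 0.01],
})
print(consensus.argsort()[::-1])   # feature indices ranked from most to least relevant

Other aggregation schemes (rank averaging, weighted voting) could be substituted; the point is only that per-method attributions are brought to a common scale before being combined.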
Related papers
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- EndToEndML: An Open-Source End-to-End Pipeline for Machine Learning Applications [0.2826977330147589]
We propose a web-based end-to-end pipeline that is capable of preprocessing, training, evaluating, and visualizing machine learning models.
Our library assists in recognizing, classifying, clustering, and predicting a wide range of multi-modal, multi-sensor datasets.
arXiv Detail & Related papers (2024-03-27T02:24:38Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z)
- Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML [0.7982607013768545]
Yet Another ICU Benchmark (YAIB) is a modular framework that allows researchers to define reproducible and comparable clinical ML experiments.
YAIB supports most open-access ICU datasets (MIMIC III/IV, eICU, HiRID, AUMCdb) and is easily adaptable to future ICU datasets.
We demonstrate that the choice of dataset, cohort definition, and preprocessing have a major impact on the prediction performance.
arXiv Detail & Related papers (2023-06-08T11:16:20Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
arXiv Detail & Related papers (2022-02-21T10:34:41Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction at the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- When will the mist clear? On the Interpretability of Machine Learning for Medical Applications: a survey [0.056212519098516295]
We analyse current machine learning models, frameworks, databases and other related tools as applied to medicine.
From the evidence available, ANN, LR and SVM have been observed to be the preferred models.
We discuss their interpretability, performance and the necessary input data.
arXiv Detail & Related papers (2020-10-01T12:42:06Z)
- PHOTONAI -- A Python API for Rapid Machine Learning Model Development [2.414341608751139]
PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development.
It functions as a unifying framework allowing the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences.
arXiv Detail & Related papers (2020-02-13T10:33:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.