PyHealth: A Python Library for Health Predictive Models
- URL: http://arxiv.org/abs/2101.04209v1
- Date: Mon, 11 Jan 2021 22:02:08 GMT
- Title: PyHealth: A Python Library for Health Predictive Models
- Authors: Yue Zhao, Zhi Qiao, Cao Xiao, Lucas Glass, Jimeng Sun
- Abstract summary: PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
- Score: 53.848478115284195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the explosion of interest in healthcare AI research, the
reproducibility and benchmarking of those research works are often limited due
to the lack of standard benchmark datasets and diverse evaluation metrics. To
address this reproducibility challenge, we develop PyHealth, an open-source
Python toolbox for developing various predictive models on healthcare data.
PyHealth consists of a data preprocessing module, a predictive modeling module,
and an evaluation module. The target users of PyHealth are both computer science
researchers and healthcare data scientists. With PyHealth, they can conduct
complex machine learning pipelines on healthcare datasets with fewer than ten
lines of code. The data preprocessing module enables the transformation of
complex healthcare datasets such as longitudinal electronic health records,
medical images, continuous signals (e.g., electrocardiogram), and clinical
notes into machine learning friendly formats. The predictive modeling module
provides more than 30 machine learning models, including established ensemble
trees and deep neural network-based approaches, via a unified but extendable
API designed for both researchers and practitioners. The evaluation module
provides various evaluation strategies (e.g., cross-validation and
train-validation-test split) and predictive model metrics.
With robustness and scalability in mind, best practices such as unit testing,
continuous integration, code coverage, and interactive examples are introduced
in the library's development. PyHealth can be installed through the Python
Package Index (PyPI) or https://github.com/yzhao062/PyHealth .
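The two evaluation strategies named in the abstract (train-validation-test split and cross-validation) can be sketched in plain Python. This is a minimal illustration that assumes only the standard library; it is not PyHealth's API, and the function names `train_val_test_split`, `k_fold_indices`, and `accuracy` are hypothetical helpers introduced here for clarity.

```python
import random


def train_val_test_split(n_samples, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into three disjoint index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, held_out) index lists, one pair per cross-validation fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    train, val, test = train_val_test_split(100)
    print(len(train), len(val), len(test))
    for fold_train, fold_held_out in k_fold_indices(10, k=5):
        print(len(fold_train), len(fold_held_out))
```

In practice a model would be fit on the `train` indices, tuned on `val`, and scored once on `test`; with cross-validation, the metric is averaged over the held-out folds.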
Related papers
- pyvene: A Library for Understanding and Improving PyTorch Models via
Interventions [79.72930339711478]
pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how pyvene provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
arXiv Detail & Related papers (2024-03-12T16:46:54Z)
- PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time
Series [0.0]
PyPOTS is an open-source Python library dedicated to data mining and analysis on partially-observed time series.
It provides easy access to diverse algorithms categorized into four tasks: imputation, classification, clustering, and forecasting.
arXiv Detail & Related papers (2023-05-30T07:57:05Z)
- TemporAI: Facilitating Machine Learning Innovation in Time Domain Tasks
for Medicine [91.3755431537592]
TemporAI is an open source Python software library for machine learning (ML) tasks involving data with a time component.
It supports data in time series, static, and event modalities and provides an interface for prediction, causal inference, and time-to-event analysis.
arXiv Detail & Related papers (2023-01-28T17:57:53Z)
- DeeProb-kit: a Python Library for Deep Probabilistic Modelling [0.0]
DeeProb-kit is a unified library written in Python consisting of a collection of deep probabilistic models (DPMs).
It includes efficiently implemented learning techniques, inference routines, statistical algorithms, and provides high-quality fully-documented APIs.
arXiv Detail & Related papers (2022-12-08T17:02:16Z)
- medigan: A Python Library of Pretrained Generative Models for Enriched
Data Access in Medical Imaging [3.8568465270960264]
medigan is a one-stop shop for pretrained generative models implemented as an open-source framework-agnostic Python library.
It allows researchers and developers to create, increase, and domain-adapt their training data in just a few lines of code.
The library's scalability and design are demonstrated by its growing number of integrated and readily usable pretrained generative models.
arXiv Detail & Related papers (2022-09-28T23:45:33Z)
- Latte: Cross-framework Python Package for Evaluation of Latent-Based
Generative Models [65.51757376525798]
Latte is a Python library for evaluation of latent-based generative models.
Latte is compatible with both PyTorch and Keras, and provides both functional and modular APIs.
arXiv Detail & Related papers (2021-12-20T16:00:28Z)
- Scikit-dimension: a Python package for intrinsic dimension estimation [58.8599521537]
This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation.
The scikit-dimension package provides a uniform implementation of most of the known ID estimators based on the scikit-learn application programming interface.
We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation in real-life and synthetic data.
arXiv Detail & Related papers (2021-09-06T16:46:38Z)
- pymia: A Python package for data handling and evaluation in deep
learning-based medical image analysis [0.9176056742068814]
pymia is an open-source Python package for data handling and evaluation in medical image analysis.
The package is highly flexible, allows for fast prototyping, and reduces the burden of implementing data handling routines.
pymia was successfully used in a variety of research projects for segmentation, reconstruction, and regression.
arXiv Detail & Related papers (2020-10-07T20:25:52Z)
- Biomedical and Clinical English Model Packages in the Stanza Python NLP
Library [47.47381610312517]
We introduce biomedical and clinical English model packages for the Stanza Python NLP library.
These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text.
We show via extensive experiments that our packages achieve syntactic analysis and named entity recognition performance that is on par with or surpasses state-of-the-art results.
arXiv Detail & Related papers (2020-07-29T07:27:41Z)