Learning to Adapt Clinical Sequences with Residual Mixture of Experts
- URL: http://arxiv.org/abs/2204.02687v1
- Date: Wed, 6 Apr 2022 09:23:12 GMT
- Title: Learning to Adapt Clinical Sequences with Residual Mixture of Experts
- Authors: Jeong Min Lee and Milos Hauskrecht
- Abstract summary: We propose a Mixture-of-Experts (MoE) architecture to represent complex dynamics of all patients.
The architecture consists of multiple (expert) RNN models covering patient sub-populations and refining the predictions of the base model.
We show a 4.1% gain in AUPRC compared to a single RNN prediction.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clinical event sequences in Electronic Health Records (EHRs) record detailed
information about the patient condition and patient care as they occur in time.
Recent years have witnessed increased interest from the machine learning community
in developing models that solve different types of problems defined upon the
information in EHRs. More recently, neural sequential models, such as RNNs and
LSTMs, have become popular and widely applied models for representing patient
sequence data and for predicting future events or outcomes based on such data.
However, a single neural sequential model may not properly represent the complex
dynamics of all patients and the differences in their behaviors. In this work,
we aim to alleviate this limitation by refining a one-fits-all model using a
Mixture-of-Experts (MoE) architecture. The architecture consists of multiple
(expert) RNN models that cover patient sub-populations and refine the
predictions of the base model. That is, instead of training the expert RNN models
from scratch, we define them on the residual signal that attempts to model the
differences from the population-wide model. The heterogeneity of patient
sequences is modeled through multiple experts, each consisting of an RNN.
In particular, instead of directly training the MoE from scratch, we augment the
MoE based on the prediction signal from a pretrained base GRU model. In this way,
the mixture of experts can provide flexible adaptation to the (limited)
predictive power of the single base RNN model. We experiment with the newly
proposed model on real-world EHR data and the multivariate clinical event
prediction task. We implement the RNNs using Gated Recurrent Units (GRUs). We show
a 4.1% gain in AUPRC compared to a single RNN prediction.
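The residual-MoE idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the pretrained base GRU is abstracted as a fixed linear head over the sequence encoder's final hidden state, each expert is reduced to a residual linear head, and all class names, dimensions, and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class ResidualMoE:
    """Hypothetical sketch of a residual Mixture-of-Experts:
    a pretrained base model produces logits for the next clinical
    events; each expert contributes a residual correction, and a
    gating network mixes the corrections per patient."""

    def __init__(self, hidden_dim, n_events, n_experts):
        # Stand-in for the pretrained base GRU's output head.
        self.W_base = rng.normal(0, 0.1, (hidden_dim, n_events))
        # One residual head per expert (modeling differences from the base).
        self.W_exp = rng.normal(0, 0.1, (n_experts, hidden_dim, n_events))
        # Gating network assigning per-patient expert weights.
        self.W_gate = rng.normal(0, 0.1, (hidden_dim, n_experts))

    def predict(self, h):
        """h: (batch, hidden_dim) final hidden state of the sequence encoder."""
        base_logits = h @ self.W_base                       # population-wide prediction
        residuals = np.einsum('bh,ehd->bed', h, self.W_exp)  # (batch, experts, events)
        gate = softmax(h @ self.W_gate, axis=-1)             # (batch, experts)
        correction = (gate[:, :, None] * residuals).sum(axis=1)
        # Sigmoid for multivariate (multi-label) event prediction.
        return sigmoid(base_logits + correction)
```

In this sketch only the expert and gating weights would be trained, with the base head frozen, mirroring the paper's strategy of refining a pretrained population-wide model rather than training the mixture from scratch.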
Related papers
- Physics-informed deep learning for infectious disease forecasting [2.938382489367582]
We propose a new infectious disease forecasting model based on physics-informed neural networks (PINNs)
The proposed PINN model incorporates dynamical systems representations of disease transmission into the loss function.
Predictions of PINN model on the number of cases, deaths, and hospitalizations are consistent with existing benchmarks.
arXiv Detail & Related papers (2025-01-16T05:07:05Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z)
- Satellite Anomaly Detection Using Variance Based Genetic Ensemble of Neural Networks [7.848121055546167]
We use an efficient ensemble of the predictions from multiple Recurrent Neural Networks (RNNs).
Each RNN is guided by a Genetic Algorithm (GA), which constructs the optimal structure for that RNN model.
This paper uses Monte Carlo (MC) dropout as an approximation of Bayesian neural networks (BNNs).
arXiv Detail & Related papers (2023-02-10T22:09:00Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks, EINNs, crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Lower Bounds for ME-NODE and develop efficient training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Simple Recurrent Neural Networks is all we need for clinical events predictions using EHR data [22.81278657120305]
Recurrent neural networks (RNNs) are a common architecture for EHR-based clinical event prediction models.
We used two prediction tasks: the risk for developing heart failure and the risk of early readmission for inpatient hospitalization.
We found that simple gated RNN models, including GRUs and LSTMs, often offer competitive results when properly tuned with Bayesian Optimization.
arXiv Detail & Related papers (2021-10-03T13:07:23Z)
- Interpretable Additive Recurrent Neural Networks For Multivariate Clinical Time Series [4.125698836261585]
We present the Interpretable-RNN (I-RNN) that balances model complexity and accuracy by forcing the relationship between variables in the model to be additive.
I-RNN specifically captures the unique characteristics of clinical time series, which are unevenly sampled in time, asynchronously acquired, and have missing data.
We evaluate the I-RNN model on the Physionet 2012 Challenge dataset to predict in-hospital mortality, and on a real-world clinical decision support task: predicting hemodynamic interventions in the intensive care unit.
arXiv Detail & Related papers (2021-09-15T22:30:19Z)
- Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.