Learning to Adapt Clinical Sequences with Residual Mixture of Experts
- URL: http://arxiv.org/abs/2204.02687v1
- Date: Wed, 6 Apr 2022 09:23:12 GMT
- Title: Learning to Adapt Clinical Sequences with Residual Mixture of Experts
- Authors: Jeong Min Lee and Milos Hauskrecht
- Abstract summary: We propose a Mixture-of-Experts (MoE) architecture to represent complex dynamics of all patients.
The architecture consists of multiple (expert) RNN models covering patient sub-populations and refining the predictions of the base model.
We show a 4.1% gain in AUPRC compared to a single RNN prediction.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clinical event sequences in Electronic Health Records (EHRs) record detailed
information about the patient condition and patient care as they occur in time.
Recent years have witnessed increased interest from the machine learning community
in developing models that solve different types of problems defined upon the
information in EHRs. More recently, neural sequential models, such as RNNs and
LSTMs, have become popular and widely applied models for representing patient
sequence data and for predicting future events or outcomes based on such data.
However, a single neural sequential model may not properly represent the complex
dynamics of all patients and the differences in their behaviors. In this work,
we aim to alleviate this limitation by refining a one-fits-all model using a
Mixture-of-Experts (MoE) architecture. The architecture consists of multiple
(expert) RNN models that cover patient sub-populations and refine the
predictions of the base model. That is, instead of training the expert RNN models
from scratch, we define them on the residual signal that attempts to model the
differences from the population-wide model. The heterogeneity of patient
sequences is modeled through multiple experts, each consisting of an RNN.
In particular, instead of directly training the MoE from scratch, we augment the
MoE based on the prediction signal from a pretrained base GRU model. In this way,
the mixture of experts can provide flexible adaptation to the (limited)
predictive power of the single base RNN model. We experiment with the newly
proposed model on real-world EHR data and the multivariate clinical event
prediction task. We implement the RNNs using Gated Recurrent Units (GRUs). We show
a 4.1% gain in AUPRC compared to a single RNN prediction.
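The residual-MoE idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the pretrained base GRU is abstracted as a fixed linear head over the sequence encoder's final hidden state, each expert is reduced to a residual linear head, and all class names, dimensions, and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class ResidualMoE:
    """Hypothetical sketch of a residual Mixture-of-Experts:
    a pretrained base model produces logits for the next clinical
    events; each expert contributes a residual correction, and a
    gating network mixes the corrections per patient."""

    def __init__(self, hidden_dim, n_events, n_experts):
        # Stand-in for the pretrained base GRU's output head.
        self.W_base = rng.normal(0, 0.1, (hidden_dim, n_events))
        # One residual head per expert (modeling differences from the base).
        self.W_exp = rng.normal(0, 0.1, (n_experts, hidden_dim, n_events))
        # Gating network assigning per-patient expert weights.
        self.W_gate = rng.normal(0, 0.1, (hidden_dim, n_experts))

    def predict(self, h):
        """h: (batch, hidden_dim) final hidden state of the sequence encoder."""
        base_logits = h @ self.W_base                       # population-wide prediction
        residuals = np.einsum('bh,ehd->bed', h, self.W_exp)  # (batch, experts, events)
        gate = softmax(h @ self.W_gate, axis=-1)             # (batch, experts)
        correction = (gate[:, :, None] * residuals).sum(axis=1)
        # Sigmoid for multivariate (multi-label) event prediction.
        return sigmoid(base_logits + correction)
```

In this sketch only the expert and gating weights would be trained, with the base head frozen, mirroring the paper's strategy of refining a pretrained population-wide model rather than training the mixture from scratch.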
Related papers
- Physics-informed deep learning for infectious disease forecasting [2.938382489367582]
We propose a new infectious disease forecasting model based on physics-informed neural networks (PINNs)
The proposed PINN model incorporates dynamical systems representations of disease transmission into the loss function.
Predictions of PINN model on the number of cases, deaths, and hospitalizations are consistent with existing benchmarks.
arXiv Detail & Related papers (2025-01-16T05:07:05Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z)
- Satellite Anomaly Detection Using Variance Based Genetic Ensemble of Neural Networks [7.848121055546167]
We use an efficient ensemble of the predictions from multiple Recurrent Neural Networks (RNNs).
Each RNN is guided by a Genetic Algorithm (GA), which constructs the optimal structure for that RNN model.
This paper uses Monte Carlo (MC) dropout as an approximation of Bayesian neural networks (BNNs).
arXiv Detail & Related papers (2023-02-10T22:09:00Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks, EINNs, crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Lower Bounds for ME-NODE and develop efficient training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Simple Recurrent Neural Networks is all we need for clinical events predictions using EHR data [22.81278657120305]
Recurrent neural networks (RNNs) are a common architecture for EHR-based clinical event prediction models.
We used two prediction tasks: the risk for developing heart failure and the risk of early readmission for inpatient hospitalization.
We found that simple gated RNN models, including GRUs and LSTMs, often offer competitive results when properly tuned with Bayesian Optimization.
arXiv Detail & Related papers (2021-10-03T13:07:23Z)
- Interpretable Additive Recurrent Neural Networks For Multivariate Clinical Time Series [4.125698836261585]
We present the Interpretable-RNN (I-RNN) that balances model complexity and accuracy by forcing the relationship between variables in the model to be additive.
I-RNN specifically captures the unique characteristics of clinical time series, which are unevenly sampled in time, asynchronously acquired, and have missing data.
We evaluate the I-RNN model on the Physionet 2012 Challenge dataset to predict in-hospital mortality, and on a real-world clinical decision support task: predicting hemodynamic interventions in the intensive care unit.
arXiv Detail & Related papers (2021-09-15T22:30:19Z)
- Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.