Hierarchical Transformer Networks for Longitudinal Clinical Document
Classification
- URL: http://arxiv.org/abs/2104.08444v1
- Date: Sat, 17 Apr 2021 04:45:52 GMT
- Title: Hierarchical Transformer Networks for Longitudinal Clinical Document
Classification
- Authors: Yuqi Si and Kirk Roberts
- Abstract summary: The network is equipped with three levels of Transformer-based encoders to learn progressively from words to sentences, sentences to notes, and finally notes to patients.
Compared to traditional BERT models, our model increases the maximum input length from 512 words to much longer sequences appropriate for clinical notes.
Our experimental results on the MIMIC-III dataset for different prediction tasks demonstrate that our proposed hierarchical model outperforms previous state-of-the-art hierarchical neural networks.
- Score: 5.670490259188555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the Hierarchical Transformer Networks for modeling long-term
dependencies across clinical notes for the purpose of patient-level prediction.
The network is equipped with three levels of Transformer-based encoders to
learn progressively from words to sentences, sentences to notes, and finally
notes to patients. The first level from word to sentence directly applies a
pre-trained BERT model, and the second and third levels both implement a stack
of 2-layer encoders before the final patient representation is fed into the
classification layer for clinical predictions. Compared to traditional BERT
models, our model increases the maximum input length from 512 words to much
longer sequences appropriate for clinical notes. We
empirically examine and experiment with different parameters to identify an
optimal trade-off given computational resource limits. Our experimental results
on the MIMIC-III dataset for different prediction tasks demonstrate that our
proposed hierarchical model outperforms previous state-of-the-art hierarchical
neural networks.
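The three-level hierarchy described above can be sketched with a toy NumPy example. This is a minimal illustration, not the paper's implementation: it uses single-head attention without learned projections, mean pooling between levels, and random vectors standing in for the pre-trained BERT sentence embeddings of the first level.

```python
import numpy as np

def self_attention(x):
    # Scaled dot-product self-attention (single head, no learned
    # projections -- a simplification of a Transformer encoder layer).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def encode_level(items):
    # Contextualize a list of lower-level vectors against each other,
    # then pool them into a single higher-level representation.
    x = np.stack(items)          # (n_items, d)
    h = self_attention(x)
    return h.mean(axis=0)        # (d,)

rng = np.random.default_rng(0)
d = 16
# Toy patient: 3 notes, each with 2 sentences; in the paper the sentence
# vectors come from a pre-trained BERT, here they are random placeholders.
patient_notes = [[rng.normal(size=d) for _ in range(2)] for _ in range(3)]

note_vecs = [encode_level(sents) for sents in patient_notes]  # sentences -> note
patient_vec = encode_level(note_vecs)                         # notes -> patient

# Final patient representation fed into a (random, untrained) classifier.
logits = patient_vec @ rng.normal(size=(d, 2))
```

Because each level only ever attends over a handful of already-pooled vectors, the effective input length grows multiplicatively across levels instead of being capped at a single encoder's 512-token window.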
Related papers
- Early Prediction of Causes (not Effects) in Healthcare by Long-Term Clinical Time Series Forecasting [11.96384267146423]
We propose to directly predict the causes via time series forecasting (TSF) of clinical variables.
Because model training no longer relies on a particular label, the forecasted data can be used to predict any consensus-based label.
arXiv Detail & Related papers (2024-08-07T14:52:06Z)
- Predicting Infant Brain Connectivity with Federated Multi-Trajectory GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate local learnings among diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z)
- On Preserving the Knowledge of Long Clinical Texts [0.0]
A bottleneck in using transformer encoders for processing clinical texts comes from the input length limit of these models.
This paper proposes a novel method to preserve the knowledge of long clinical texts in the models using aggregated ensembles of transformer encoders.
arXiv Detail & Related papers (2023-11-02T19:50:02Z)
- Predicting Transcription Factor Binding Sites using Transformer based Capsule Network [0.8793721044482612]
Prediction of binding sites for transcription factors is important to understand how they regulate gene expression and how this regulation can be modulated for therapeutic purposes.
DNABERT-Cap is a bidirectional encoder pre-trained on a large number of genomic DNA sequences, empowered with a capsule layer responsible for the final prediction.
DNABERT-Cap is also compared with existing state-of-the-art deep learning based predictors viz. DeepARC, DeepTF, CNN-Zeng and DeepBind, and is seen to outperform them.
arXiv Detail & Related papers (2023-10-23T09:08:57Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding [10.387366211090734]
We propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
We evaluate HiLAT using hospital discharge summaries and their corresponding ICD-9 codes from the MIMIC-III database.
Visualisations of attention weights present a potential explainability tool for checking the face validity of ICD code predictions.
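The label-wise attention idea behind HiLAT can be illustrated with a minimal NumPy sketch: each ICD label gets its own query vector that attends over the document's token representations, yielding a label-specific document vector whose attention weights can be visualized. The query matrix `U`, the dot-product scoring, and the final per-label score are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tokens, d, n_labels = 10, 8, 4

H = rng.normal(size=(n_tokens, d))   # token representations from an encoder
U = rng.normal(size=(n_labels, d))   # one (learnable) query vector per label

# Each label attends over all tokens with its own softmax distribution.
scores = U @ H.T                     # (n_labels, n_tokens)
scores -= scores.max(axis=-1, keepdims=True)
A = np.exp(scores)
A /= A.sum(axis=-1, keepdims=True)   # rows of A are the per-label attention

V = A @ H                            # (n_labels, d) label-specific doc vectors
logits = (V * U).sum(axis=-1)        # illustrative per-label prediction score
```

The rows of `A` are exactly the per-label attention weights whose visualization supports the face-validity checks mentioned in the abstract: each row shows which tokens drove that label's prediction.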
arXiv Detail & Related papers (2022-04-22T14:12:22Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Interpreting Deep Learning Models for Epileptic Seizure Detection on EEG signals [4.748221780751802]
Deep Learning (DL) is often considered the state of the art for Artificial Intelligence-based medical decision support.
It nevertheless remains sparsely implemented in clinical practice and poorly trusted by clinicians due to the insufficient interpretability of neural network models.
We have tackled this issue by developing interpretable DL models in the context of online detection of epileptic seizures, based on EEG signals.
arXiv Detail & Related papers (2020-12-22T11:10:23Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
- Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.