Comparison of deep learning and conventional methods for disease onset prediction
- URL: http://arxiv.org/abs/2410.10505v1
- Date: Mon, 14 Oct 2024 13:46:59 GMT
- Title: Comparison of deep learning and conventional methods for disease onset prediction
- Authors: Luis H. John, Chungsoo Kim, Jan A. Kors, Junhyuk Chang, Hannah Morgan-Cooper, Priya Desai, Chao Pang, Peter R. Rijnbeek, Jenna M. Reps, Egill A. Fridgeirsson
- Abstract summary: Deep learning methods promise enhanced prediction performance by extracting complex patterns from clinical data.
This study compares conventional and deep learning approaches to predict lung cancer, dementia, and bipolar disorder.
- Score: 7.477956812298417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Background: Conventional prediction methods such as logistic regression and gradient boosting have been widely utilized for disease onset prediction for their reliability and interpretability. Deep learning methods promise enhanced prediction performance by extracting complex patterns from clinical data, but face challenges like data sparsity and high dimensionality. Methods: This study compares conventional and deep learning approaches to predict lung cancer, dementia, and bipolar disorder using observational data from eleven databases from North America, Europe, and Asia. Models were developed using logistic regression, gradient boosting, ResNet, and Transformer, and validated both internally and externally across the data sources. Discrimination performance was assessed using AUROC, and calibration was evaluated using Eavg. Findings: Across 11 datasets, conventional methods generally outperformed deep learning methods in terms of discrimination performance, particularly during external validation, highlighting their better transportability. Learning curves suggest that deep learning models require substantially larger datasets to reach the same performance levels as conventional methods. Calibration performance was also better for conventional methods, with ResNet showing the poorest calibration. Interpretation: Despite the potential of deep learning models to capture complex patterns in structured observational healthcare data, conventional models remain highly competitive for disease onset prediction, especially when datasets are smaller and when lengthy training times need to be avoided. The study underscores the need for future research focused on optimizing deep learning models to handle the sparsity, high dimensionality, and heterogeneity inherent in healthcare datasets, and on finding new strategies to exploit the full capabilities of deep learning methods.
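The abstract names two evaluation metrics: AUROC for discrimination and Eavg for calibration. As a rough illustration only, the sketch below computes AUROC by pairwise comparison and a binned approximation of the average absolute calibration error. The function names `auroc` and `binned_eavg` and the quantile-binning scheme are assumptions made here for illustration. Eavg is commonly computed from a smoothed calibration curve, so the binning is a simpler stand-in and not the study's implementation.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binned_eavg(y_true, y_score, n_bins=10):
    """Hypothetical binned stand-in for Eavg: the sample-weighted mean of
    |observed event rate - mean predicted risk| over equal-size bins
    of patients sorted by predicted risk."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    if n == 0:
        raise ValueError("empty input")
    total = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than samples
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        total += len(chunk) * abs(obs_rate - mean_pred)
    return total / n


# Toy labels and predicted risks (made up, not from the study).
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, risks))
print(binned_eavg(labels, risks, n_bins=2))
```

A higher AUROC indicates better discrimination, while a lower calibration error indicates that predicted risks track observed event rates more closely.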
Related papers
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Comparative Study of Predicting Stock Index Using Deep Learning Models [0.0]
This study evaluates traditional forecasting methods, such as ARIMA, SARIMA, and SARIMAX, and newer neural network approaches, such as DF-RNN, DSSM, and Deep AR.
Results show that Deep AR outperformed all other conventional deep learning and traditional approaches, with the lowest MAPE of 0.01 and RMSE of 189.
arXiv Detail & Related papers (2023-06-24T10:38:08Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
- Augmentation based unsupervised domain adaptation [2.304713283039168]
Deep learning models trained on small and unrepresentative data tend to underperform when deployed on data that differs from the data used for training.
Our approach takes advantage of the properties of adversarial domain adaptation and consistency training to achieve more robust adaptation.
arXiv Detail & Related papers (2022-02-23T13:06:07Z)
- A comparison of approaches to improve worst-case predictive model performance over patient subpopulations [14.175321968797252]
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations.
We identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations.
We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures.
arXiv Detail & Related papers (2021-08-27T13:10:00Z)
- Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that our latent adversarial perturbations adaptive to the classifier throughout its training are most effective.
arXiv Detail & Related papers (2021-08-18T03:20:00Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Unsupervised anomaly detection in digital pathology using GANs [4.318555434063274]
We propose a new unsupervised learning approach for anomaly detection in histopathology data based on generative adversarial networks (GANs).
Compared to the existing GAN-based methods that have been used in medical imaging, the proposed approach improves significantly on performance for pathology data.
arXiv Detail & Related papers (2021-03-16T10:10:12Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and gradients thereof, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.