Empirical Study of Overfitting in Deep FNN Prediction Models for Breast
Cancer Metastasis
- URL: http://arxiv.org/abs/2208.02150v1
- Date: Wed, 3 Aug 2022 15:36:12 GMT
- Title: Empirical Study of Overfitting in Deep FNN Prediction Models for Breast
Cancer Metastasis
- Authors: Chuhan Xu, Pablo Coen-Pirani, Xia Jiang
- Abstract summary: Overfitting occurs when a model fits a specific data set too closely, weakening its generalization.
In this research we used an EHR dataset concerning breast cancer metastasis to study overfitting of deep feedforward neural network (FNN) prediction models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Overfitting is defined as a model fitting a specific data set too
closely, which weakens generalization and may ultimately reduce the accuracy
of predictions on future data. In this research we used an EHR dataset
concerning breast cancer metastasis to study overfitting of deep feedforward
neural network (FNN) prediction models. We included 11 hyperparameters of the
deep FNN models and took an empirical approach to study how each of these
hyperparameters affects both prediction performance and overfitting across a
large range of values. We also studied how selected pairs of hyperparameters
interact to influence model performance and overfitting. The 11
hyperparameters we studied are activation function, weight initializer,
number of hidden layers, learning rate, momentum, decay, dropout rate, batch
size, epochs, L1, and L2. Our results show that most of the single
hyperparameters are either negatively or positively correlated with model
prediction performance and overfitting. In particular, we found that
overfitting overall tends to correlate negatively with learning rate, decay,
batch size, and L2, but tends to correlate positively with momentum, epochs,
and L1. According to our results, learning rate, decay, and batch size may
have a more significant impact on both overfitting and prediction performance
than most of the other hyperparameters, including L1, L2, and dropout rate,
which were designed to minimize overfitting. We also found interesting
interacting pairs of hyperparameters, such as learning rate and momentum,
learning rate and decay, and batch size and epochs.
Keywords: Deep learning, overfitting, prediction, grid search, feedforward
neural networks, breast cancer metastasis.
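The paper reports a grid-search study rather than code, but the setup is concrete enough to sketch. Below is a minimal Python/Keras illustration, assuming a binary metastasis label and an AUC-based score: a deep FNN parameterized by the 11 hyperparameters above, with overfitting scored as the train/validation performance gap. The layer width, default values, and the gap metric are our assumptions, not the authors' implementation.

```python
# Hypothetical Python/Keras sketch, not the authors' code: a deep FNN
# exposing the 11 hyperparameters studied in the paper.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_fnn(n_features, activation="relu", initializer="glorot_uniform",
              n_hidden_layers=2, units=32, learning_rate=0.01, momentum=0.9,
              decay=1e-4, dropout_rate=0.2, l1=0.0, l2=0.0):
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(n_hidden_layers):
        model.add(layers.Dense(
            units, activation=activation, kernel_initializer=initializer,
            kernel_regularizer=regularizers.l1_l2(l1=l1, l2=l2)))
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(1, activation="sigmoid"))  # metastasis yes/no
    # InverseTimeDecay approximates the legacy SGD `decay` behavior.
    lr = keras.optimizers.schedules.InverseTimeDecay(
        initial_learning_rate=learning_rate, decay_steps=1, decay_rate=decay)
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=lr, momentum=momentum),
        loss="binary_crossentropy",
        metrics=[keras.metrics.AUC(name="auc")])
    return model

def overfitting_gap(model, X_tr, y_tr, X_va, y_va, epochs=100, batch_size=32):
    """Score overfitting as the gap between training and validation AUC."""
    model.fit(X_tr, y_tr, epochs=epochs, batch_size=batch_size, verbose=0)
    auc_tr = model.evaluate(X_tr, y_tr, verbose=0)[1]
    auc_va = model.evaluate(X_va, y_va, verbose=0)[1]
    return auc_tr - auc_va  # larger gap = more overfitting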
Related papers
- Kolmogorov-Arnold Networks in Low-Data Regimes: A Comparative Study with Multilayer Perceptrons [2.77390041716769]
Kolmogorov-Arnold Networks (KANs) use highly flexible learnable activation functions directly on network edges.
KANs significantly increase the number of learnable parameters, raising concerns about their effectiveness in data-scarce environments.
We show that individualized activation functions achieve significantly higher predictive accuracy with only a modest increase in parameters.
arXiv Detail & Related papers (2024-09-16T16:56:08Z)
- Deep Learning to Predict Late-Onset Breast Cancer Metastasis: the Single Hyperparameter Grid Search (SHGS) Strategy for Meta Tuning Concerning Deep Feed-forward Neural Network [7.332652485849632]
We have been working to construct a DFNN model that predicts breast cancer metastasis n years in advance.
The challenge lies in efficiently identifying optimal hyperparameter values through grid search, given the constraints of time and resources.
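The SHGS details are in the linked paper; one plausible reading, sweeping a single hyperparameter while holding the others at fixed baseline values, can be sketched by reusing the hypothetical build_fnn() and overfitting_gap() helpers from the sketch above (all baseline values here are assumptions):

```python
# Hypothetical single-hyperparameter sweep; not the paper's SHGS code.
baseline = dict(activation="relu", n_hidden_layers=2, learning_rate=0.01,
                momentum=0.9, decay=1e-4, dropout_rate=0.2, l1=0.0, l2=0.0)

def single_hyperparameter_sweep(name, values, X_tr, y_tr, X_va, y_va):
    gaps = {}
    for v in values:
        params = {**baseline, name: v}  # vary exactly one hyperparameter
        model = build_fnn(X_tr.shape[1], **params)
        gaps[v] = overfitting_gap(model, X_tr, y_tr, X_va, y_va)
    return gaps  # hyperparameter value -> train/validation AUC gap

# e.g.: single_hyperparameter_sweep("learning_rate",
#                                   [1e-4, 1e-3, 1e-2, 1e-1], ...)
```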
arXiv Detail & Related papers (2024-08-28T03:00:43Z)
- Predicting Multi-Joint Kinematics of the Upper Limb from EMG Signals Across Varied Loads with a Physics-Informed Neural Network [0.0]
The PINN model is constructed by combining a feed-forward Artificial Neural Network (ANN) with a joint torque model.
The training dataset for the PINN model comprises EMG and time data collected from four different subjects.
The results demonstrated strong correlations of 58% to 83% in joint angle prediction.
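The joint torque model itself is specified in the paper, not here; a heavily hedged sketch of the generic PINN ingredient the summary describes, a loss mixing a data-fit term with a physics residual, might look like this (the weighting and tensor shapes are assumptions):

```python
# Generic PINN-style composite loss; the paper's actual joint torque model
# is abstracted away as a residual the caller supplies.
import tensorflow as tf

def pinn_loss(angle_pred, angle_true, torque_residual, physics_weight=1.0):
    data_loss = tf.reduce_mean(tf.square(angle_pred - angle_true))
    physics_loss = tf.reduce_mean(tf.square(torque_residual))
    return data_loss + physics_weight * physics_loss
```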
arXiv Detail & Related papers (2023-11-28T16:55:11Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models (three data augmentation strategies crossed with five distinct 3D CNN architectures).
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that dataset size, model parameters, and training objectives are all significant.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Improving Prediction of Cognitive Performance using Deep Neural Networks in Sparse Data [2.867517731896504]
We used data from an observational cohort study, Midlife in the United States (MIDUS), to model executive function and episodic memory measures.
Deep neural network (DNN) models consistently ranked highest in all of the cognitive performance prediction tasks.
arXiv Detail & Related papers (2021-12-28T22:23:08Z)
- When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
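The two positive sampling strategies are specific to the paper and not described here; purely as a generic illustration, a contrastive regularizer of the InfoNCE form added to a classification loss could look like the following sketch (temperature, weighting, and embedding handling are assumptions):

```python
# Generic contrastive-regularized classification loss; the paper's two
# EHR-specific positive sampling strategies are not shown here.
import tensorflow as tf

def contrastive_regularized_loss(logits, labels, z_anchor, z_positive,
                                 temperature=0.1, reg_weight=0.5):
    # Supervised term (binary outcome, e.g. mortality risk).
    clf = tf.keras.losses.binary_crossentropy(labels, tf.sigmoid(logits))
    # InfoNCE term: each anchor embedding should match its own positive.
    z_a = tf.math.l2_normalize(z_anchor, axis=1)
    z_p = tf.math.l2_normalize(z_positive, axis=1)
    sim = tf.matmul(z_a, z_p, transpose_b=True) / temperature  # (B, B)
    targets = tf.range(tf.shape(sim)[0])  # diagonal entries are positives
    nce = tf.keras.losses.sparse_categorical_crossentropy(
        targets, sim, from_logits=True)
    return tf.reduce_mean(clf) + reg_weight * tf.reduce_mean(nce)
```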
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Learning Curves for Drug Response Prediction in Cancer Cell Lines [29.107984441845673]
We evaluate the data scaling properties of two neural networks (NNs) and two gradient boosting decision tree (GBDT) models trained on four drug screening datasets.
The learning curves are accurately fitted to a power law model, providing a framework for assessing the data scaling behavior of these predictors.
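A power-law learning-curve fit of this kind can be reproduced with a standard least-squares routine. A minimal sketch, assuming the common form error(n) = a * n^(-b) + c and purely illustrative numbers (the paper's exact parameterization and data are not shown here):

```python
# Minimal power-law learning-curve fit; data points are illustrative,
# not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # prediction error decays as a power of training-set size n, floored at c
    return a * np.power(n, -b) + c

n_train = np.array([500, 1000, 2000, 4000, 8000, 16000])  # illustrative
error = np.array([0.42, 0.36, 0.31, 0.27, 0.24, 0.22])    # illustrative

(a, b, c), _ = curve_fit(power_law, n_train, error, p0=(1.0, 0.3, 0.1))
print(f"error(n) ~ {a:.2f} * n^(-{b:.2f}) + {c:.2f}")
```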
arXiv Detail & Related papers (2020-11-25T01:08:05Z)
- Assessing Graph-based Deep Learning Models for Predicting Flash Point [52.931492216239995]
Graph-based deep learning (GBDL) models were applied to flash point prediction for the first time.
Average R2 and Mean Absolute Error (MAE) scores of the message-passing neural network (MPNN) are, respectively, 2.3% lower and 2.0 K higher than those of previous comparable studies.
arXiv Detail & Related papers (2020-02-26T06:10:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.