Empirical Study of Overfitting in Deep FNN Prediction Models for Breast
Cancer Metastasis
- URL: http://arxiv.org/abs/2208.02150v1
- Date: Wed, 3 Aug 2022 15:36:12 GMT
- Title: Empirical Study of Overfitting in Deep FNN Prediction Models for Breast
Cancer Metastasis
- Authors: Chuhan Xu, Pablo Coen-Pirani, Xia Jiang
- Abstract summary: Overfitting occurs when a model fits a specific data set too closely, weakening its generalization.
In this research we used an EHR dataset concerning breast cancer metastasis to study overfitting of deep feedforward neural network (FNN) prediction models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Overfitting is defined as a model fitting a specific data set too
closely, which weakens generalization and may ultimately reduce the accuracy
of predictions on future data. In this research we used an EHR dataset
concerning breast cancer metastasis to study overfitting of deep feedforward
neural network (FNN) prediction models. We included 11 hyperparameters of the
deep FNN models and took an empirical approach to study how each of these
hyperparameters affects both prediction performance and overfitting across a
large range of values. We also studied how selected pairs of hyperparameters
interact to influence model performance and overfitting. The 11
hyperparameters we studied are activation function, weight initializer,
number of hidden layers, learning rate, momentum, decay, dropout rate, batch
size, epochs, L1, and L2. Our results show that most of the single
hyperparameters are either negatively or positively correlated with model
prediction performance and overfitting. In particular, we found that
overfitting overall tends to correlate negatively with learning rate, decay,
batch size, and L2, but tends to correlate positively with momentum, epochs,
and L1. According to our results, learning rate, decay, and batch size may
have a more significant impact on both overfitting and prediction performance
than most of the other hyperparameters, including L1, L2, and dropout rate,
which were designed to minimize overfitting. We also found interesting
interacting pairs of hyperparameters, such as learning rate and momentum,
learning rate and decay, and batch size and epochs.
Keywords: Deep learning, overfitting, prediction, grid search, feedforward
neural networks, breast cancer metastasis.
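The paper reports a grid-search study rather than code, but the setup is concrete enough to sketch. Below is a minimal Python/Keras illustration, assuming a binary metastasis label and an AUC-based score: a deep FNN parameterized by the 11 hyperparameters above, with overfitting scored as the train/validation performance gap. The layer width, default values, and the gap metric are our assumptions, not the authors' implementation.

```python
# Hypothetical Python/Keras sketch, not the authors' code: a deep FNN
# exposing the 11 hyperparameters studied in the paper.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_fnn(n_features, activation="relu", initializer="glorot_uniform",
              n_hidden_layers=2, units=32, learning_rate=0.01, momentum=0.9,
              decay=1e-4, dropout_rate=0.2, l1=0.0, l2=0.0):
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(n_hidden_layers):
        model.add(layers.Dense(
            units, activation=activation, kernel_initializer=initializer,
            kernel_regularizer=regularizers.l1_l2(l1=l1, l2=l2)))
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(1, activation="sigmoid"))  # metastasis yes/no
    # InverseTimeDecay approximates the legacy SGD `decay` behavior.
    lr = keras.optimizers.schedules.InverseTimeDecay(
        initial_learning_rate=learning_rate, decay_steps=1, decay_rate=decay)
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=lr, momentum=momentum),
        loss="binary_crossentropy",
        metrics=[keras.metrics.AUC(name="auc")])
    return model

def overfitting_gap(model, X_tr, y_tr, X_va, y_va, epochs=100, batch_size=32):
    """Score overfitting as the gap between training and validation AUC."""
    model.fit(X_tr, y_tr, epochs=epochs, batch_size=batch_size, verbose=0)
    auc_tr = model.evaluate(X_tr, y_tr, verbose=0)[1]
    auc_va = model.evaluate(X_va, y_va, verbose=0)[1]
    return auc_tr - auc_va  # larger gap = more overfitting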
Related papers
- Kolmogorov-Arnold Networks in Low-Data Regimes: A Comparative Study with Multilayer Perceptrons [2.77390041716769]
Kolmogorov-Arnold Networks (KANs) use highly flexible learnable activation functions directly on network edges.
KANs significantly increase the number of learnable parameters, raising concerns about their effectiveness in data-scarce environments.
We show that individualized activation functions achieve significantly higher predictive accuracy with only a modest increase in parameters.
arXiv Detail & Related papers (2024-09-16T16:56:08Z)
- Deep Learning to Predict Late-Onset Breast Cancer Metastasis: the Single Hyperparameter Grid Search (SHGS) Strategy for Meta Tuning Concerning Deep Feed-forward Neural Network [7.332652485849632]
We have been working to construct a DFNN model that predicts breast cancer metastasis n years in advance.
The challenge lies in efficiently identifying optimal hyperparameter values through grid search, given the constraints of time and resources.
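The SHGS details are in the linked paper; one plausible reading, sweeping a single hyperparameter while holding the others at fixed baseline values, can be sketched by reusing the hypothetical build_fnn() and overfitting_gap() helpers from the sketch above (all baseline values here are assumptions):

```python
# Hypothetical single-hyperparameter sweep; not the paper's SHGS code.
baseline = dict(activation="relu", n_hidden_layers=2, learning_rate=0.01,
                momentum=0.9, decay=1e-4, dropout_rate=0.2, l1=0.0, l2=0.0)

def single_hyperparameter_sweep(name, values, X_tr, y_tr, X_va, y_va):
    gaps = {}
    for v in values:
        params = {**baseline, name: v}  # vary exactly one hyperparameter
        model = build_fnn(X_tr.shape[1], **params)
        gaps[v] = overfitting_gap(model, X_tr, y_tr, X_va, y_va)
    return gaps  # hyperparameter value -> train/validation AUC gap

# e.g.: single_hyperparameter_sweep("learning_rate",
#                                   [1e-4, 1e-3, 1e-2, 1e-1], ...)
```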
arXiv Detail & Related papers (2024-08-28T03:00:43Z)
- Predicting Multi-Joint Kinematics of the Upper Limb from EMG Signals Across Varied Loads with a Physics-Informed Neural Network [0.0]
The PINN model is constructed by combining a feed-forward Artificial Neural Network (ANN) with a joint torque model.
The training dataset for the PINN model comprises EMG and time data collected from four different subjects.
The results demonstrated strong correlations of 58% to 83% in joint angle prediction.
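The joint torque model itself is specified in the paper, not here; a heavily hedged sketch of the generic PINN ingredient the summary describes, a loss mixing a data-fit term with a physics residual, might look like this (the weighting and tensor shapes are assumptions):

```python
# Generic PINN-style composite loss; the paper's actual joint torque model
# is abstracted away as a residual the caller supplies.
import tensorflow as tf

def pinn_loss(angle_pred, angle_true, torque_residual, physics_weight=1.0):
    data_loss = tf.reduce_mean(tf.square(angle_pred - angle_true))
    physics_loss = tf.reduce_mean(tf.square(torque_residual))
    return data_loss + physics_weight * physics_loss
```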
arXiv Detail & Related papers (2023-11-28T16:55:11Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models (three data augmentation strategies crossed with five distinct 3D CNN architectures).
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that dataset size, model parameters, and training objectives are all significant.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Improving Prediction of Cognitive Performance using Deep Neural Networks in Sparse Data [2.867517731896504]
We used data from an observational cohort study, Midlife in the United States (MIDUS), to model executive function and episodic memory measures.
Deep neural network (DNN) models consistently ranked highest in all of the cognitive performance prediction tasks.
arXiv Detail & Related papers (2021-12-28T22:23:08Z)
- When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
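The two positive sampling strategies are specific to the paper and not described here; purely as a generic illustration, a contrastive regularizer of the InfoNCE form added to a classification loss could look like the following sketch (temperature, weighting, and embedding handling are assumptions):

```python
# Generic contrastive-regularized classification loss; the paper's two
# EHR-specific positive sampling strategies are not shown here.
import tensorflow as tf

def contrastive_regularized_loss(logits, labels, z_anchor, z_positive,
                                 temperature=0.1, reg_weight=0.5):
    # Supervised term (binary outcome, e.g. mortality risk).
    clf = tf.keras.losses.binary_crossentropy(labels, tf.sigmoid(logits))
    # InfoNCE term: each anchor embedding should match its own positive.
    z_a = tf.math.l2_normalize(z_anchor, axis=1)
    z_p = tf.math.l2_normalize(z_positive, axis=1)
    sim = tf.matmul(z_a, z_p, transpose_b=True) / temperature  # (B, B)
    targets = tf.range(tf.shape(sim)[0])  # diagonal entries are positives
    nce = tf.keras.losses.sparse_categorical_crossentropy(
        targets, sim, from_logits=True)
    return tf.reduce_mean(clf) + reg_weight * tf.reduce_mean(nce)
```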
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Learning Curves for Drug Response Prediction in Cancer Cell Lines [29.107984441845673]
We evaluate the data scaling properties of two neural networks (NNs) and two gradient boosting decision tree (GBDT) models trained on four drug screening datasets.
The learning curves are accurately fitted to a power law model, providing a framework for assessing the data scaling behavior of these predictors.
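A power-law learning-curve fit of this kind can be reproduced with a standard least-squares routine. A minimal sketch, assuming the common form error(n) = a * n^(-b) + c and purely illustrative numbers (the paper's exact parameterization and data are not shown here):

```python
# Minimal power-law learning-curve fit; data points are illustrative,
# not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # prediction error decays as a power of training-set size n, floored at c
    return a * np.power(n, -b) + c

n_train = np.array([500, 1000, 2000, 4000, 8000, 16000])  # illustrative
error = np.array([0.42, 0.36, 0.31, 0.27, 0.24, 0.22])    # illustrative

(a, b, c), _ = curve_fit(power_law, n_train, error, p0=(1.0, 0.3, 0.1))
print(f"error(n) ~ {a:.2f} * n^(-{b:.2f}) + {c:.2f}")
```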
arXiv Detail & Related papers (2020-11-25T01:08:05Z)
- Assessing Graph-based Deep Learning Models for Predicting Flash Point [52.931492216239995]
Graph-based deep learning (GBDL) models were applied to flash point prediction for the first time.
Average R2 and Mean Absolute Error (MAE) scores of the message-passing neural network (MPNN) are, respectively, 2.3% lower and 2.0 K higher than those of previous comparable studies.
arXiv Detail & Related papers (2020-02-26T06:10:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.