A Statistical Framework for Model Selection in LSTM Networks
- URL: http://arxiv.org/abs/2506.06840v1
- Date: Sat, 07 Jun 2025 15:44:27 GMT
- Title: A Statistical Framework for Model Selection in LSTM Networks
- Authors: Fahad Mostafa
- Abstract summary: We propose a unified statistical framework for systematic model selection in LSTM networks. Our framework extends classical model selection ideas, such as information criteria and shrinkage estimation, to sequential neural networks. Several biomedical data-centric examples demonstrate the flexibility and improved performance of the proposed framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Long Short-Term Memory (LSTM) neural network models have become the cornerstone for sequential data modeling in numerous applications, ranging from natural language processing to time series forecasting. Despite their success, the problem of model selection, including hyperparameter tuning, architecture specification, and regularization choice, remains largely heuristic and computationally expensive. In this paper, we propose a unified statistical framework for systematic model selection in LSTM networks. Our framework extends classical model selection ideas, such as information criteria and shrinkage estimation, to sequential neural networks. We define penalized likelihoods adapted to temporal structures, propose a generalized threshold approach for hidden state dynamics, and provide efficient estimation strategies using variational Bayes and approximate marginal likelihood methods. Several biomedical data-centric examples demonstrate the flexibility and improved performance of the proposed framework.
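The abstract refers to classical information criteria as the starting point for the framework. As a rough illustration of that starting point only, the sketch below scores a few candidate hidden sizes with the textbook AIC and BIC under a Gaussian noise assumption. It is not the paper's penalized likelihood, which is adapted to temporal structure and is not reproduced here. The residual sums of squares, the sample size, and the helper names are hypothetical, and counting raw LSTM weights as the number of parameters is a simplifying assumption.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized Gaussian log-likelihood for n residuals with sum of squares rss."""
    sigma2 = rss / n  # maximum-likelihood estimate of the noise variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k log n - 2 log L."""
    return k * math.log(n) - 2 * log_lik


def lstm_param_count(input_dim, hidden_dim):
    """Weights in one LSTM layer: four gates, each with input weights,
    recurrent weights, and a single bias vector (an assumed convention)."""
    return 4 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


# Hypothetical fits: hidden size -> residual sum of squares on n observations.
n = 5000
candidates = {8: 1300.0, 16: 1180.0, 32: 1150.0}

scores = {}
for hidden_dim, rss in candidates.items():
    k = lstm_param_count(input_dim=1, hidden_dim=hidden_dim)
    log_lik = gaussian_log_likelihood(rss, n)
    scores[hidden_dim] = {
        "k": k,
        "aic": aic(log_lik, k),
        "bic": bic(log_lik, k, n),
    }

# Lower is better for both criteria.
best_by_bic = min(scores, key=lambda h: scores[h]["bic"])
best_by_aic = min(scores, key=lambda h: scores[h]["aic"])
print(best_by_bic, best_by_aic)
```

With these made-up numbers, the larger networks reduce the residual error only slightly, so the parameter penalty dominates and both criteria prefer the smallest candidate.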
Related papers
- Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
- Deep Learning-based Approaches for State Space Models: A Selective Review [15.295157876811066]
State-space models (SSMs) offer a powerful framework for dynamical system analysis. This paper provides a selective review of recent advancements in deep neural network-based approaches for SSMs.
arXiv Detail & Related papers (2024-12-15T15:04:35Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks.
Our approach optimizes the weights and biases of the LSTM in a partitioned manner.
Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z)
- Data-driven Preference Learning Methods for Sorting Problems with Multiple Temporal Criteria [17.673512636899076]
This study presents novel preference learning approaches to multiple criteria sorting problems in the presence of temporal criteria.
To enhance scalability and accommodate learnable time discount factors, we introduce a novel monotonic Recurrent Neural Network (mRNN).
The proposed mRNN can describe the preference dynamics by depicting marginal value functions and personalized time discount factors over time.
arXiv Detail & Related papers (2023-09-22T05:08:52Z)
- Time Series Continuous Modeling for Imputation and Forecasting with Implicit Neural Representations [15.797295258800638]
We introduce a novel modeling approach for time series imputation and forecasting, tailored to address the challenges often encountered in real-world data.
Our method relies on a continuous-time-dependent model of the series' evolution dynamics.
A modulation mechanism, driven by a meta-learning algorithm, allows adaptation to unseen samples and extrapolation beyond observed time-windows.
arXiv Detail & Related papers (2023-06-09T13:20:04Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- A Statistical-Modelling Approach to Feedforward Neural Network Model Selection [0.8287206589886881]
Feedforward neural networks (FNNs) can be viewed as non-linear regression models.
A novel model selection method is proposed using the Bayesian information criterion (BIC) for FNNs.
The choice of BIC over out-of-sample performance leads to an increased probability of recovering the true model.
arXiv Detail & Related papers (2022-07-09T11:07:04Z)
- Randomized Neural Networks for Forecasting Time Series with Multiple Seasonality [0.0]
This work contributes to the development of neural forecasting models with novel randomization-based learning methods.
A pattern-based representation of time series makes the proposed approach useful for forecasting time series with multiple seasonality.
arXiv Detail & Related papers (2021-07-04T18:39:27Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.