Hidden Markov Neural Networks
- URL: http://arxiv.org/abs/2004.06963v3
- Date: Thu, 16 Jan 2025 08:32:50 GMT
- Title: Hidden Markov Neural Networks
- Authors: Lorenzo Rimella, Nick Whiteley,
- Abstract summary: We define an evolving in-time Bayesian neural network called a Hidden Markov Neural Network.
Experiments on MNIST, dynamic classification tasks, and next-frame forecasting in videos demonstrate that Hidden Markov Neural Networks provide strong predictive performance.
- Score: 5.8317379706611385
- License:
- Abstract: We define an evolving in-time Bayesian neural network called a Hidden Markov Neural Network, which addresses the crucial challenge in time-series forecasting and continual learning: striking a balance between adapting to new data and appropriately forgetting outdated information. This is achieved by modelling the weights of a neural network as the hidden states of a Hidden Markov model, with the observed process defined by the available data. A filtering algorithm is employed to learn a variational approximation of the evolving-in-time posterior distribution over the weights. By leveraging a sequential variant of Bayes by Backprop, enriched with a stronger regularization technique called variational DropConnect, Hidden Markov Neural Networks achieve robust regularization and scalable inference. Experiments on MNIST, dynamic classification tasks, and next-frame forecasting in videos demonstrate that Hidden Markov Neural Networks provide strong predictive performance while enabling effective uncertainty quantification.
Related papers
- GARCH-Informed Neural Networks for Volatility Prediction in Financial Markets [0.0]
We present a new, hybrid Deep Learning model that captures and forecasting market volatility more accurately than either class of models are capable of on their own.
When compared to other time series models, GINN showed superior out-of-sample prediction performance in terms of the Coefficient of Determination ($R2$), Mean Squared Error (MSE), and Mean Absolute Error (MAE)
arXiv Detail & Related papers (2024-09-30T23:53:54Z) - Positional Encoder Graph Quantile Neural Networks for Geographic Data [4.277516034244117]
We introduce the Positional Graph Quantile Neural Network (PE-GQNN), a novel method that integrates PE-GNNs, Quantile Neural Networks, and recalibration techniques in a fully nonparametric framework.
Experiments on benchmark datasets demonstrate that PE-GQNN significantly outperforms existing state-of-the-art methods in both predictive accuracy and uncertainty quantification.
arXiv Detail & Related papers (2024-09-27T16:02:12Z) - Detecting Markovianity of Quantum Processes via Recurrent Neural Networks [0.0]
We present a novel methodology utilizing Recurrent Neural Networks (RNNs) to classify Markovian and non-Markovian quantum processes.
The model exhibits exceptional accuracy, surpassing 95%, across diverse scenarios.
arXiv Detail & Related papers (2024-06-11T13:05:36Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - On the Convergence of Locally Adaptive and Scalable Diffusion-Based Sampling Methods for Deep Bayesian Neural Network Posteriors [2.3265565167163906]
Bayesian neural networks are a promising approach for modeling uncertainties in deep neural networks.
generating samples from the posterior distribution of neural networks is a major challenge.
One advance in that direction would be the incorporation of adaptive step sizes into Monte Carlo Markov chain sampling algorithms.
In this paper, we demonstrate that these methods can have a substantial bias in the distribution they sample, even in the limit of vanishing step sizes and at full batch size.
arXiv Detail & Related papers (2024-03-13T15:21:14Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN)
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Stochastic Recurrent Neural Network for Multistep Time Series
Forecasting [0.0]
We leverage advances in deep generative models and the concept of state space models to propose an adaptation of the recurrent neural network for time series forecasting.
Our model preserves the architectural workings of a recurrent neural network for which all relevant information is encapsulated in its hidden states, and this flexibility allows our model to be easily integrated into any deep architecture for sequential modelling.
arXiv Detail & Related papers (2021-04-26T01:43:43Z) - Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks [77.34726150561087]
We introduce Gradient Markov Descent (SMGD), a discrete optimization method applicable to training quantized neural networks.
We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
arXiv Detail & Related papers (2020-08-25T15:48:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.