MC-LSTM: Mass-Conserving LSTM
- URL: http://arxiv.org/abs/2101.05186v2
- Date: Mon, 8 Feb 2021 14:24:37 GMT
- Title: MC-LSTM: Mass-Conserving LSTM
- Authors: Pieter-Jan Hoedt, Frederik Kratzert, Daniel Klotz, Christina Halmich,
Markus Holzleitner, Grey Nearing, Sepp Hochreiter and Günter Klambauer
- Abstract summary: Mass-Conserving LSTM (MC-LSTM) adheres to conservation laws by extending the inductive bias of LSTM to model the redistribution of stored quantities.
MC-LSTM is applied to traffic forecasting, modelling a pendulum, and a large benchmark dataset in hydrology, where it sets a new state-of-the-art for predicting peak flows.
- Score: 4.223874618298011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of Convolutional Neural Networks (CNNs) in computer vision is
mainly driven by their strong inductive bias, which is strong enough to allow
CNNs to solve vision-related tasks with random weights, meaning without
learning. Similarly, Long Short-Term Memory (LSTM) has a strong inductive bias
towards storing information over time. However, many real-world systems are
governed by conservation laws, which lead to the redistribution of particular
quantities -- e.g. in physical and economic systems. Our novel
Mass-Conserving LSTM (MC-LSTM) adheres to these conservation laws by extending
the inductive bias of LSTM to model the redistribution of those stored
quantities. MC-LSTMs set a new state-of-the-art for neural arithmetic units at
learning arithmetic operations, such as addition tasks, which have a strong
conservation law, as the sum is constant over time. Further, MC-LSTM is applied
to traffic forecasting, modelling a pendulum, and a large benchmark dataset in
hydrology, where it sets a new state-of-the-art for predicting peak flows. In
the hydrology example, we show that MC-LSTM states correlate with real-world
processes and are therefore interpretable.
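To make the mechanism concrete, below is a minimal NumPy sketch of a mass-conserving cell update, based only on the description above: the input gate is a softmax distribution over cells, the redistribution matrix is column-stochastic, and a sigmoid output gate splits each cell's mass into an outgoing and a retained part. The function name `mc_lstm_step`, the single scalar mass input, and the simplified parametrization (gates conditioned only on the auxiliary inputs, whereas the paper's gates may also depend on the previous cell state) are illustrative assumptions, not the authors' reference implementation.
```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mc_lstm_step(c_prev, x_mass, a_aux, params):
    """One simplified mass-conserving step: mass enters via the input
    gate, moves between cells via the redistribution matrix, and
    leaves via the output gate."""
    Wi, Wo, Wr, bi, bo, br = params
    n = c_prev.size

    # Input gate: softmax makes it a distribution over cells,
    # so the incoming mass x_mass is fully accounted for.
    i = softmax(Wi @ a_aux + bi)

    # Redistribution matrix: softmax over axis 0 makes every column
    # a distribution, so stored mass moves between cells without loss.
    R = softmax((Wr @ a_aux).reshape(n, n) + br, axis=0)

    # Output gate: per-cell fraction of mass that leaves the system.
    o = sigmoid(Wo @ a_aux + bo)

    m = R @ c_prev + i * x_mass  # mass after redistribution and input
    h = o * m                    # outgoing mass (the model output)
    c = (1.0 - o) * m            # retained mass (the new cell state)
    return c, h

# Conservation check with random weights: all mass that entered
# equals all mass emitted plus what is still stored in the cells.
rng = np.random.default_rng(0)
n, d = 4, 3
params = (rng.normal(size=(n, d)), rng.normal(size=(n, d)),
          rng.normal(size=(n * n, d)),
          np.zeros(n), np.zeros(n), np.zeros((n, n)))
c, total_in, total_out = np.zeros(n), 0.0, 0.0
for _ in range(10):
    x, a = rng.uniform(), rng.normal(size=d)
    c, h = mc_lstm_step(c, x, a, params)
    total_in += x
    total_out += h.sum()
assert np.isclose(total_in, total_out + c.sum())
```
Because the input gate and each column of the redistribution matrix sum to one, total mass in equals total mass out plus stored mass at every step, which is exactly the conservation property the abstract describes; the final assertion checks this numerically.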
Related papers
- Unlocking the Power of LSTM for Long Term Time Series Forecasting [27.245021350821638]
We propose a simple yet efficient algorithm named P-sLSTM, built upon sLSTM by incorporating patching and channel independence.
These modifications substantially enhance sLSTM's performance in time series forecasting (TSF), achieving state-of-the-art results.
arXiv Detail & Related papers (2024-08-19T13:59:26Z)
- Implementation Guidelines and Innovations in Quantum LSTM Networks [2.938337278931738]
This paper presents a theoretical analysis and an implementation plan for a Quantum LSTM model, which seeks to integrate quantum computing principles with traditional LSTM networks.
The actual architecture and its practical effectiveness in enhancing sequential data processing remain to be developed and demonstrated in future work.
arXiv Detail & Related papers (2024-06-13T10:26:14Z)
- Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps [56.827895559823126]
The self-organizing map (SOM) is a neural model often used for clustering and dimensionality reduction.
We propose a generalization of the SOM, the continual SOM, which is capable of online unsupervised learning under a low memory budget.
Our results on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST show nearly a twofold increase in accuracy.
arXiv Detail & Related papers (2024-02-19T19:11:22Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on-the-fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- On the Representational Capacity of Recurrent Neural Language Models [56.19166912044362]
We show that a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
arXiv Detail & Related papers (2023-10-19T17:39:47Z)
- DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series [26.378073712630467]
We propose a Decomposition-based Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
We demonstrate the effectiveness and interpretability of DeLELSTM on three empirical datasets.
arXiv Detail & Related papers (2023-08-26T07:45:41Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Mitigating Out-of-Distribution Data Density Overestimation in Energy-Based Models [54.06799491319278]
Deep energy-based models (EBMs) are receiving increasing attention due to their ability to learn complex distributions.
To train deep EBMs, the maximum likelihood estimation (MLE) with short-run Langevin Monte Carlo (LMC) is often used.
We investigate why the MLE with short-run LMC can converge to EBMs with wrong density estimates.
arXiv Detail & Related papers (2022-05-30T02:49:17Z)
- Simulation of Open Quantum Dynamics with Bootstrap-Based Long Short-Term Memory Recurrent Neural Network [0.0]
The bootstrap method is applied in the LSTM-NN construction and prediction.
The bootstrap-based LSTM-NN approach is a practical and powerful tool for propagating the long-time quantum dynamics of open systems.
arXiv Detail & Related papers (2021-08-03T05:58:54Z)
- Deep Learning modeling of Limit Order Book: a comparative perspective [0.0]
The present work addresses theoretical and practical questions in the domain of Deep Learning for High Frequency Trading.
Models ranging from random baselines and logistic regression to LSTMs, LSTMs equipped with an attention mask, CNN-LSTMs, and attention-based architectures are reviewed and compared on the same tasks.
The underlying dimensions of the modeling techniques are investigated to understand whether these are intrinsic to the Limit Order Book's dynamics.
arXiv Detail & Related papers (2020-07-12T17:06:30Z)
- Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Trackers based on LSTM (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
Dense LSTMs outperform residual and regular LSTMs and offer higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.