MC-LSTM: Mass-Conserving LSTM
- URL: http://arxiv.org/abs/2101.05186v2
- Date: Mon, 8 Feb 2021 14:24:37 GMT
- Title: MC-LSTM: Mass-Conserving LSTM
- Authors: Pieter-Jan Hoedt, Frederik Kratzert, Daniel Klotz, Christina Halmich,
Markus Holzleitner, Grey Nearing, Sepp Hochreiter and Günter Klambauer
- Abstract summary: Mass-Conserving LSTM (MC-LSTM) adheres to conservation laws by extending the inductive bias of LSTM to model the redistribution of stored quantities.
MC-LSTM is applied to traffic forecasting, modelling a pendulum, and a large benchmark dataset in hydrology, where it sets a new state-of-the-art for predicting peak flows.
- Score: 4.223874618298011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of Convolutional Neural Networks (CNNs) in computer vision is
mainly driven by their strong inductive bias, which is strong enough to allow
CNNs to solve vision-related tasks with random weights, meaning without
learning. Similarly, Long Short-Term Memory (LSTM) has a strong inductive bias
towards storing information over time. However, many real-world systems are
governed by conservation laws, which lead to the redistribution of particular
quantities -- e.g. in physical and economic systems. Our novel
Mass-Conserving LSTM (MC-LSTM) adheres to these conservation laws by extending
the inductive bias of LSTM to model the redistribution of those stored
quantities. MC-LSTMs set a new state-of-the-art for neural arithmetic units at
learning arithmetic operations, such as addition tasks, which have a strong
conservation law, as the sum is constant over time. Further, MC-LSTM is applied
to traffic forecasting, modelling a pendulum, and a large benchmark dataset in
hydrology, where it sets a new state-of-the-art for predicting peak flows. In
the hydrology example, we show that MC-LSTM states correlate with real-world
processes and are therefore interpretable.
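To make the mechanism concrete, below is a minimal NumPy sketch of a mass-conserving cell update, based only on the description above: the input gate is a softmax distribution over cells, the redistribution matrix is column-stochastic, and a sigmoid output gate splits each cell's mass into an outgoing and a retained part. The function name `mc_lstm_step`, the single scalar mass input, and the simplified parametrization (gates conditioned only on the auxiliary inputs, whereas the paper's gates may also depend on the previous cell state) are illustrative assumptions, not the authors' reference implementation.
```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mc_lstm_step(c_prev, x_mass, a_aux, params):
    """One simplified mass-conserving step: mass enters via the input
    gate, moves between cells via the redistribution matrix, and
    leaves via the output gate."""
    Wi, Wo, Wr, bi, bo, br = params
    n = c_prev.size

    # Input gate: softmax makes it a distribution over cells,
    # so the incoming mass x_mass is fully accounted for.
    i = softmax(Wi @ a_aux + bi)

    # Redistribution matrix: softmax over axis 0 makes every column
    # a distribution, so stored mass moves between cells without loss.
    R = softmax((Wr @ a_aux).reshape(n, n) + br, axis=0)

    # Output gate: per-cell fraction of mass that leaves the system.
    o = sigmoid(Wo @ a_aux + bo)

    m = R @ c_prev + i * x_mass  # mass after redistribution and input
    h = o * m                    # outgoing mass (the model output)
    c = (1.0 - o) * m            # retained mass (the new cell state)
    return c, h

# Conservation check with random weights: all mass that entered
# equals all mass emitted plus what is still stored in the cells.
rng = np.random.default_rng(0)
n, d = 4, 3
params = (rng.normal(size=(n, d)), rng.normal(size=(n, d)),
          rng.normal(size=(n * n, d)),
          np.zeros(n), np.zeros(n), np.zeros((n, n)))
c, total_in, total_out = np.zeros(n), 0.0, 0.0
for _ in range(10):
    x, a = rng.uniform(), rng.normal(size=d)
    c, h = mc_lstm_step(c, x, a, params)
    total_in += x
    total_out += h.sum()
assert np.isclose(total_in, total_out + c.sum())
```
Because the input gate and each column of the redistribution matrix sum to one, total mass in equals total mass out plus stored mass at every step, which is exactly the conservation property the abstract describes; the final assertion checks this numerically.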
Related papers
- Unlocking the Power of LSTM for Long Term Time Series Forecasting [27.245021350821638]
We propose a simple yet efficient algorithm named P-sLSTM, built upon sLSTM by incorporating patching and channel independence.
These modifications substantially enhance sLSTM's performance in time series forecasting (TSF), achieving state-of-the-art results.
arXiv Detail & Related papers (2024-08-19T13:59:26Z)
- Implementation Guidelines and Innovations in Quantum LSTM Networks [2.938337278931738]
This paper presents a theoretical analysis and an implementation plan for a Quantum LSTM model, which seeks to integrate quantum computing principles with traditional LSTM networks.
The actual architecture and its practical effectiveness in enhancing sequential data processing remain to be developed and demonstrated in future work.
arXiv Detail & Related papers (2024-06-13T10:26:14Z)
- Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps [56.827895559823126]
The self-organizing map (SOM) is a neural model often used for clustering and dimensionality reduction.
We propose a generalization of the SOM, the continual SOM, which is capable of online unsupervised learning under a low memory budget.
Our results on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST show nearly a twofold increase in accuracy.
arXiv Detail & Related papers (2024-02-19T19:11:22Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on-the-fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- On the Representational Capacity of Recurrent Neural Language Models [56.19166912044362]
We show that a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
arXiv Detail & Related papers (2023-10-19T17:39:47Z)
- DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series [26.378073712630467]
We propose a Decomposition-based Linear Explainable LSTM (DeLELSTM) to improve the interpretability of LSTM.
We demonstrate the effectiveness and interpretability of DeLELSTM on three empirical datasets.
arXiv Detail & Related papers (2023-08-26T07:45:41Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Mitigating Out-of-Distribution Data Density Overestimation in Energy-Based Models [54.06799491319278]
Deep energy-based models (EBMs) are receiving increasing attention due to their ability to learn complex distributions.
To train deep EBMs, the maximum likelihood estimation (MLE) with short-run Langevin Monte Carlo (LMC) is often used.
We investigate why the MLE with short-run LMC can converge to EBMs with wrong density estimates.
arXiv Detail & Related papers (2022-05-30T02:49:17Z)
- Simulation of Open Quantum Dynamics with Bootstrap-Based Long Short-Term Memory Recurrent Neural Network [0.0]
The bootstrap method is applied in the LSTM-NN construction and prediction.
The bootstrap-based LSTM-NN approach is a practical and powerful tool for propagating the long-time quantum dynamics of open systems.
arXiv Detail & Related papers (2021-08-03T05:58:54Z)
- Deep Learning modeling of Limit Order Book: a comparative perspective [0.0]
The present work addresses theoretical and practical questions in the domain of Deep Learning for High Frequency Trading.
Models ranging from random baselines and logistic regression to LSTMs, LSTMs equipped with an attention mask, CNN-LSTMs, and attention-based architectures are reviewed and compared on the same tasks.
The underlying dimensions of the modeling techniques are investigated to understand whether these are intrinsic to the Limit Order Book's dynamics.
arXiv Detail & Related papers (2020-07-12T17:06:30Z)
- Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Trackers based on LSTM (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
Dense LSTMs outperform residual and regular LSTMs and offer higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.