Initializing LSTM internal states via manifold learning
- URL: http://arxiv.org/abs/2104.13101v1
- Date: Tue, 27 Apr 2021 10:54:53 GMT
- Title: Initializing LSTM internal states via manifold learning
- Authors: Felix P. Kemeth, Tom Bertalan, Nikolaos Evangelou, Tianqi Cui, Saurabh
Malani, Ioannis G. Kevrekidis
- Abstract summary: We argue that the converged, "mature" internal states constitute a function on this learned manifold.
We show that learning this data manifold enables the transformation of partially observed dynamics into fully observed ones.
- Score: 0.6524460254566904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an approach, based on learning an intrinsic data manifold, for the
initialization of the internal state values of LSTM recurrent neural networks,
ensuring consistency with the initial observed input data. Exploiting the
generalized synchronization concept, we argue that the converged, "mature"
internal states constitute a function on this learned manifold. The dimension
of this manifold then dictates the length of observed input time series data
required for consistent initialization. We illustrate our approach through a
partially observed chemical model system, where initializing the internal LSTM
states in this fashion yields visibly improved performance. Finally, we show
that learning this data manifold enables the transformation of partially
observed dynamics into fully observed ones, facilitating alternative
identification paths for nonlinear dynamical systems.
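A minimal sketch of the idea in code (illustrative, not the authors' implementation): pair short delay-embedded observation windows with the "mature" (h, c) states an LSTM reaches after a long warm-up, then learn the window-to-state map. Nearest-neighbour regression stands in for the manifold learning and interpolation used in the paper; the toy series, window length, and network size are assumptions.
```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsRegressor

torch.manual_seed(0)
lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)

def mature_states(u):
    """Run the LSTM over a long warm-up series; the final (h, c) are the
    converged, 'mature' internal states at its end point."""
    with torch.no_grad():
        _, (h, c) = lstm(torch.as_tensor(u, dtype=torch.float32)[None, :, None])
    return torch.cat([h[0, 0], c[0, 0]]).numpy()

# Toy partially observed signal; in the paper this is one observed variable
# of a chemical model system.
series = np.sin(0.1 * np.arange(5000)) + 0.05 * np.random.randn(5000)

# Training pairs: the window length is tied to the intrinsic manifold
# dimension d -- roughly 2d + 1 delayed samples suffice for an embedding.
window = 5
X = [series[t - window:t] for t in range(1000, 4000, 10)]
Y = [mature_states(series[:t]) for t in range(1000, 4000, 10)]

# kNN regression on delay-embedded windows stands in for interpolating the
# state function on the learned data manifold.
state_map = KNeighborsRegressor(n_neighbors=5).fit(np.array(X), np.array(Y))

# A consistent initial (h0, c0) is obtained from a short window alone,
# ready for lstm(x_new, (h0, c0)):
hc = state_map.predict(series[None, -window:])[0]
h0 = torch.tensor(hc[:32], dtype=torch.float32).view(1, 1, 32)
c0 = torch.tensor(hc[32:], dtype=torch.float32).view(1, 1, 32)
```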
Related papers
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data.
We train the model using maximum likelihood estimation with Markov chain Monte Carlo.
Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
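A rough sketch of what MLE training with MCMC can look like for an energy-based latent prior (an assumed form, not the paper's code): short-run Langevin dynamics supplies the prior samples needed for the likelihood gradient.
```python
import torch
import torch.nn as nn

# Assumed latent dimension and energy network; prior is exp(-E(z)) * N(0, I).
energy = nn.Sequential(nn.Linear(8, 64), nn.SiLU(), nn.Linear(64, 1))

def langevin_sample(z0, steps=20, step=0.1):
    """Short-run Langevin MCMC targeting the energy-based prior."""
    z = z0
    for _ in range(steps):
        z = z.detach().requires_grad_(True)
        e = energy(z).sum() + 0.5 * (z ** 2).sum()  # EBM term + Gaussian base
        g, = torch.autograd.grad(e, z)
        z = z - 0.5 * step ** 2 * g + step * torch.randn_like(z)
    return z.detach()

z_prior = langevin_sample(torch.randn(64, 8))  # 64 prior samples
```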
arXiv Detail & Related papers (2024-09-05T18:14:22Z) - Modeling Spatio-temporal Dynamical Systems with Neural Discrete Learning
and Levels-of-Experts [33.335735613579914]
We address the issue of modeling and estimating changes in the state of spatio-temporal dynamical systems based on a sequence of observations like video frames.
This paper proposes a universal expert module -- that is, an optical flow estimation component -- to capture the laws of general physical processes in a data-driven fashion.
We conduct extensive experiments and ablations to demonstrate that the proposed framework achieves large performance margins compared with existing SOTA baselines.
arXiv Detail & Related papers (2024-02-06T06:27:07Z) - Complex Recurrent Spectral Network [1.0499611180329806]
This paper presents a novel approach to advancing artificial intelligence (AI) through the development of the Complex Recurrent Spectral Network ($\mathbb{C}$-RSN).
The $\mathbb{C}$-RSN is designed to address a critical limitation in existing neural network models: their inability to emulate the complex processes of biological neural networks.
arXiv Detail & Related papers (2023-12-12T14:14:40Z) - Assessing Neural Network Representations During Training Using
Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
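One assumed reading of the construction in code (illustrative; the estimator in the paper may differ in detail): build a row-stochastic diffusion matrix from a Gaussian affinity on the data and take the Shannon entropy of its normalized eigenvalue spectrum.
```python
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_spectral_entropy(X, sigma=1.0, t=1):
    """X: (n, d) data; returns entropy of the diffusion matrix spectrum."""
    K = np.exp(-cdist(X, X, 'sqeuclidean') / (2 * sigma ** 2))  # Gaussian affinity
    P = K / K.sum(axis=1, keepdims=True)                        # row-stochastic diffusion
    lam = np.abs(np.linalg.eigvals(P)) ** t                     # t diffusion steps
    p = lam / lam.sum()
    p = p[p > 1e-12]
    return float(-(p * np.log(p)).sum())
```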
arXiv Detail & Related papers (2023-12-04T01:32:42Z) - Reconstruction, forecasting, and stability of chaotic dynamics from
partial data [4.266376725904727]
We propose data-driven methods to infer the dynamics of hidden chaotic variables from partial observations.
We show that the proposed networks can forecast the hidden variables, both time-accurately and statistically.
This work opens new opportunities for reconstructing the full state, inferring hidden variables, and computing the stability of chaotic systems from partial data.
arXiv Detail & Related papers (2023-05-24T13:01:51Z) - Anamnesic Neural Differential Equations with Orthogonal Polynomial
Projections [6.345523830122166]
We propose PolyODE, a formulation that enforces long-range memory and preserves a global representation of the underlying dynamical system.
Our construction is backed by favourable theoretical guarantees and we demonstrate that it outperforms previous works in the reconstruction of past and future data.
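As a hedged illustration of the long-range-memory ingredient the summary names (not the paper's construction), NumPy's Legendre utilities can project an observed history onto orthogonal polynomials, so a fixed-size coefficient vector summarizes the entire past; the interval rescaling and degree are assumptions.
```python
import numpy as np

def legendre_memory(ts, xs, degree=8):
    """Project a scalar trajectory x(t) on [ts[0], ts[-1]] onto Legendre
    polynomials; the coefficients act as a global memory of the past."""
    s = 2 * (ts - ts[0]) / (ts[-1] - ts[0]) - 1      # rescale time to [-1, 1]
    return np.polynomial.legendre.legfit(s, xs, degree)

def reconstruct(ts_query, coeffs, t0, t1):
    """Evaluate the polynomial summary anywhere in the window."""
    s = 2 * (ts_query - t0) / (t1 - t0) - 1
    return np.polynomial.legendre.legval(s, coeffs)

ts = np.linspace(0.0, 10.0, 200)
xs = np.sin(ts) + 0.1 * ts
mem = legendre_memory(ts, xs)                          # 9 coefficients
x_past = reconstruct(np.array([2.5]), mem, 0.0, 10.0)  # recover a past value
```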
arXiv Detail & Related papers (2023-03-03T10:49:09Z) - Integrating Recurrent Neural Networks with Data Assimilation for
Scalable Data-Driven State Estimation [0.0]
Data assimilation (DA) is integrated with machine learning to perform entirely data-driven online state estimation.
Recurrent neural networks (RNNs) are implemented as surrogate models to replace key components of the DA cycle in numerical weather prediction (NWP).
It is shown how these RNNs can use DA methods to directly update the hidden/reservoir state with observations of the target system.
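A hedged sketch of such a coupling (assumed shapes and a stochastic ensemble Kalman filter update, not the paper's code): treat an ensemble of hidden/reservoir states as the forecast ensemble and correct it when an observation arrives.
```python
import numpy as np

def enkf_update(states, H, y, R, rng=np.random.default_rng()):
    """states: (N, d) ensemble of hidden states; H: (m, d) observation
    operator; y: (m,) observation; R: (m, m) observation noise covariance."""
    N = states.shape[0]
    A = states - states.mean(axis=0)                # ensemble anomalies
    P = A.T @ A / (N - 1)                           # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)    # Kalman gain, (d, m)
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return states + (y_pert - states @ H.T) @ K.T   # corrected hidden states
```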
arXiv Detail & Related papers (2021-09-25T03:56:53Z) - Supervised DKRC with Images for Offline System Identification [77.34726150561087]
Modern dynamical systems are becoming increasingly non-linear and complex.
There is a need for a framework to model these systems in a compact and comprehensive representation for prediction and control.
Our approach learns the basis functions of this representation using supervised learning.
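For context, a generic sketch of the lifted-linear identification step that such learned basis functions feed into (extended DMD with a fixed dictionary here; the paper instead learns the lifting map with supervision, which is not reproduced):
```python
import numpy as np

def edmd_operator(X_t, X_t1, psi):
    """X_t, X_t1: (n, d) snapshot pairs; psi lifts states into basis space.
    Returns K with psi(X_t) @ K ~= psi(X_t1), a linear surrogate model."""
    K, *_ = np.linalg.lstsq(psi(X_t), psi(X_t1), rcond=None)
    return K

# Illustrative fixed dictionary; a learned network would replace this map.
psi = lambda X: np.hstack([X, X ** 2, np.sin(X)])
```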
arXiv Detail & Related papers (2021-09-06T04:39:06Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
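The update can be sketched as follows (our reading of the liquid time-constant formulation; the gate parameterization and explicit Euler discretization are simplifications):
```python
import numpy as np

def ltc_step(x, u, W, b, tau=1.0, A=1.0, dt=0.1):
    """One Euler step of dx/dt = -x/tau + f(x, u)(A - x): a linear
    first-order system whose effective time constant is modulated by
    a learned nonlinear gate f."""
    f = np.tanh(W @ np.concatenate([x, u]) + b)   # input/state-dependent gate
    return x + dt * (-x / tau + f * (A - x))
```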
arXiv Detail & Related papers (2020-06-08T09:53:35Z) - Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate the transition between these regimes empirically for more complex matrix factorization models and multilayer non-linear networks.
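The transition is visible in a toy diagonal reparameterization, beta = w_plus**2 - w_minus**2, trained by gradient descent (an illustrative reconstruction with assumed sizes): large initialization alpha behaves like a kernel/ridge method, while small alpha biases toward sparse solutions.
```python
import numpy as np

rng = np.random.default_rng(0)
n, d, alpha, lr = 20, 50, 0.01, 0.01         # small alpha -> "rich" regime
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = 1.0                           # sparse ground truth
y = X @ beta_true

wp = np.full(d, alpha)                        # beta = wp**2 - wm**2
wm = np.full(d, alpha)
for _ in range(50000):
    g = X.T @ (X @ (wp ** 2 - wm ** 2) - y) / n   # grad of squared loss in beta
    wp -= lr * 2 * wp * g                         # chain rule through wp**2
    wm += lr * 2 * wm * g
# With small alpha the recovered beta is near-sparse (L1-like implicit bias);
# with large alpha it approaches the minimum-L2-norm interpolant.
```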
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.