Learning neural state-space models: do we need a state estimator?
- URL: http://arxiv.org/abs/2206.12928v1
- Date: Sun, 26 Jun 2022 17:15:35 GMT
- Title: Learning neural state-space models: do we need a state estimator?
- Authors: Marco Forgione, Manas Mejari, Dario Piga
- Abstract summary: We provide insights for the calibration of neural state-space training algorithms, based on extensive experimentation and analyses.
Particular focus is given to the choice and role of the initial state estimation.
We demonstrate that advanced initial state estimation techniques are indeed required to achieve high performance on certain classes of dynamical systems, while basic zero or random initialization already suffices for asymptotically stable ones.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, several algorithms for system identification with neural
state-space models have been introduced. Most of the proposed approaches are
aimed at reducing the computational complexity of the learning problem, by
splitting the optimization over short sub-sequences extracted from a longer
training dataset. Different sequences are then processed simultaneously within
a minibatch, taking advantage of modern parallel hardware for deep learning. An
issue arising in these methods is the need to assign an initial state for each
of the sub-sequences, which is required to run simulations and thus to evaluate
the fitting loss. In this paper, we provide insights for calibration of neural
state-space training algorithms based on extensive experimentation and analyses
performed on two recognized system identification benchmarks. Particular focus
is given to the choice and the role of the initial state estimation. We
demonstrate that advanced initial state estimation techniques are really
required to achieve high performance on certain classes of dynamical systems,
while for asymptotically stable ones basic procedures such as zero or random
initialization already yield competitive performance.
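The abstract describes a concrete training recipe: split the long training record into short sub-sequences, simulate each one forward from an initial state, and fit the simulated outputs to the measured ones, with the initial state either fixed (zero or random) or produced by a learned estimator. Below is a minimal PyTorch sketch of that recipe; the model structure, layer sizes, names such as `StateSpaceModel` and `InitialStateEncoder`, and the random toy data are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class StateSpaceModel(nn.Module):
    """Neural state-space model: x[k+1] = x[k] + f(x[k], u[k]), y[k] = g(x[k])."""
    def __init__(self, n_x=2, n_u=1, n_y=1, hidden=32):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(n_x + n_u, hidden), nn.Tanh(), nn.Linear(hidden, n_x))
        self.g = nn.Linear(n_x, n_y)

    def simulate(self, x0, u):                      # x0: (batch, n_x), u: (batch, T, n_u)
        x, outputs = x0, []
        for k in range(u.shape[1]):
            outputs.append(self.g(x))
            x = x + self.f(torch.cat((x, u[:, k]), dim=-1))
        return torch.stack(outputs, dim=1)          # (batch, T, n_y)

class InitialStateEncoder(nn.Module):
    """Maps a short window of past input/output samples to the sub-sequence's initial state."""
    def __init__(self, n_u=1, n_y=1, n_x=2, window=16, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(window * (n_u + n_y), hidden), nn.Tanh(), nn.Linear(hidden, n_x))

    def forward(self, u_past, y_past):              # (batch, window, n_u) and (batch, window, n_y)
        return self.net(torch.cat((u_past, y_past), dim=-1).flatten(1))

model, encoder = StateSpaceModel(), InitialStateEncoder()
opt = torch.optim.Adam(list(model.parameters()) + list(encoder.parameters()), lr=1e-3)

# One minibatch of sub-sequences (random toy data, just to show the shapes).
u_past, y_past = torch.randn(64, 16, 1), torch.randn(64, 16, 1)   # window preceding each sub-sequence
u, y = torch.randn(64, 100, 1), torch.randn(64, 100, 1)           # sub-sequences of length 100

x0 = encoder(u_past, y_past)      # learned estimate; torch.zeros(64, 2) gives the zero-init baseline
loss = torch.mean((model.simulate(x0, u) - y) ** 2)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch, the choice between `encoder(...)` and `torch.zeros(...)` is the design decision the paper studies: for asymptotically stable systems the effect of a wrong initial state dies out along the sub-sequence, whereas for other system classes a learned estimate matters.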
Related papers
- Discrete Neural Algorithmic Reasoning [18.497863598167257]
We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm.
arXiv Detail & Related papers (2024-02-18T16:03:04Z)
- Splitter Orderings for Probabilistic Bisimulation [0.0]
We propose techniques to accelerate iterative processes to partition state space of a given probabilistic model to its bisimulation classes.
The proposed approaches are implemented and run on several conventional case studies and reduce the running time by one order of magnitude on average (a baseline partition-refinement sketch is given after this entry).
arXiv Detail & Related papers (2023-07-17T16:30:19Z)
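For readers unfamiliar with the underlying computation, here is a minimal pure-Python sketch of baseline signature-based partition refinement for probabilistic bisimulation, i.e. the iterative process that splitter orderings aim to accelerate; it is a generic illustration under assumed data structures (`labels` and `trans` dictionaries), not the paper's ordering technique.

```python
def bisimulation_partition(states, labels, trans):
    """Coarsest probabilistic bisimulation via iterated signature refinement.

    labels[s] -- observable label of state s
    trans[s]  -- dict mapping successor state t to transition probability P(s, t)
    Returns a dict mapping each state to a block identifier.
    """
    # Initial partition: states with the same label share a block.
    block_of = {s: labels[s] for s in states}
    while True:
        def signature(s):
            # Label of s plus the probability mass it sends into each current block.
            mass = {}
            for t, p in trans[s].items():
                mass[block_of[t]] = mass.get(block_of[t], 0.0) + p
            return (labels[s], tuple(sorted(mass.items())))

        refined = {s: signature(s) for s in states}
        # Signature refinement only splits blocks; if the block count is unchanged,
        # the partition is stable and we are done.
        if len(set(refined.values())) == len(set(block_of.values())):
            return block_of
        block_of = refined
```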
- Forward-Forward Algorithm for Hyperspectral Image Classification: A Preliminary Study [0.0]
Forward-forward algorithm (FFA) computes local goodness functions to optimize network parameters.
This study investigates the application of FFA to hyperspectral image classification (a minimal sketch of the goodness computation is given after this entry).
arXiv Detail & Related papers (2023-07-01T05:39:28Z)
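As a pointer to what a "local goodness function" is, below is a minimal PyTorch sketch of a single layer trained in the spirit of Hinton's Forward-Forward procedure; the class name `FFLayer`, the threshold value, and the squared-activation goodness are illustrative assumptions and are not taken from the hyperspectral study.

```python
import torch
import torch.nn as nn

class FFLayer(nn.Module):
    """One layer trained locally with a Forward-Forward-style goodness criterion."""
    def __init__(self, n_in, n_out, threshold=2.0, lr=1e-3):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize so only the direction of the input carries information forward.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activations; push it above the threshold for
        # positive data and below it for negative data, with no cross-layer backprop.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = torch.nn.functional.softplus(
            torch.cat([self.threshold - g_pos, g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # Detach so the next layer trains on this layer's output without gradient flow.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Layers trained this way are stacked by feeding each layer's detached output to the next one, so every layer optimizes only its own local goodness.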
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis to the real batch setting, the resulting Neural Initialization Optimization (NIO) algorithm is able to automatically search for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- Robust Learning of Parsimonious Deep Neural Networks [0.0]
We propose a simultaneous learning and pruning algorithm capable of identifying and eliminating irrelevant structures in a neural network.
We derive a novel hyper-prior distribution over the prior parameters that is crucial for their optimal selection.
We evaluate the proposed algorithm on the MNIST data set and commonly used fully connected and convolutional LeNet architectures.
arXiv Detail & Related papers (2022-05-10T03:38:55Z)
- Assessment of machine learning methods for state-to-state approaches [0.0]
We investigate the possibilities offered by the use of machine learning methods for state-to-state approaches.
Deep neural networks appear to be a viable technology for these tasks as well.
arXiv Detail & Related papers (2021-04-02T13:27:23Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
- Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDE) is an indispensable part of many branches of science as many processes can be modelled in terms of PDEs.
Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing.
We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, namely physics-informed neural networks (PINNs).
We discuss the accuracy of GatedPINN with respect to analytical solutions, as well as state-of-the-art numerical solvers such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm based on partitioning the verification problem in an iterative manner and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems (a generic input-splitting sketch is given after this entry).
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
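To illustrate the kind of partitioning such a method relies on, here is a minimal NumPy sketch of a generic scheme: bound a ReLU network's output over an input box with interval arithmetic and, when the bound is inconclusive, split the box along its widest dimension into independent (and therefore parallelizable) sub-problems. This is a textbook strategy written under our own assumptions, not the paper's specific algorithm or its activation-phase pre-processing.

```python
import numpy as np

def interval_forward(lb, ub, layers):
    """Propagate an input box [lb, ub] through a ReLU network with interval arithmetic.
    layers = [(W1, b1), ..., (Wn, bn)]; ReLU is applied after every layer but the last."""
    for i, (W, b) in enumerate(layers):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lb, ub = W_pos @ lb + W_neg @ ub + b, W_pos @ ub + W_neg @ lb + b
        if i < len(layers) - 1:
            lb, ub = np.maximum(lb, 0.0), np.maximum(ub, 0.0)
    return lb, ub

def verify_by_splitting(lb, ub, layers, max_depth=12):
    """Check that the (single) network output stays positive on the whole input box.
    If the interval bound is inconclusive, split the box along its widest input
    dimension and recurse; the sub-problems are independent and can run in parallel."""
    out_lb, out_ub = interval_forward(lb, ub, layers)
    if out_lb[0] > 0:
        return True                     # property certified on this box
    if out_ub[0] <= 0 or max_depth == 0:
        return False                    # property violated, or splitting budget exhausted
    d = int(np.argmax(ub - lb))         # widest input dimension
    mid = 0.5 * (lb[d] + ub[d])
    left_ub, right_lb = ub.copy(), lb.copy()
    left_ub[d], right_lb[d] = mid, mid
    return (verify_by_splitting(lb, left_ub, layers, max_depth - 1) and
            verify_by_splitting(right_lb, ub, layers, max_depth - 1))
```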
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.