Improved Initialization of State-Space Artificial Neural Networks
- URL: http://arxiv.org/abs/2103.14516v1
- Date: Fri, 26 Mar 2021 15:16:08 GMT
- Title: Improved Initialization of State-Space Artificial Neural Networks
- Authors: Maarten Schoukens
- Abstract summary: The identification of black-box nonlinear state-space models requires a flexible representation of the state and output equation.
This paper introduces an improved initialization approach for nonlinear state-space models represented as a recurrent artificial neural network.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The identification of black-box nonlinear state-space models requires a
flexible representation of the state and output equation. Artificial neural
networks have proven to provide such a representation. However, as in many
identification problems, a nonlinear optimization problem needs to be solved to
obtain the model parameters (layer weights and biases). A well-chosen
initialization of these model parameters can often prevent the nonlinear
optimization algorithm from converging to a poorly performing local minimum of the
considered cost function. This paper introduces an improved initialization
approach for nonlinear state-space models represented as a recurrent artificial
neural network and emphasizes the importance of including an explicit linear
term in the model structure. Some of the neural network weights are initialized
starting from a linear approximation of the nonlinear system, while others are
initialized using random values or zeros. The effectiveness of the proposed
initialization approach over previously proposed methods is illustrated on two
benchmark examples.
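The initialization idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's exact scheme: the model structure (a linear state-space term plus one tanh hidden layer per equation), the function names `init_ss_ann` and `simulate`, the parameter names, and the choice of which weights are random and which are zero are all hypothetical. The linear matrices `A`, `B`, `C`, `D` are assumed to come from a previously estimated linear approximation of the system.

```python
import numpy as np

def init_ss_ann(A, B, C, D, n_hidden, rng=None, scale=0.1):
    """Initialize a state-space neural network with an explicit linear term.

    Assumed (hypothetical) model structure, with z = [x; u]:
        x[k+1] = A x[k] + B u[k] + Wx @ tanh(Vx @ z + bx)
        y[k]   = C x[k] + D u[k] + Wy @ tanh(Vy @ z + by)
    """
    rng = np.random.default_rng() if rng is None else rng
    A, B, C, D = (np.atleast_2d(np.asarray(M, dtype=float)) for M in (A, B, C, D))
    nx, nu, ny = A.shape[0], B.shape[1], C.shape[0]
    return {
        # Linear part: copied from the linear approximation.
        "A": A.copy(), "B": B.copy(), "C": C.copy(), "D": D.copy(),
        # Hidden-layer weights: small random values (assumption).
        "Vx": scale * rng.standard_normal((n_hidden, nx + nu)),
        "Vy": scale * rng.standard_normal((n_hidden, nx + nu)),
        # Hidden-layer biases and nonlinear output weights: zeros (assumption).
        "bx": np.zeros(n_hidden), "by": np.zeros(n_hidden),
        "Wx": np.zeros((nx, n_hidden)), "Wy": np.zeros((ny, n_hidden)),
    }

def simulate(params, u, x0=None):
    """Simulate the model for an input sequence u of shape (N, nu)."""
    p = params
    x = np.zeros(p["A"].shape[0]) if x0 is None else np.asarray(x0, dtype=float)
    outputs = []
    for uk in u:
        z = np.concatenate([x, uk])
        outputs.append(p["C"] @ x + p["D"] @ uk + p["Wy"] @ np.tanh(p["Vy"] @ z + p["by"]))
        x = p["A"] @ x + p["B"] @ uk + p["Wx"] @ np.tanh(p["Vx"] @ z + p["bx"])
    return np.array(outputs)
```

With the nonlinear output weights set to zero, the initialized network reproduces the linear model's simulated output, so the optimization starts from the linear fit, while the random hidden-layer weights leave room for the nonlinear terms to be learned.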
Related papers
- The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used for programming purposes.
In this paper we examine the use of convex neural recovery models.
We show that all the stationary points of the nonconvex objective can be characterized as the global optimum of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- Initialization Approach for Nonlinear State-Space Identification via the
Subspace Encoder Approach [0.0]
SUBNET has been developed to identify nonlinear state-space models from input-output data.
A state encoder function is introduced to reconstruct the current state from past input-output data.
This paper focuses on an initialisation of the subspace encoder approach using the Best Linear Approximation (BLA).
arXiv Detail & Related papers (2023-04-04T20:57:34Z)
- Neural Abstractions [72.42530499990028]
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models.
arXiv Detail & Related papers (2023-01-27T12:38:09Z)
- A Priori Denoising Strategies for Sparse Identification of Nonlinear
Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z)
- Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via the Polyak-Lojasiewicz condition, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z)
- On the Explicit Role of Initialization on the Convergence and Implicit
Bias of Overparametrized Linear Networks [1.0323063834827415]
We present a novel analysis of single-hidden-layer linear networks trained under gradient flow.
We show that the squared loss converges exponentially to its optimum.
We derive a novel non-asymptotic upper-bound on the distance between the trained network and the min-norm solution.
arXiv Detail & Related papers (2021-05-13T15:13:51Z)
- Convergence rates for gradient descent in the training of
overparameterized artificial neural networks with biases [3.198144010381572]
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches reach their limits.
It is still unclear why randomly initialized gradient descent algorithms are able to achieve zero training loss in many situations.
arXiv Detail & Related papers (2021-02-23T18:17:47Z)
- Parameter Estimation with Dense and Convolutional Neural Networks
Applied to the FitzHugh-Nagumo ODE [0.0]
We present deep neural networks using dense and convolutional layers to solve an inverse problem, where we seek to estimate parameters of a FitzHugh-Nagumo model.
We demonstrate that deep neural networks have the potential to estimate parameters in dynamical models and processes, and they are capable of predicting parameters accurately for the framework.
arXiv Detail & Related papers (2020-12-12T01:20:42Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Loss landscapes and optimization in over-parameterized non-linear
systems and neural networks [20.44438519046223]
We show that wide neural networks satisfy the PL$*$ condition, which explains the (S)GD convergence to a global minimum.
arXiv Detail & Related papers (2020-02-29T17:18:28Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with the nonconvexity of the optimization problem renders learning susceptible to initialization.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.