Explainable nonlinear modelling of multiple time series with invertible
neural networks
- URL: http://arxiv.org/abs/2107.00391v1
- Date: Thu, 1 Jul 2021 12:07:09 GMT
- Title: Explainable nonlinear modelling of multiple time series with invertible
neural networks
- Authors: Luis Miguel Lopez-Ramos, Kevin Roy, Baltasar Beferull-Lozano
- Abstract summary: A method for nonlinear topology identification is proposed, based on the assumption that a collection of time series is generated in two steps: a vector autoregressive process in a latent space, followed by a nonlinear, component-wise, monotonically increasing observation mapping.
The mappings are assumed invertible and are modelled as shallow neural networks, so that their inverses can be evaluated numerically.
The paper explains how to calculate the gradients by applying implicit differentiation.
- Score: 7.605814048051735
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A method for nonlinear topology identification is proposed, based on the
assumption that a collection of time series are generated in two steps: i) a
vector autoregressive process in a latent space, and ii) a nonlinear,
component-wise, monotonically increasing observation mapping. The latter
mappings are assumed invertible, and are modelled as shallow neural networks,
so that their inverse can be numerically evaluated, and their parameters can be
learned using a technique inspired by deep learning. Due to the function
inversion, the back-propagation step is not straightforward, and this paper
explains the steps needed to calculate the gradients applying implicit
differentiation. Whereas the model explainability is the same as that for
linear VAR processes, preliminary numerical tests show that the prediction
error becomes smaller.
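The two-step generative model and the implicit-differentiation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the monotone map (a positive linear term plus positively weighted sigmoids), the bisection-based inverse, the VAR(1) order, and all parameter values and function names are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def f(z, a, alpha, w, c, b):
    """Hypothetical monotone map: linear term plus positively weighted sigmoids.
    With a > 0, alpha > 0 and w > 0 it is strictly increasing in z."""
    return b + a * z + np.sum(alpha * sigmoid(w * z - c))

def df_dz(z, a, alpha, w, c):
    """Derivative of f with respect to its input z (always >= a > 0)."""
    s = sigmoid(w * z - c)
    return a + np.sum(alpha * w * s * (1.0 - s))

def f_inverse(y, a, alpha, w, c, b, iters=100):
    """Numerical inverse by bisection.
    The sigmoid sum lies in [0, sum(alpha)], which gives a valid bracket."""
    lo = (y - b - np.sum(alpha)) / a
    hi = (y - b) / a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid, a, alpha, w, c, b) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dinverse_dalpha(y, a, alpha, w, c, b):
    """Gradient of g(y) = f^{-1}(y) with respect to alpha, by implicit
    differentiation of f(g(y); alpha) = y."""
    z = f_inverse(y, a, alpha, w, c, b)
    return -sigmoid(w * z - c) / df_dz(z, a, alpha, w, c)

# Assumed parameter values, shared by both series for brevity.
a, b = 0.5, -1.0
alpha = np.array([1.0, 2.0])
w = np.array([1.5, 0.7])
c = np.array([0.0, 1.0])

# Step i) latent VAR(1) process with an assumed stable coefficient matrix.
rng = np.random.default_rng(0)
A = np.array([[0.5, 0.2], [0.0, 0.4]])
T = 200
z = np.zeros((T, 2))
for t in range(1, T):
    z[t] = A @ z[t - 1] + 0.1 * rng.standard_normal(2)

# Step ii) component-wise monotone observation mapping.
y = np.array([[f(z_ti, a, alpha, w, c, b) for z_ti in z_t] for z_t in z])

# Recover the latent series by inverting the mapping numerically.
z_rec = np.array([[f_inverse(y_ti, a, alpha, w, c, b) for y_ti in y_t] for y_t in y])
max_err = np.max(np.abs(z_rec - z))

# Compare the implicit-differentiation gradient with central finite differences.
y0, eps = 0.8, 1e-6
grad = dinverse_dalpha(y0, a, alpha, w, c, b)
fd = np.array([
    (f_inverse(y0, a, alpha + eps * e, w, c, b)
     - f_inverse(y0, a, alpha - eps * e, w, c, b)) / (2 * eps)
    for e in np.eye(2)
])
```

Differentiating the identity f(g(y; θ); θ) = y with respect to a parameter θ gives ∂g/∂θ = -(∂f/∂θ)/(∂f/∂z), evaluated at z = g(y). The function `dinverse_dalpha` computes this for the sigmoid weights, and the finite-difference values in `fd` serve as a numerical check.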
Related papers
- Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z)
- Efficient Interpretable Nonlinear Modeling for Multiple Time Series [5.448070998907116]
This paper proposes an efficient nonlinear modeling approach for multiple time series.
It incorporates nonlinear interactions among different time-series variables.
Experimental results show that the proposed algorithm improves the identification of the support of the VAR coefficients in a parsimonious manner.
arXiv Detail & Related papers (2023-09-29T11:42:59Z)
- Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so.
We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed.
Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
arXiv Detail & Related papers (2023-09-11T22:42:50Z)
- Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers [90.45898746733397]
We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling.
We show that one step along the probability flow ODE can be expressed as two steps: 1) a restoration step that runs ascent on the conditional log-likelihood at some infinitesimally previous time, and 2) a degradation step that runs the forward process using noise pointing back towards the current gradient.
arXiv Detail & Related papers (2023-03-06T18:59:19Z)
- On the Detection and Quantification of Nonlinearity via Statistics of the Gradients of a Black-Box Model [0.0]
Detection and identification of nonlinearity is a task of high importance for structural dynamics.
A method to detect nonlinearity is proposed, based on the distribution of the gradients of a data-driven model.
arXiv Detail & Related papers (2023-02-15T23:15:22Z)
- Non-linear manifold ROM with Convolutional Autoencoders and Reduced Over-Collocation method [0.0]
Non-affine parametric dependencies, nonlinearities and advection-dominated regimes of the model of interest can result in a slow Kolmogorov n-width decay.
We implement the non-linear manifold method introduced by Carlberg et al [37] with hyper-reduction achieved through reduced over-collocation and teacher-student training of a reduced decoder.
We test the methodology on a 2d non-linear conservation law and a 2d shallow water model, and compare the results with those of a purely data-driven method in which the dynamics are evolved in time with a long short-term memory network.
arXiv Detail & Related papers (2022-03-01T11:16:50Z)
- Memory-Efficient Backpropagation through Large Linear Layers [107.20037639738433]
In modern neural networks like Transformers, linear layers require significant memory to store activations for the backward pass.
This study proposes a memory reduction approach to perform backpropagation through linear layers.
arXiv Detail & Related papers (2022-01-31T13:02:41Z)
- Learning Nonlinear Waves in Plasmon-induced Transparency [0.0]
We consider a recurrent neural network (RNN) approach to predict the complex propagation of nonlinear solitons in plasmon-induced transparency metamaterial systems.
We show close agreement between simulation results and predictions by long short-term memory (LSTM) artificial neural networks.
arXiv Detail & Related papers (2021-07-31T21:21:44Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.