Learning Recurrent Models with Temporally Local Rules
- URL: http://arxiv.org/abs/2310.13284v1
- Date: Fri, 20 Oct 2023 05:30:30 GMT
- Title: Learning Recurrent Models with Temporally Local Rules
- Authors: Azwar Abdulsalam and Joseph G. Makin
- Abstract summary: We show how a generative model can learn the joint distribution over current and previous states, rather than merely the transition probabilities.
We show on toy datasets that different architectures employing this principle can learn aspects of the data typically requiring the backward pass.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fitting generative models to sequential data typically involves two recursive
computations through time, one forward and one backward. The latter could be a
computation of the loss gradient (as in backpropagation through time), or an
inference algorithm (as in the RTS/Kalman smoother). The backward pass in
particular is computationally expensive (since it is inherently serial and
cannot exploit GPUs), and difficult to map onto biological processes.
Work-arounds have been proposed; here we explore a very different one:
requiring the generative model to learn the joint distribution over current and
previous states, rather than merely the transition probabilities. We show on
toy datasets that different architectures employing this principle can learn
aspects of the data typically requiring the backward pass.
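As a rough illustration of the principle, the sketch below fits a joint Gaussian over (previous, current) state pairs of a toy linear-Gaussian chain using only temporally local statistics; the backward conditional that a smoother would otherwise compute then falls out of the joint in closed form. The toy model and variable names are illustrative assumptions, not the paper's architectures.

```python
# Minimal sketch (assumption, not the paper's architecture): fit the JOINT
# distribution over (previous state, current state) pairs, using only
# temporally local data, then read off the "backward" conditional that a
# smoother would otherwise compute with a backward recursion.
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-Gaussian sequence: x_t = a * x_{t-1} + noise.
a, T = 0.9, 10_000
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=0.5)

# Temporally local training data: adjacent pairs z_t = (x_{t-1}, x_t).
z = np.stack([x[:-1], x[1:]], axis=1)

# Maximum-likelihood joint Gaussian over pairs -- no recursion through time.
mu = z.mean(axis=0)
cov = np.cov(z, rowvar=False)

# From the joint we get BOTH conditionals. The backward one,
# p(x_{t-1} | x_t), is what a backward pass would normally supply:
#   E[x_{t-1} | x_t] = mu[0] + cov01/cov11 * (x_t - mu[1])
back_gain = cov[0, 1] / cov[1, 1]
fwd_gain = cov[0, 1] / cov[0, 0]
print("backward gain:", back_gain)  # ~a here, since the chain is stationary
print("forward gain :", fwd_gain)   # ~a, the transition coefficient
```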
Related papers
- Streaming Factor Trajectory Learning for Temporal Tensor Decomposition [33.18423605559094]
We propose Streaming Factor Trajectory Learning (SFTL) for temporal tensor decomposition.
We use Gaussian processes (GPs) to model the trajectory of factors so as to flexibly estimate their temporal evolution.
We have shown the advantage of SFTL in both synthetic tasks and real-world applications.
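A minimal sketch of the underlying idea, under assumed data and an assumed RBF kernel (not the SFTL algorithm itself): model one factor's trajectory as a GP over time and read off a smooth posterior-mean estimate of its evolution.

```python
# Illustrative sketch: a Gaussian process over time for one latent factor,
# so its temporal evolution is estimated flexibly from noisy observations.
import numpy as np

def rbf(t1, t2, length=1.0):
    """Squared-exponential kernel between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(1)
t_obs = np.sort(rng.uniform(0, 10, 30))         # observation times
f_obs = np.sin(t_obs) + rng.normal(0, 0.1, 30)  # noisy factor values

t_new = np.linspace(0, 10, 200)                 # query times
K = rbf(t_obs, t_obs) + 0.1**2 * np.eye(30)     # noisy Gram matrix
K_star = rbf(t_new, t_obs)

# GP posterior mean: a smooth estimate of the factor trajectory.
f_mean = K_star @ np.linalg.solve(K, f_obs)
print(f_mean[:5])
```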
arXiv Detail & Related papers (2023-10-25T21:58:52Z)
- LARA: A Light and Anti-overfitting Retraining Approach for Unsupervised Time Series Anomaly Detection [49.52429991848581]
We propose a Light and Anti-overfitting Retraining Approach (LARA) for time-series anomaly-detection methods based on deep variational auto-encoders (VAEs).
This work makes three novel contributions: 1) formulating the retraining process as a convex problem, so it converges at a fast rate and prevents overfitting; 2) designing a ruminate block, which leverages historical data without needing to store it; and 3) proving mathematically that, when fine-tuning the latent vector and reconstructed data, linear formations achieve the least adjusting errors between the ground truths and the fine-tuned ones.
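A loose sketch of the flavor of contribution 3, under hypothetical data (LARA's actual formulation differs in detail): a linear adjustment of latent vectors fit as a convex, ridge-regularized least-squares problem with a closed-form optimum.

```python
# Illustrative sketch only: a linear adjustment of latent vectors, fit as a
# convex least-squares problem, so retraining has a unique closed-form
# solution and cannot diverge or overfit in the way iterative retraining can.
import numpy as np

rng = np.random.default_rng(2)
Z = rng.normal(size=(500, 16))   # hypothetical latents from the old VAE
Y = rng.normal(size=(500, 16))   # hypothetical targets after distribution shift

# Ridge-regularized least squares: convex, unique optimum, one linear solve.
lam = 1e-2
W = np.linalg.solve(Z.T @ Z + lam * np.eye(16), Z.T @ Y)
print("adjusting error:", np.mean((Z @ W - Y) ** 2))
```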
arXiv Detail & Related papers (2023-10-09T12:36:16Z)
- What learning algorithm is in-context learning? Investigations with linear models [87.91612418166464]
We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly.
We show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression.
We present preliminary evidence that in-context learners share algorithmic features with these predictors.
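A minimal sketch of the reference predictors involved (the in-context learner itself is omitted; the data and hyperparameters are assumptions): exact least squares, ridge regression, and plain gradient descent fit to the same prompt, whose query-point predictions a trained in-context learner is compared against.

```python
# Illustrative sketch: the classical predictors the paper compares
# transformer in-context learners against, on one linear-regression prompt.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 5))            # in-context examples
w_true = rng.normal(size=5)
y = X @ w_true + rng.normal(0, 0.01, 20)

w_ls = np.linalg.lstsq(X, y, rcond=None)[0]                    # exact LS
w_ridge = np.linalg.solve(X.T @ X + 0.1 * np.eye(5), X.T @ y)  # ridge

w_gd = np.zeros(5)                       # plain gradient descent
for _ in range(200):
    w_gd -= 0.01 * X.T @ (X @ w_gd - y) / len(y)

x_query = rng.normal(size=5)             # compare predictions at a query point
print([w @ x_query for w in (w_ls, w_ridge, w_gd)])
```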
arXiv Detail & Related papers (2022-11-28T18:59:51Z)
- Time-Varying Propensity Score to Bridge the Gap between the Past and Present [104.46387765330142]
We introduce a time-varying propensity score that can detect gradual shifts in the distribution of data.
We demonstrate different ways of implementing it and evaluate it on a variety of problems.
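An illustrative sketch of the propensity-score framing, not the paper's estimator (the drifting data and classifier choice are assumptions): a classifier's probability that a sample comes from the "present" rather than the "past" serves as a propensity score, and tracking it over time exposes gradual distribution shift.

```python
# Illustrative sketch: a propensity score as the probability that a sample
# is "present" rather than "past"; reweighting by its odds bridges the gap.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
# Data whose mean drifts slowly with time t in [0, 1].
t = rng.uniform(0, 1, 2000)
x = rng.normal(loc=2.0 * t, scale=1.0, size=2000).reshape(-1, 1)

recent = (t > 0.5).astype(int)          # label: past (0) vs present (1)
clf = LogisticRegression().fit(x, recent)

# Propensity of being "present"; its odds reweight past samples.
p = clf.predict_proba(x)[:, 1]
weights = p / (1 - p + 1e-8)
print("mean propensity (past, present):",
      p[recent == 0].mean(), p[recent == 1].mean())
```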
arXiv Detail & Related papers (2022-10-04T07:21:49Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
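A simplified sketch of the orthogonal-update idea, with the paper's IPCA-based memory replaced by a plain capped orthonormal buffer (an assumption made for brevity): each new datapoint is fit exactly by an update orthogonal to directions protecting earlier points' predictions.

```python
# Simplified sketch: fit each new point exactly with an update that is
# orthogonal to a capped set of protected directions, so predictions on the
# remembered earlier points do not change.
import numpy as np

rng = np.random.default_rng(5)
d, k = 10, 5
w_true = rng.normal(size=d)
w = np.zeros(d)
basis = []                              # orthonormal protected directions

for _ in range(100):                    # one pass over a data stream
    x = rng.normal(size=d)
    y = x @ w_true

    u = x.copy()                        # update direction: x minus protected part
    for b in basis:
        u -= (u @ b) * b
    denom = u @ x
    if abs(denom) > 1e-8:               # exact fit of the new point; the update
        w += ((y - w @ x) / denom) * u  # leaves protected predictions unchanged
    if len(basis) < k:                  # cap memory at k protected directions
        n = np.linalg.norm(u)
        if n > 1e-8:
            basis.append(u / n)

print("fits latest point:", np.isclose(w @ x, y))
```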
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
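A minimal sketch of the randomization idea applied to one classical DP, the HMM forward pass (the paper's RDP estimators are more refined than this): each step sums over a random subset of latent states, rescaled so the step is unbiased, instead of all N states.

```python
# Illustrative sketch: a randomized HMM forward recursion that touches only
# m of the N latent states per step, with importance rescaling.
import numpy as np

rng = np.random.default_rng(6)
N, T, m = 1000, 20, 50                    # states, steps, sampled states/step
A = rng.dirichlet(np.ones(N), size=N)     # transition matrix, rows sum to 1
lik = rng.uniform(0.1, 1.0, size=(T, N))  # per-step observation likelihoods

alpha = np.full(N, 1.0 / N) * lik[0]
for t in range(1, T):
    idx = rng.choice(N, size=m, replace=False)  # subsample predecessor states
    # Rescaled estimate of alpha @ A using only m of the N rows of A;
    # each step is unbiased in expectation.
    alpha = (N / m) * (alpha[idx] @ A[idx]) * lik[t]
print("log-evidence estimate:", np.log(alpha.sum()))
```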
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders [16.27499951949733]
We show that if the generative map is "strongly invertible" (in a sense we suitably formalize), the inferential model need not be much more complex.
Importantly, we do not require the generative model to be layerwise invertible.
We provide theoretical support for the empirical wisdom that learning deep generative models is harder when data lies on a low-dimensional manifold.
arXiv Detail & Related papers (2021-07-09T19:53:29Z)
- Time Series Data Imputation: A Survey on Deep Learning Approaches [4.4458738910060775]
Time series data imputation is a well-studied problem with different categories of methods.
Deep-learning-based time-series methods have made progress using models such as RNNs.
We review and discuss their model architectures, their pros and cons, and their effects, to show the development of time-series imputation methods.
arXiv Detail & Related papers (2020-11-23T11:57:27Z)
- Splitting Gaussian Process Regression for Streaming Data [1.2691047660244335]
We propose an algorithm for sequentially partitioning the input space and fitting a localized Gaussian process to each disjoint region.
The algorithm is shown to have superior time and space complexity to existing methods, and its sequential nature permits application to streaming data.
To the best of our knowledge, the model is the first local Gaussian process regression model to achieve linear memory complexity.
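An illustrative sketch with fixed, hand-chosen splits (the paper's partitioning is sequential and data-driven): disjoint regions each get an independent local GP, so per-region memory stays bounded as data streams in.

```python
# Illustrative sketch: partition the input space into disjoint regions and
# fit an independent local GP in each; queries are routed to their region.
import numpy as np

def rbf(a, b, ell=0.5):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

rng = np.random.default_rng(7)
x = rng.uniform(0, 4, 400)
y = np.sin(3 * x) + rng.normal(0, 0.1, 400)

edges = [0, 1, 2, 3, 4]                 # fixed splits (the paper's are learned)
models = []
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (x >= lo) & (x < hi)
    xs, ys = x[mask], y[mask]
    K = rbf(xs, xs) + 0.1**2 * np.eye(len(xs))
    models.append((lo, hi, xs, np.linalg.solve(K, ys)))

def predict(q):
    for lo, hi, xs, alpha in models:    # route the query to its region's GP
        if lo <= q < hi:
            return rbf(np.array([q]), xs) @ alpha

print(predict(1.5))
```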
arXiv Detail & Related papers (2020-10-06T01:37:13Z)
- The data-driven physical-based equations discovery using evolutionary approach [77.34726150561087]
We describe an algorithm for discovering mathematical equations from given observational data.
The algorithm combines genetic programming with sparse regression.
It can be used to discover governing analytical equations as well as partial differential equations (PDEs).
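A minimal sketch of the sparse-regression half of such a pipeline (the genetic programming that proposes candidate terms is omitted; the library below is hand-chosen): sequentially thresholded least squares selects a few terms from a candidate library.

```python
# Illustrative sketch: sequentially thresholded least squares recovers a
# sparse equation dx/dt = -2*x + 0.5*x^3 from a library of candidate terms.
import numpy as np

rng = np.random.default_rng(8)
x = rng.uniform(-2, 2, 200)
dx = -2.0 * x + 0.5 * x**3              # "observed" derivative dx/dt

# Candidate library (proposed upstream by genetic programming in the paper).
lib = np.stack([np.ones_like(x), x, x**2, x**3], axis=1)
names = ["1", "x", "x^2", "x^3"]

w = np.linalg.lstsq(lib, dx, rcond=None)[0]
for _ in range(5):                      # threshold small terms, refit the rest
    small = np.abs(w) < 0.1
    w[small] = 0.0
    big = ~small
    w[big] = np.linalg.lstsq(lib[:, big], dx, rcond=None)[0]

print(dict(zip(names, np.round(w, 3))))  # recovers {-2: "x", 0.5: "x^3"}
```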
arXiv Detail & Related papers (2020-04-03T17:21:57Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance on a wide range of applications and datasets.
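A minimal sketch of the tensor-train ingredient in isolation (the convolutions and LSTM gating are omitted; shapes are assumptions): a window of past feature vectors is contracted through low-rank TT cores rather than one large dense weight tensor.

```python
# Illustrative sketch: combine a window of past feature vectors through
# low-rank tensor-train cores instead of a single dense weight tensor.
import numpy as np

rng = np.random.default_rng(9)
T_win, d, r = 4, 8, 3                   # window length, feature dim, TT rank
feats = rng.normal(size=(T_win, d))     # features from the last T_win steps

# One TT core per time step: shape (rank_in, d, rank_out).
cores = [rng.normal(size=(1 if i == 0 else r, d,
                          1 if i == T_win - 1 else r)) * 0.1
         for i in range(T_win)]

h = np.ones(1)                          # contract cores with features in order
for core, f in zip(cores, feats):
    h = h @ np.einsum("idj,d->ij", core, f)
print("aggregated feature:", h)         # low-rank summary of the window
```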
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.