Related papers: Generative Temporal Difference Learning for Infinite-Horizon Prediction

Generative Temporal Difference Learning for Infinite-Horizon Prediction

URL: http://arxiv.org/abs/2010.14496v4
Date: Mon, 29 Nov 2021 00:51:39 GMT
Title: Generative Temporal Difference Learning for Infinite-Horizon Prediction
Authors: Michael Janner, Igor Mordatch, Sergey Levine
Abstract summary: We introduce the $gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
Score: 101.59882753763888
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $\gamma$-models leads to generalizations of the procedures central to model-based control, including the model rollout and model-based value estimation. The $\gamma$-model, trained with a generative reinterpretation of temporal difference learning, is a natural continuous analogue of the successor representation and a hybrid between model-free and model-based mechanisms. Like a value function, it contains information about the long-term future; like a standard predictive model, it is independent of task reward. We instantiate the $\gamma$-model as both a generative adversarial network and normalizing flow, discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors, and empirically investigate its utility for prediction and control.

Related papers

A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios [11.141688859736805]
We introduce a machine learning model for credit risk by combining tree-boosting with a latent-temporal and Gaussian process model accounting for frailty correlation.<n>We find that both predictive default probabilities for individual loans and predictive loan portfolio loss distributions are more accurate compared to conventional independent linear hazard models.
arXiv Detail & Related papers (2024-10-03T15:10:55Z)
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL [50.385005413810084]
Dyna-style model-based reinforcement learning contains two phases: model rollouts to generate sample for policy learning and real environment exploration. $textttCOPlanner$ is a planning-driven framework for model-based methods to address the inaccurately learned dynamics model problem.
arXiv Detail & Related papers (2023-10-11T06:10:07Z)
EAMDrift: An interpretable self retrain model for time series [0.0]
We present EAMDrift, a novel method that combines forecasts from multiple individual predictors by weighting each prediction according to a performance metric. EAMDrift is designed to automatically adapt to out-of-distribution patterns in data and identify the most appropriate models to use at each moment. Our study on real-world datasets shows that EAMDrift outperforms individual baseline models by 20% and achieves comparable accuracy results to non-interpretable ensemble models.
arXiv Detail & Related papers (2023-05-31T13:25:26Z)
Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA) Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space. We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models [2.7391842773173334]
We develop a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters. Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model.
arXiv Detail & Related papers (2022-11-23T17:42:53Z)
Domain-aware Control-oriented Neural Models for Autonomous Underwater Vehicles [2.4779082385578337]
We present control-oriented parametric models with varying levels of domain-awareness. We employ universal differential equations to construct data-driven blackbox and graybox representations of the AUV dynamics.
arXiv Detail & Related papers (2022-08-15T17:01:14Z)
Stochastic Parameterizations: Better Modelling of Temporal Correlations using Probabilistic Machine Learning [1.5293427903448025]
We show that by using a physically-informed recurrent neural network within a probabilistic framework, our model for the 96 atmospheric simulation is competitive. This is due to a superior ability to model temporal correlations compared to standard first-order autoregressive schemes. We evaluate across a number of metrics from the literature, but also discuss how the probabilistic metric of likelihood may be a unifying choice for future climate models.
arXiv Detail & Related papers (2022-03-28T14:51:42Z)
Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow [14.422129911404472]
Bellman aims to fill this gap and introduces the first thoroughly designed and tested model-based RL toolbox. Our modular approach enables to combine a wide range of environment models with generic model-based agent classes that recover state-of-the-art algorithms.
arXiv Detail & Related papers (2021-03-26T11:32:27Z)
Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series. Our model parameterizes mean and variance for each time-stamp with flexible neural networks. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents. One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.