On solutions of the distributional Bellman equation
- URL: http://arxiv.org/abs/2202.00081v3
- Date: Fri, 26 May 2023 11:54:28 GMT
- Title: On solutions of the distributional Bellman equation
- Authors: Julian Gerstenberg, Ralph Neininger, Denis Spiegel
- Abstract summary: We consider general distributional Bellman equations and study existence and uniqueness of their solutions as well as tail properties of return distributions.
We show that any solution of a distributional Bellman equation can be obtained as the vector of marginal laws of a solution to a multivariate affine distributional equation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In distributional reinforcement learning, not only expected returns but
the complete return distributions of a policy are taken into account. The return
distribution for a fixed policy is given as the solution of an associated
distributional Bellman equation. In this note we consider general
distributional Bellman equations and study existence and uniqueness of their
solutions as well as tail properties of return distributions. We give necessary
and sufficient conditions for existence and uniqueness of return distributions
and identify cases of regular variation. We link distributional Bellman
equations to multivariate affine distributional equations. We show that any
solution of a distributional Bellman equation can be obtained as the vector of
marginal laws of a solution to a multivariate affine distributional equation.
This makes the general theory of such equations applicable to the
distributional reinforcement learning setting.
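For orientation, the two objects mentioned in the abstract can be written in generic textbook notation; the symbols below ($G$ for the return, $R$ for rewards, $\gamma$ for the discount factor, $\mathbf{A}$, $\mathbf{B}$ for the random affine coefficients) are illustrative and not taken verbatim from the paper.

```latex
% Distributional Bellman equation for a fixed policy \pi (generic form):
\[
  G(s) \;\overset{d}{=}\; R(s, A) + \gamma\, G(S'),
  \qquad A \sim \pi(\cdot \mid s), \quad S' \sim p(\cdot \mid s, A),
\]
% where the return G(S') on the right-hand side is taken independent of (A, R(s, A)) given S'.
%
% Multivariate affine distributional equation (vector form over the state space):
\[
  X \;\overset{d}{=}\; \mathbf{A}\, X + \mathbf{B},
  \qquad X \in \mathbb{R}^d \ \text{independent of the random pair } (\mathbf{A}, \mathbf{B}),
\]
% whose vector of marginal laws, as the abstract states, recovers a solution of the
% distributional Bellman equation.
```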
Related papers
- Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing [55.791818510796645]
We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
arXiv Detail & Related papers (2024-10-08T12:26:48Z)
- LLQL: Logistic Likelihood Q-Learning for Reinforcement Learning [1.5734309088976395]
This study investigates the distribution of the Bellman approximation error through iterative exploration of the Bellman equation.
We propose using the Logistic maximum likelihood function (LLoss) as an alternative to the commonly used mean squared error (MSELoss), which implicitly assumes a Normal distribution for Bellman errors.
arXiv Detail & Related papers (2023-07-05T15:00:29Z)
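A minimal sketch of the LLQL idea summarized above: score Bellman errors with a logistic negative log-likelihood instead of MSE. This is a generic PyTorch-style formulation, not the authors' implementation; the scale parameter and its handling are assumptions.

```python
import math
import torch
import torch.nn.functional as F

def logistic_nll(bellman_error: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    """Mean negative log-likelihood of Bellman errors under a Logistic(0, scale) density.

    Replaces MSE (which corresponds to a Normal error model) with the logistic model:
        -log f(x) = x/s + log(s) + 2*log(1 + exp(-x/s)),
    written with softplus(-z) = log(1 + exp(-z)) for numerical stability.
    """
    z = bellman_error / scale
    return (z + math.log(scale) + 2.0 * F.softplus(-z)).mean()

# Usage sketch (names are placeholders): q and target_q are the current Q-estimate
# and its bootstrapped Bellman target.
# loss = logistic_nll(q - target_q.detach())
```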
- Distributional Reinforcement Learning with Dual Expectile-Quantile Regression [51.87411935256015]
The quantile regression approach to distributional RL provides a flexible and effective way of learning arbitrary return distributions.
We show that distributional guarantees vanish, and we empirically observe that the estimated distribution rapidly collapses to its mean estimate.
Motivated by the efficiency of $L$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning while keeping an estimate of the full distribution of returns.
arXiv Detail & Related papers (2023-05-26T12:30:05Z)
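The two regression targets named in the dual expectile-quantile summary above can be sketched generically with their standard losses; how the paper actually couples the two heads is not stated in the summary, so the joint objective shown in the usage comment is only an illustrative sum.

```python
import torch

def quantile_loss(pred: torch.Tensor, target: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    """Pinball loss for quantile regression: rho_tau(u) = max(tau*u, (tau-1)*u), u = target - pred."""
    u = target - pred
    return torch.maximum(tau * u, (tau - 1.0) * u).mean()

def expectile_loss(pred: torch.Tensor, target: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    """Asymmetric squared loss for expectile regression: |tau - 1{u<0}| * u^2, u = target - pred."""
    u = target - pred
    weight = torch.where(u < 0, 1.0 - tau, tau)
    return (weight * u.pow(2)).mean()

# Illustrative joint objective over a common grid of levels `taus` (an assumption, not the
# paper's exact scheme):
# loss = quantile_loss(quantile_head, targets, taus) + expectile_loss(expectile_head, targets, taus)
```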
- Policy Evaluation in Distributional LQR [70.63903506291383]
We provide a closed-form expression of the distribution of the random return.
We show that this distribution can be approximated by a finite number of random variables.
Using the approximate return distribution, we propose a zeroth-order policy gradient algorithm for risk-averse LQR.
arXiv Detail & Related papers (2023-03-23T20:27:40Z)
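The zeroth-order policy gradient mentioned in the distributional LQR summary above can be illustrated with a two-point finite-difference estimate; the objective `risk_averse_cost`, the perturbation scale, and the parameterization of the policy as a gain matrix are all assumptions made for this sketch, not the paper's algorithm.

```python
import numpy as np

def zeroth_order_gradient(objective, K: np.ndarray, sigma: float = 0.05,
                          n_samples: int = 32, seed: int = 0) -> np.ndarray:
    """Two-point finite-difference estimate of grad J(K) from function evaluations only.

    `objective` maps a policy gain matrix K (for a linear policy u_t = -K x_t) to a scalar,
    e.g. a risk-averse cost computed from the approximate return distribution.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(K)
    for _ in range(n_samples):
        u = rng.standard_normal(K.shape)
        delta = objective(K + sigma * u) - objective(K - sigma * u)
        grad += (delta / (2.0 * sigma)) * u
    return grad / n_samples

# Usage sketch with a hypothetical `risk_averse_cost` callable:
# K = K - learning_rate * zeroth_order_gradient(risk_averse_cost, K)
```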
- Domain Generalization by Functional Regression [3.209698860006188]
We study domain generalization as a problem of functional regression.
Our concept leads to a new algorithm for learning a linear operator from marginal distributions of inputs to the corresponding conditional distributions of outputs given inputs.
arXiv Detail & Related papers (2023-02-09T16:07:21Z)
- Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning [39.07307690074323]
We consider the problem of predicting the distribution of returns obtained by an agent interacting in a continuous-time environment.
Accurate return predictions have proven useful for determining optimal policies for risk-sensitive control, state representations, multiagent coordination, and more.
We propose a tractable algorithm for approximately solving the distributional HJB based on a JKO scheme, which can be implemented in an online control algorithm.
arXiv Detail & Related papers (2022-05-24T16:33:54Z)
- Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry, and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
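The "implicit moment matching" reading in the summary above corresponds to minimizing a kernel distance between predicted return samples and Bellman-target samples. The sketch below uses a Gaussian-kernel MMD in generic form; it is not the authors' exact estimator or kernel choice.

```python
import torch

def gaussian_kernel(x: torch.Tensor, y: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """Gram matrix k(x_i, y_j) = exp(-(x_i - y_j)^2 / (2 * bandwidth^2)) for 1-D sample vectors."""
    d2 = (x.unsqueeze(1) - y.unsqueeze(0)).pow(2)
    return torch.exp(-d2 / (2.0 * bandwidth ** 2))

def mmd2(pred_samples: torch.Tensor, target_samples: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """Biased estimate of the squared MMD between predicted and Bellman-target return samples."""
    k_pp = gaussian_kernel(pred_samples, pred_samples, bandwidth).mean()
    k_tt = gaussian_kernel(target_samples, target_samples, bandwidth).mean()
    k_pt = gaussian_kernel(pred_samples, target_samples, bandwidth).mean()
    return k_pp + k_tt - 2.0 * k_pt

# Usage sketch (names are placeholders):
# pred = net(state); target = reward + gamma * net(next_state).detach()
# loss = mmd2(pred, target)
```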
- A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms [67.67377846416106]
We present a distributional approach to the theoretical analysis of reinforcement learning algorithms with constant step-sizes.
We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions.
arXiv Detail & Related papers (2020-03-27T05:13:29Z)
- Exploring Maximum Entropy Distributions with Evolutionary Algorithms [0.0]
We show how to numerically evolve maximum entropy probability distributions for a given set of constraints.
An evolutionary algorithm can obtain approximations to some well-known analytical results.
We explain why many of the distributions are symmetrical and continuous, but some are not.
arXiv Detail & Related papers (2020-02-05T19:52:05Z)
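As a toy illustration of the approach summarized in the maximum-entropy paper above, one can evolve a discrete probability vector to maximize entropy subject to a constraint (here a fixed mean, enforced by a penalty). The representation, mutation rule, and penalty weight are all assumptions made for this sketch.

```python
import numpy as np

def fitness(p: np.ndarray, support: np.ndarray, target_mean: float, penalty: float = 100.0) -> float:
    """Entropy of p minus a quadratic penalty for violating the mean constraint E[X] = target_mean."""
    entropy = -np.sum(p * np.log(p + 1e-12))
    return entropy - penalty * (np.dot(p, support) - target_mean) ** 2

def evolve_max_entropy(support: np.ndarray, target_mean: float, pop_size: int = 50,
                       generations: int = 2000, sigma: float = 0.02, seed: int = 0) -> np.ndarray:
    """(mu + lambda)-style evolution of probability vectors on a fixed finite support."""
    rng = np.random.default_rng(seed)
    pop = rng.dirichlet(np.ones(len(support)), size=pop_size)
    for _ in range(generations):
        # Mutate: add Gaussian noise, clip to stay nonnegative, renormalize to a distribution.
        children = np.clip(pop + sigma * rng.standard_normal(pop.shape), 1e-12, None)
        children /= children.sum(axis=1, keepdims=True)
        pool = np.vstack([pop, children])
        scores = np.array([fitness(p, support, target_mean) for p in pool])
        pop = pool[np.argsort(scores)[-pop_size:]]  # keep the fittest half
    return pop[-1]

# Example: on support {0, ..., 9} with target mean 3, the evolved distribution should
# approximate the constrained maximum-entropy (truncated geometric-like) solution.
# p = evolve_max_entropy(np.arange(10, dtype=float), target_mean=3.0)
```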
This list is automatically generated from the titles and abstracts of the papers in this site.