IGN : Implicit Generative Networks
- URL: http://arxiv.org/abs/2206.05860v1
- Date: Mon, 13 Jun 2022 00:02:23 GMT
- Title: IGN : Implicit Generative Networks
- Authors: Haozheng Luo, Tianyi Wu, Feiyu Han, Zhijun Yan, Jianfen Zhang
- Abstract summary: We build on recent advances in distributional reinforcement learning to give a state-of-the-art distributional variant of the model based on IQN.
We demonstrate improved performance on our baseline benchmark of 57 Atari 2600 games in the ALE.
We also use our algorithm to show state-of-the-art training performance of risk-sensitive policies in Atari games, in both policy optimization and evaluation.
- Score: 5.394800220750409
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we build on recent advances in distributional reinforcement
learning to give a state-of-the-art distributional variant of the model based on
IQN. We achieve this by combining the GAN model's generator and discriminator
functions with quantile regression to approximate the full quantile values of
the state-action return distribution. We demonstrate improved performance on
our baseline benchmark of 57 Atari 2600 games in the ALE. We also use our
algorithm to show state-of-the-art training performance of risk-sensitive
policies in Atari games, in both policy optimization and evaluation.
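As a concrete illustration of the method the abstract sketches, the following is a minimal PyTorch sketch of a generator network trained with the IQN-style quantile Huber loss to approximate the quantile function of the state-action return distribution. Everything here is an assumption for illustration: the `QuantileGenerator` architecture, shapes, and hyperparameters are not the authors' implementation, and the discriminator mentioned in the abstract is omitted because the excerpt gives no detail on how it is trained.

```python
import torch
import torch.nn as nn

class QuantileGenerator(nn.Module):
    """Hypothetical generator: maps a state and sampled quantile fractions
    tau to quantile values of the state-action return distribution."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.state_net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.tau_net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, state: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
        # state: (B, state_dim), tau: (B, N, 1) -> returns (B, N, num_actions)
        s = self.state_net(state).unsqueeze(1)  # (B, 1, hidden)
        t = self.tau_net(tau)                   # (B, N, hidden)
        return self.head(s * t)

def quantile_huber_loss(pred, target, tau, kappa=1.0):
    """IQN-style quantile Huber regression loss.
    pred:   (B, N) quantile values predicted at fractions tau
    target: (B, M) target return samples (e.g. bootstrapped from a target net)
    tau:    (B, N) the sampled quantile fractions
    """
    u = target.unsqueeze(1) - pred.unsqueeze(2)  # pairwise TD errors, (B, N, M)
    huber = torch.where(u.abs() <= kappa,
                        0.5 * u.pow(2),
                        kappa * (u.abs() - 0.5 * kappa))
    # The asymmetric weight |tau - 1{u < 0}| turns Huber regression into
    # quantile regression at each sampled fraction.
    weight = (tau.unsqueeze(2) - (u.detach() < 0).float()).abs()
    return (weight * huber / kappa).sum(1).mean()
```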
Related papers
- PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods [0.0]
We introduce PG-Rainbow, a novel algorithm that combines a distributional reinforcement learning framework with a policy gradient algorithm.
We show empirically that by integrating reward-distribution information into the policy network, the policy agent acquires enhanced capabilities.
arXiv Detail & Related papers (2024-07-18T04:18:52Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks.
We study the problem from a model-based Bayesian reinforcement learning perspective.
We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
arXiv Detail & Related papers (2023-08-12T14:59:19Z)
- Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer in learning representations for long-tailed distributions, achieving better performance than existing approaches across multiple datasets.
arXiv Detail & Related papers (2021-06-17T11:41:30Z)
- Bayesian Distributional Policy Gradients [2.28438857884398]
Distributional Reinforcement Learning maintains the entire probability distribution of the reward-to-go, i.e. the return.
Bayesian Distributional Policy Gradients (BDPG) uses adversarial training in joint-contrastive learning to estimate a variational posterior from the returns.
arXiv Detail & Related papers (2021-03-20T23:42:50Z)
- Munchausen Reinforcement Learning [50.396037940989146]
Bootstrapping is a core mechanism in Reinforcement Learning (RL).
We show that slightly modifying Deep Q-Network (DQN) in that way provides an agent that is competitive with distributional methods on Atari games.
We provide strong theoretical insights into what happens under the hood: implicit Kullback-Leibler regularization and an increase of the action-gap.
arXiv Detail & Related papers (2020-07-28T18:30:23Z)
- Implicit Distributional Reinforcement Learning [61.166030238490634]
We propose the implicit distributional actor-critic (IDAC), built on two deep generator networks (DGNs) and a semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe that IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)
- Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples [67.11669996924671]
We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as 'least realistic'.
We show that this 'top-k update' procedure is a generally applicable improvement (see the sketch after this list).
arXiv Detail & Related papers (2020-02-14T19:27:50Z)
- Sample-based Distributional Policy Gradient [14.498314462218394]
We propose the sample-based distributional policy gradient (SDPG) algorithm for continuous action space control settings.
We apply SDPG and D4PG to multiple OpenAI Gym environments and observe that our algorithm shows better sample efficiency as well as higher reward on most tasks.
arXiv Detail & Related papers (2020-01-08T17:50:23Z)
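The Top-k Training entry above is concrete enough to sketch: when updating the generator, the gradient contributions from the batch elements the critic scores as least realistic are zeroed out. Below is a minimal sketch under assumptions, using a standard non-saturating GAN generator loss; `generator`, `critic`, `z_dim`, and the choice of `k` are hypothetical names and settings, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def topk_generator_loss(critic_scores: torch.Tensor, k: int) -> torch.Tensor:
    """Non-saturating generator loss computed only on the k generated
    samples the critic scores as most realistic; the discarded samples
    contribute exactly zero gradient to the generator update.

    critic_scores: (B,) critic logits for a batch of generated samples.
    """
    topk_scores, _ = torch.topk(critic_scores, k)
    return F.softplus(-topk_scores).mean()  # softplus(-x) == -log(sigmoid(x))

# Illustrative training step (generator, critic, z_dim, batch_size assumed):
#   z = torch.randn(batch_size, z_dim)
#   fake = generator(z)
#   loss_g = topk_generator_loss(critic(fake).squeeze(1), k=batch_size // 2)
#   loss_g.backward()
```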