Domain Adaptation In Reinforcement Learning Via Latent Unified State
Representation
- URL: http://arxiv.org/abs/2102.05714v1
- Date: Wed, 10 Feb 2021 19:38:14 GMT
- Title: Domain Adaptation In Reinforcement Learning Via Latent Unified State
Representation
- Authors: Jinwei Xing, Takashi Nagata, Kexin Chen, Xinyun Zou, Emre Neftci,
Jeffrey L. Krichmar
- Abstract summary: We propose a two-stage RL agent that first learns a latent unified state representation (LUSR) consistent across multiple domains, and then performs RL training in one source domain based on the LUSR in the second stage.
The cross-domain consistency of LUSR allows the policy acquired from the source domain to generalize to other target domains without extra training.
Our results show that this approach achieves state-of-the-art domain adaptation performance in related RL tasks and outperforms prior latent-representation-based RL and image-to-image translation approaches.
- Score: 1.435381256004719
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent success of deep reinforcement learning (RL), domain
adaptation remains an open problem. Although the generalization ability of RL
agents is critical for the real-world applicability of Deep RL, zero-shot
policy transfer is still a challenging problem since even minor visual changes
could make the trained agent completely fail in the new task. To address this
issue, we propose a two-stage RL agent that first learns a latent unified state
representation (LUSR) consistent across multiple domains, and then performs RL
training in one source domain based on the LUSR in the second stage. The
cross-domain consistency of LUSR allows the policy acquired from the
source domain to generalize to other target domains without extra training. We
first demonstrate our approach in variants of CarRacing games with customized
manipulations, and then verify it in CARLA, an autonomous driving simulator
with more complex and realistic visual observations. Our results show that this
approach can achieve state-of-the-art domain adaptation performance in related
RL tasks and outperforms prior latent-representation-based RL and
image-to-image translation approaches.
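The two-stage pipeline described above can be sketched end to end. This toy sketch is ours, not the paper's code: the per-domain visual shift, the hard-coded encoder, and the random policy weights are all hypothetical stand-ins (in the paper the encoder is learned from multi-domain data in stage one and the policy via standard RL in the source domain). It only illustrates why a domain-consistent latent makes the source policy transfer zero-shot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two "domains" render the same underlying 4-dim state with a
# fixed per-domain visual shift -- a hypothetical stand-in for the
# colour/background changes in the CarRacing variants.
SHIFT = {"source": 0.0, "target": 3.0}

def observe(state, domain):
    """Render the underlying state as a domain-specific observation."""
    return state + SHIFT[domain]

def lusr_encode(obs, domain):
    """Stage 1 (sketch): map observations to a latent unified state
    representation. We hard-code the ideal outcome of stage-one training:
    an encoder that strips the domain-specific component."""
    return obs - SHIFT[domain]

# Stage 2 (sketch): a linear "policy" over latents, with random weights
# standing in for RL training done only in the source domain.
policy_w = rng.normal(size=4)

def act(obs, domain):
    return float(lusr_encode(obs, domain) @ policy_w)

state = rng.normal(size=4)
a_src = act(observe(state, "source"), "source")
a_tgt = act(observe(state, "target"), "target")

# Cross-domain consistency of the latent gives zero-shot transfer: the
# source-trained policy produces the same action in both domains.
assert abs(a_src - a_tgt) < 1e-9
```

Because both domains map to the same latent, no target-domain interaction is needed; the interesting work, which this sketch elides, is learning such an encoder from raw pixels.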
Related papers
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
It is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies in reinforcement learning (RL).
In this paper, we consider dynamics adaptation settings where there is a dynamics mismatch between the source domain and the target domain.
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain.
arXiv Detail & Related papers (2024-05-24T09:06:12Z) - Bridging the Reality Gap of Reinforcement Learning based Traffic Signal
Control using Domain Randomization and Meta Learning [0.7614628596146599]
We present a comprehensive analysis of potential simulation parameters that contribute to this reality gap.
We then examine two promising strategies that can bridge this gap: Domain Randomization (DR) and Model-Agnostic Meta-Learning (MAML).
Our experimental results show that both DR and MAML outperform a state-of-the-art RL algorithm.
arXiv Detail & Related papers (2023-07-21T05:17:21Z) - Human-Timescale Adaptation in an Open-Ended Task Space [56.55530165036327]
We show that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans.
Our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.
arXiv Detail & Related papers (2023-01-18T15:39:21Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that addresses this, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Unified State Representation Learning under Data Augmentation [8.904143080467348]
Generalization of reinforcement learning agents is critical to success in the real world.
We propose USRA: Unified State Representation Learning under Data Augmentation.
We find that USRA achieves higher sample efficiency and 14.3% better domain adaptation performance compared to the best baseline results.
arXiv Detail & Related papers (2022-09-12T15:10:28Z) - Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z) - POAR: Efficient Policy Optimization via Online Abstract State
Representation Learning [6.171331561029968]
State Representation Learning (SRL) is proposed to specifically learn to encode task-relevant features from complex sensory data into low-dimensional states.
We introduce a new SRL prior called domain resemblance to leverage expert demonstrations to improve SRL interpretations.
We empirically verify that POAR efficiently handles tasks in high dimensions and facilitates training real-life robots directly from scratch.
arXiv Detail & Related papers (2021-09-17T16:52:03Z) - Domain Adversarial Reinforcement Learning [37.21155002604856]
We consider the problem of generalization in reinforcement learning where visual aspects of the observations might differ.
The performance of the agent is then reported on new unknown test domains drawn from the MDP distribution.
We empirically show that this approach allows achieving a significant generalization improvement to new unseen domains.
arXiv Detail & Related papers (2021-02-14T07:58:41Z) - Off-Dynamics Reinforcement Learning: Training for Transfer with Domain
Classifiers [138.68213707587822]
We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning.
We show that we can achieve this goal by compensating for the difference in dynamics by modifying the reward function.
Our approach is applicable to domains with continuous states and actions and does not require learning an explicit model of the dynamics.
arXiv Detail & Related papers (2020-06-24T17:47:37Z) - Learn to Interpret Atari Agents [106.21468537372995]
Our proposed agent, region-sensitive Rainbow (RS-Rainbow), is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.