Domain Adaptation In Reinforcement Learning Via Latent Unified State
Representation
- URL: http://arxiv.org/abs/2102.05714v1
- Date: Wed, 10 Feb 2021 19:38:14 GMT
- Title: Domain Adaptation In Reinforcement Learning Via Latent Unified State
Representation
- Authors: Jinwei Xing, Takashi Nagata, Kexin Chen, Xinyun Zou, Emre Neftci,
Jeffrey L. Krichmar
- Abstract summary: We propose a two-stage RL agent that first learns a latent unified state representation (LUSR) consistent across multiple domains, and then performs RL training in one source domain based on the LUSR in the second stage.
The cross-domain consistency of LUSR allows the policy acquired from the source domain to generalize to other target domains without extra training.
Our results show that this approach achieves state-of-the-art domain adaptation performance in related RL tasks and outperforms prior latent-representation-based RL and image-to-image translation approaches.
- Score: 1.435381256004719
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent success of deep reinforcement learning (RL), domain
adaptation remains an open problem. Although the generalization ability of RL
agents is critical for the real-world applicability of Deep RL, zero-shot
policy transfer is still a challenging problem since even minor visual changes
could make the trained agent completely fail in the new task. To address this
issue, we propose a two-stage RL agent that first learns a latent unified state
representation (LUSR) consistent across multiple domains, and then performs RL
training in one source domain based on the LUSR in the second stage. The
cross-domain consistency of LUSR allows the policy acquired from the
source domain to generalize to other target domains without extra training. We
first demonstrate our approach in variants of CarRacing games with customized
manipulations, and then verify it in CARLA, an autonomous driving simulator
with more complex and realistic visual observations. Our results show that this
approach can achieve state-of-the-art domain adaptation performance in related
RL tasks and outperforms prior latent-representation-based RL and
image-to-image translation approaches.
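The two-stage pipeline described above can be sketched end to end. This toy sketch is ours, not the paper's code: the per-domain visual shift, the hard-coded encoder, and the random policy weights are all hypothetical stand-ins (in the paper the encoder is learned from multi-domain data in stage one and the policy via standard RL in the source domain). It only illustrates why a domain-consistent latent makes the source policy transfer zero-shot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two "domains" render the same underlying 4-dim state with a
# fixed per-domain visual shift -- a hypothetical stand-in for the
# colour/background changes in the CarRacing variants.
SHIFT = {"source": 0.0, "target": 3.0}

def observe(state, domain):
    """Render the underlying state as a domain-specific observation."""
    return state + SHIFT[domain]

def lusr_encode(obs, domain):
    """Stage 1 (sketch): map observations to a latent unified state
    representation. We hard-code the ideal outcome of stage-one training:
    an encoder that strips the domain-specific component."""
    return obs - SHIFT[domain]

# Stage 2 (sketch): a linear "policy" over latents, with random weights
# standing in for RL training done only in the source domain.
policy_w = rng.normal(size=4)

def act(obs, domain):
    return float(lusr_encode(obs, domain) @ policy_w)

state = rng.normal(size=4)
a_src = act(observe(state, "source"), "source")
a_tgt = act(observe(state, "target"), "target")

# Cross-domain consistency of the latent gives zero-shot transfer: the
# source-trained policy produces the same action in both domains.
assert abs(a_src - a_tgt) < 1e-9
```

Because both domains map to the same latent, no target-domain interaction is needed; the interesting work, which this sketch elides, is learning such an encoder from raw pixels.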
Related papers
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
It is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies in reinforcement learning (RL).
In this paper, we consider dynamics adaptation settings where there is a dynamics mismatch between the source domain and the target domain.
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain.
arXiv Detail & Related papers (2024-05-24T09:06:12Z) - Bridging the Reality Gap of Reinforcement Learning based Traffic Signal
Control using Domain Randomization and Meta Learning [0.7614628596146599]
We present a comprehensive analysis of potential simulation parameters that contribute to this reality gap.
We then examine two promising strategies that can bridge this gap: Domain Randomization (DR) and Model-Agnostic Meta-Learning (MAML).
Our experimental results show that both DR and MAML outperform a state-of-the-art RL algorithm.
arXiv Detail & Related papers (2023-07-21T05:17:21Z) - Human-Timescale Adaptation in an Open-Ended Task Space [56.55530165036327]
We show that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans.
Our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.
arXiv Detail & Related papers (2023-01-18T15:39:21Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that addresses this, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Unified State Representation Learning under Data Augmentation [8.904143080467348]
Generalization of reinforcement learning agents is critical to success in the real world.
We propose USRA: Unified State Representation Learning under Data Augmentation.
We find that USRA achieves higher sample efficiency and 14.3% better domain adaptation performance compared to the best baseline results.
arXiv Detail & Related papers (2022-09-12T15:10:28Z) - Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z) - POAR: Efficient Policy Optimization via Online Abstract State
Representation Learning [6.171331561029968]
State Representation Learning (SRL) is proposed to specifically learn to encode task-relevant features from complex sensory data into low-dimensional states.
We introduce a new SRL prior called domain resemblance to leverage expert demonstrations to improve SRL interpretations.
We empirically verify that POAR efficiently handles tasks in high dimensions and facilitates training real-life robots directly from scratch.
arXiv Detail & Related papers (2021-09-17T16:52:03Z) - Domain Adversarial Reinforcement Learning [37.21155002604856]
We consider the problem of generalization in reinforcement learning where visual aspects of the observations might differ.
The performance of the agent is then reported on new unknown test domains drawn from the MDP distribution.
We empirically show that this approach allows achieving a significant generalization improvement to new unseen domains.
arXiv Detail & Related papers (2021-02-14T07:58:41Z) - Off-Dynamics Reinforcement Learning: Training for Transfer with Domain
Classifiers [138.68213707587822]
We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning.
We show that we can achieve this goal by compensating for the difference in dynamics by modifying the reward function.
Our approach is applicable to domains with continuous states and actions and does not require learning an explicit model of the dynamics.
arXiv Detail & Related papers (2020-06-24T17:47:37Z) - Learn to Interpret Atari Agents [106.21468537372995]
Our proposed agent, region-sensitive Rainbow (RS-Rainbow), is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.