Off-Dynamics Reinforcement Learning: Training for Transfer with Domain
Classifiers
- URL: http://arxiv.org/abs/2006.13916v2
- Date: Wed, 14 Apr 2021 23:38:31 GMT
- Title: Off-Dynamics Reinforcement Learning: Training for Transfer with Domain
Classifiers
- Authors: Benjamin Eysenbach, Swapnil Asawa, Shreyas Chaudhari, Sergey Levine,
Ruslan Salakhutdinov
- Abstract summary: We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning.
We show that we can achieve this goal by compensating for the difference in dynamics by modifying the reward function.
Our approach is applicable to domains with continuous states and actions and does not require learning an explicit model of the dynamics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a simple, practical, and intuitive approach for domain adaptation
in reinforcement learning. Our approach stems from the idea that the agent's
experience in the source domain should look similar to its experience in the
target domain. Building off of a probabilistic view of RL, we formally show
that we can achieve this goal by compensating for the difference in dynamics by
modifying the reward function. This modified reward function is simple to
estimate by learning auxiliary classifiers that distinguish source-domain
transitions from target-domain transitions. Intuitively, the modified reward
function penalizes the agent for visiting states and taking actions in the
source domain which are not possible in the target domain. Said another way,
the agent is penalized for transitions that would indicate that the agent is
interacting with the source domain, rather than the target domain. Our approach
is applicable to domains with continuous states and actions and does not
require learning an explicit model of the dynamics. On discrete and continuous
control tasks, we illustrate the mechanics of our approach and demonstrate its
scalability to high-dimensional tasks.
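The classifier-based reward modification described in the abstract can be illustrated with a small numerical sketch. It relies only on Bayes' rule: the log-ratio of target to source dynamics, log p_target(s'|s,a) - log p_source(s'|s,a), equals the logit of a classifier that sees (s, a, s') minus the logit of a classifier that sees only (s, a). The function names, the toy two-outcome dynamics, and the domain prior below are assumptions made for illustration and are not taken from the paper's code; in practice the two probabilities would come from learned classifiers rather than exact posteriors.

```python
import numpy as np


def logit(p, eps=1e-8):
    """Log-odds of a probability, clipped for numerical safety."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p) - np.log1p(-p)


def reward_correction(p_target_sas, p_target_sa):
    """Estimate log p_target(s'|s,a) - log p_source(s'|s,a).

    p_target_sas: classifier probability that (s, a, s') came from the target domain.
    p_target_sa:  classifier probability that (s, a) came from the target domain.
    """
    return logit(p_target_sas) - logit(p_target_sa)


def exact_posteriors(p_src, p_tgt, prior_target):
    """Hypothetical helper: the outputs of ideal classifiers for one fixed (s, a).

    p_src, p_tgt: next-state distributions under source and target dynamics.
    prior_target: fraction of (s, a) samples that come from the target domain.
    """
    joint_tgt = prior_target * p_tgt
    joint_src = (1.0 - prior_target) * p_src
    p_target_sas = joint_tgt / (joint_tgt + joint_src)
    p_target_sa = np.full_like(p_target_sas, prior_target)
    return p_target_sas, p_target_sa


# Toy example (assumed values): one (s, a) pair with two possible next states.
p_src = np.array([0.5, 0.5])  # source dynamics
p_tgt = np.array([0.9, 0.1])  # target dynamics
p_sas, p_sa = exact_posteriors(p_src, p_tgt, prior_target=0.3)
delta_r = reward_correction(p_sas, p_sa)

# The correction is added to the source-domain reward: r_modified = r + delta_r.
print(delta_r)
```

In this toy case the second next state is five times more likely under the source dynamics than under the target dynamics, so its correction is negative. This matches the abstract's description of penalizing transitions that indicate the agent is interacting with the source domain. The domain prior cancels out of the result, which is why the (s, a) classifier is subtracted.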
Related papers
- xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing [21.37585797507323]
Cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning.
We propose the Cross-Domain Trajectory EDiting framework that employs a specially designed diffusion model for cross-domain trajectory adaptation.
Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data.
arXiv Detail & Related papers (2024-09-13T10:07:28Z)
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
In reinforcement learning (RL), it is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies.
In this paper, we consider dynamics adaptation settings where there exists dynamics mismatch between the source domain and the target domain.
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain.
arXiv Detail & Related papers (2024-05-24T09:06:12Z)
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Meta-causal Learning for Single Domain Generalization [102.53303707563612]
Single domain generalization aims to learn a model from a single training domain (source domain) and apply it to multiple unseen test domains (target domains).
Existing methods focus on expanding the distribution of the training domain to cover the target domains, but without estimating the domain shift between the source and target domains.
We propose a new learning paradigm, namely simulate-analyze-reduce, which first simulates the domain shift by building an auxiliary domain as the target domain, then learns to analyze the causes of domain shift, and finally learns to reduce the domain shift for model adaptation.
arXiv Detail & Related papers (2023-04-07T15:46:38Z)
- Variational Transfer Learning using Cross-Domain Latent Modulation [1.9662978733004601]
We introduce a novel cross-domain latent modulation mechanism to a variational autoencoder framework so as to achieve effective transfer learning.
Deep representations of the source and target domains are first extracted by a unified inference model and aligned by employing gradient reversal.
The learned deep representations are then cross-modulated to the latent encoding of the alternative domain, where consistency constraints are also applied.
arXiv Detail & Related papers (2022-05-31T03:47:08Z)
- Multilevel Knowledge Transfer for Cross-Domain Object Detection [26.105283273950942]
Domain shift is a well-known problem where a model trained on a particular domain (source) does not perform well when exposed to samples from a different domain (target).
In this work, we address the domain shift problem for the object detection task.
Our approach relies on gradually removing the domain shift between the source and the target domains.
arXiv Detail & Related papers (2021-08-02T15:24:40Z)
- AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
- Interventional Domain Adaptation [81.0692660794765]
Domain adaptation (DA) aims to transfer discriminative features learned from a source domain to a target domain.
Standard domain-invariance learning suffers from spurious correlations and incorrectly transfers the source-specifics.
We create counterfactual features that distinguish the domain-specifics from the domain-sharable part.
arXiv Detail & Related papers (2020-11-07T09:53:13Z)
- Contradistinguisher: A Vapnik's Imperative to Unsupervised Domain Adaptation [7.538482310185133]
We propose a model referred to as Contradistinguisher that learns contrastive features and whose objective is to jointly learn to contradistinguish the unlabeled target domain in an unsupervised way.
We achieve state-of-the-art results on the Office-31 and VisDA-2017 datasets in both single-source and multi-source settings.
arXiv Detail & Related papers (2020-05-25T19:54:38Z)