Cross Domain Policy Transfer with Effect Cycle-Consistency
- URL: http://arxiv.org/abs/2403.02018v1
- Date: Mon, 4 Mar 2024 13:20:07 GMT
- Title: Cross Domain Policy Transfer with Effect Cycle-Consistency
- Authors: Ruiqi Zhu, Tianhong Dai, Oya Celiktutan
- Abstract summary: Training a robotic policy from scratch using deep reinforcement learning methods can be prohibitively expensive due to sample inefficiency.
We propose a novel approach for learning the mapping functions between state and action spaces across domains using unpaired data.
Our approach has been tested on three locomotion tasks and two robotic manipulation tasks.
- Score: 3.3213136251955815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training a robotic policy from scratch using deep reinforcement learning
methods can be prohibitively expensive due to sample inefficiency. To address
this challenge, transferring policies trained in the source domain to the
target domain becomes an attractive paradigm. Previous research has typically
focused on domains with similar state and action spaces but differing in other
aspects. In this paper, we focus on domains with different state and action
spaces, a setting with broader practical implications, e.g., transferring a
policy from robot A to robot B. Unlike prior methods that rely on paired data,
we propose a novel approach for learning the mapping functions between state
and action spaces across domains using unpaired data. We propose effect cycle
consistency, which aligns the effects of transitions across two domains through
a symmetrical optimization structure for learning these mapping functions. Once
the mapping functions are learned, we can seamlessly transfer the policy from
the source domain to the target domain. Our approach has been tested on three
locomotion tasks and two robotic manipulation tasks. The empirical results
demonstrate that our method can reduce alignment errors significantly and
achieve better performance compared to the state-of-the-art method.
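The abstract does not give the objective in closed form, so the following is a minimal PyTorch sketch of one way transition effects could be aligned symmetrically across domains. The network names (phi/phi_inv state maps, psi/psi_inv action maps, f_src/f_tgt learned dynamics) and the squared-error form are assumptions, not the paper's exact formulation.
```python
# Illustrative sketch only: loss form and module names are assumptions.
import torch
import torch.nn as nn

def mlp(inp, out, hidden=256):
    return nn.Sequential(nn.Linear(inp, hidden), nn.ReLU(),
                         nn.Linear(hidden, out))

class EffectCycleAligner(nn.Module):
    def __init__(self, s_src, a_src, s_tgt, a_tgt):
        super().__init__()
        self.phi = mlp(s_src, s_tgt)              # state map: source -> target
        self.phi_inv = mlp(s_tgt, s_src)          # state map: target -> source
        self.psi = mlp(s_src + a_src, a_tgt)      # action map: source -> target
        self.psi_inv = mlp(s_tgt + a_tgt, a_src)  # action map: target -> source
        self.f_tgt = mlp(s_tgt + a_tgt, s_tgt)    # learned target dynamics
        self.f_src = mlp(s_src + a_src, s_src)    # learned source dynamics

    def loss(self, src_batch, tgt_batch):
        s, a, s_next = src_batch                  # unpaired source transitions
        t, b, t_next = tgt_batch                  # unpaired target transitions

        # Source -> target: mapping a source transition into the target
        # domain and rolling it through target dynamics should land on the
        # mapped next state, i.e. the transition "effects" should agree.
        a_map = self.psi(torch.cat([s, a], -1))
        pred = self.f_tgt(torch.cat([self.phi(s), a_map], -1))
        fwd = ((pred - self.phi(s_next)) ** 2).mean()

        # Target -> source: the symmetric term of the optimization structure.
        b_map = self.psi_inv(torch.cat([t, b], -1))
        pred = self.f_src(torch.cat([self.phi_inv(t), b_map], -1))
        bwd = ((pred - self.phi_inv(t_next)) ** 2).mean()
        return fwd + bwd
```
Under this sketch, a target observation t would be mapped back via phi_inv(t), fed to the frozen source policy, and the chosen source action mapped forward with psi, matching the abstract's policy-transfer step.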
Related papers
- xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing [21.37585797507323]
Cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning.
We propose the Cross-Domain Trajectory EDiting framework that employs a specially designed diffusion model for cross-domain trajectory adaptation.
Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data.
arXiv Detail & Related papers (2024-09-13T10:07:28Z) - Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
It is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies in reinforcement learning (RL).
In this paper, we consider dynamics adaptation settings where there exists dynamics mismatch between the source domain and the target domain.
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain.
arXiv Detail & Related papers (2024-05-24T09:06:12Z) - A Framework for Few-Shot Policy Transfer through Observation Mapping and
Behavior Cloning [6.048526012097133]
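A minimal sketch of the representation-deviation idea above, assuming the deviation is measured via the prediction error of a next-state head on a target-trained encoder; the concrete estimator is an assumption.
```python
# Hypothetical sketch: learn representations on target data only, then
# score source transitions by how far they deviate.
import torch
import torch.nn as nn

class TransitionEncoder(nn.Module):
    def __init__(self, s_dim, a_dim, z_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(s_dim + a_dim, 128), nn.ReLU(),
                                 nn.Linear(128, z_dim))
        self.pred = nn.Linear(z_dim, s_dim)   # predict next state from z

def target_repr_loss(model, s, a, s_next):
    """Representation learning performed on target transitions only."""
    z = model.enc(torch.cat([s, a], -1))
    return ((model.pred(z) - s_next) ** 2).mean()

@torch.no_grad()
def deviation_score(model, s, a, s_next):
    """Per-transition mismatch score for source data: prediction error of
    the target-trained model, usable e.g. as a reward correction."""
    z = model.enc(torch.cat([s, a], -1))
    return ((model.pred(z) - s_next) ** 2).mean(dim=-1)
```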
- A Framework for Few-Shot Policy Transfer through Observation Mapping and Behavior Cloning [6.048526012097133]
This work proposes a framework for Few-Shot Policy Transfer between two domains through Observation Mapping and Behavior Cloning.
We use Generative Adversarial Networks (GANs) along with a cycle-consistency loss to map the observations between the source and target domains and later use this learned mapping to clone the successful source task behavior policy to the target domain.
arXiv Detail & Related papers (2023-10-13T03:15:42Z) - Cross-Domain Policy Adaptation via Value-Guided Data Filtering [57.62692881606099]
- Cross-Domain Policy Adaptation via Value-Guided Data Filtering [57.62692881606099]
Generalizing policies across different domains with dynamics mismatch poses a significant challenge in reinforcement learning.
We present the Value-Guided Data Filtering (VGDF) algorithm, which selectively shares transitions from the source domain based on the proximity of paired value targets.
arXiv Detail & Related papers (2023-05-28T04:08:40Z) - Transfer RL via the Undo Maps Formalism [29.798971172941627]
- Transfer RL via the Undo Maps Formalism [29.798971172941627]
Transferring knowledge across domains is one of the most fundamental problems in machine learning.
We propose TvD: transfer via distribution matching, a framework to transfer knowledge across interactive domains.
We show this objective leads to a policy update scheme reminiscent of imitation learning, and derive an efficient algorithm to implement it.
arXiv Detail & Related papers (2022-11-26T03:44:28Z) - Cross-domain Imitation from Observations [50.669343548588294]
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there are discrepancies between the expert and agent MDPs.
We present a novel framework to learn correspondences across such domains.
arXiv Detail & Related papers (2021-05-20T21:08:25Z) - Contrastive Learning and Self-Training for Unsupervised Domain
Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z) - Gradient Regularized Contrastive Learning for Continual Domain
Adaptation [86.02012896014095]
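A minimal sketch of category-wise centroid adaptation, here with a plain squared-distance pull rather than the paper's contrastive formulation (an intentional simplification); pseudo_tgt would come from the temporal-ensemble pseudo-labels mentioned above.
```python
# Hypothetical sketch of category-wise centroid alignment across domains.
import torch

def centroid_alignment_loss(feat_src, lbl_src, feat_tgt, pseudo_tgt, num_classes):
    """Pull per-class feature centroids of source and target together."""
    loss = feat_src.new_zeros(())
    for c in range(num_classes):
        src_c = feat_src[lbl_src == c]      # source features labelled c
        tgt_c = feat_tgt[pseudo_tgt == c]   # target features pseudo-labelled c
        if len(src_c) and len(tgt_c):
            loss = loss + (src_c.mean(0) - tgt_c.mean(0)).pow(2).sum()
    return loss / num_classes
```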
- Gradient Regularized Contrastive Learning for Continual Domain Adaptation [86.02012896014095]
We study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains.
We propose Gradient Regularized Contrastive Learning (GRCL) to address the challenges of this setting.
Experiments on Digits, DomainNet and Office-Caltech benchmarks demonstrate the strong performance of our approach.
arXiv Detail & Related papers (2021-03-23T04:10:42Z) - Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining
and Consistency [93.89773386634717]
Visual domain adaptation involves learning to classify images from a target visual domain using labels available in a different source domain.
We show that in the presence of a few target labels, simple techniques like self-supervision (via rotation prediction) and consistency regularization can be effective without any adversarial alignment to learn a good target classifier.
Our Pretraining and Consistency (PAC) approach can achieve state-of-the-art accuracy on this semi-supervised domain adaptation task, surpassing multiple adversarial domain alignment methods across multiple datasets.
arXiv Detail & Related papers (2021-01-29T18:40:17Z) - Missing-Class-Robust Domain Adaptation by Unilateral Alignment for Fault
Diagnosis [3.786700931138978]
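The two ingredients named above are straightforward to sketch; the augmentation pairing and loss forms below are assumptions rather than PAC's exact recipe.
```python
# Hypothetical sketch: rotation-prediction self-supervision + consistency.
import torch
import torch.nn.functional as F

def rotation_ssl_loss(backbone, rot_head, images):
    """Rotate each image by 0/90/180/270 degrees and predict the rotation."""
    rots, labels = [], []
    for k in range(4):
        rots.append(torch.rot90(images, k, dims=(2, 3)))  # NCHW rotation
        labels.append(torch.full((images.shape[0],), k, dtype=torch.long))
    x, y = torch.cat(rots), torch.cat(labels)
    return F.cross_entropy(rot_head(backbone(x)), y)

def consistency_loss(classifier, backbone, weak_img, strong_img):
    """Make the prediction on a strong augmentation match the prediction
    on a weak augmentation of the same unlabeled target image."""
    with torch.no_grad():
        p_weak = F.softmax(classifier(backbone(weak_img)), dim=-1)
    logp_strong = F.log_softmax(classifier(backbone(strong_img)), dim=-1)
    return F.kl_div(logp_strong, p_weak, reduction='batchmean')
```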
- Missing-Class-Robust Domain Adaptation by Unilateral Alignment for Fault Diagnosis [3.786700931138978]
Domain adaptation aims at improving model performance by leveraging the learned knowledge in the source domain and transferring it to the target domain.
Recently, domain adversarial methods have been particularly successful in alleviating the distribution shift between the source and the target domains.
We demonstrate in this paper that the performance of domain adversarial methods can be vulnerable to an incomplete target label space during training.
arXiv Detail & Related papers (2020-01-07T13:19:04Z)