Cross-domain Imitation from Observations
- URL: http://arxiv.org/abs/2105.10037v1
- Date: Thu, 20 May 2021 21:08:25 GMT
- Title: Cross-domain Imitation from Observations
- Authors: Dripta S. Raychaudhuri, Sujoy Paul, Jeroen van Baar, Amit K. Roy-Chowdhury
- Abstract summary: Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there exist discrepancies between the expert and agent MDP.
We present a novel framework to learn correspondences across such domains.
- Score: 50.669343548588294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning seeks to circumvent the difficulty in designing proper
reward functions for training agents by utilizing expert behavior. With
environments modeled as Markov Decision Processes (MDP), most of the existing
imitation algorithms are contingent on the availability of expert
demonstrations in the same MDP as the one in which a new imitation policy is to
be learned. In this paper, we study the problem of how to imitate tasks when
there exist discrepancies between the expert and agent MDP. These discrepancies
across domains could include differing dynamics, viewpoint, or morphology; we
present a novel framework to learn correspondences across such domains.
Importantly, in contrast to prior works, we use unpaired and unaligned
trajectories containing only states in the expert domain, to learn this
correspondence. We utilize a cycle-consistency constraint on both the state
space and a domain agnostic latent space to do this. In addition, we enforce
consistency on the temporal position of states via a normalized position
estimator function, to align the trajectories across the two domains. Once this
correspondence is found, we can directly transfer the demonstrations on one
domain to the other and use it for imitation. Experiments across a wide variety
of challenging domains demonstrate the efficacy of our approach.
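The two constraints named in the abstract (cycle-consistency and consistency of temporal position) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared-error form of the losses, and the toy one-dimensional mappings below are all assumptions made for illustration, and the learned networks of the paper are replaced by plain Python callables.

```python
from typing import Callable, List, Sequence

State = List[float]
Mapping = Callable[[State], State]


def normalized_positions(length: int) -> List[float]:
    """Temporal position of each state in a trajectory, scaled to [0, 1]."""
    if length < 2:
        return [0.0] * length
    return [t / (length - 1) for t in range(length)]


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cycle_consistency_loss(states: Sequence[State],
                           forward: Mapping,
                           backward: Mapping) -> float:
    """Mean squared error between each state and its round trip
    through the other domain (hypothetical loss form)."""
    total = sum(squared_error(s, backward(forward(s))) for s in states)
    return total / len(states)


def position_consistency_loss(states: Sequence[State],
                              forward: Mapping,
                              position_estimator: Callable[[State], float]) -> float:
    """Mean squared gap between the normalized position of a state in its
    own trajectory and the position estimated for its mapped counterpart."""
    targets = normalized_positions(len(states))
    total = sum((position_estimator(forward(s)) - p) ** 2
                for s, p in zip(states, targets))
    return total / len(states)


# Toy domains (assumed): the expert state is the agent state scaled by 2.
agent_to_expert: Mapping = lambda s: [2.0 * x for x in s]
expert_to_agent: Mapping = lambda s: [0.5 * x for x in s]
identity: Mapping = lambda s: list(s)

# Hypothetical estimator for the expert domain, whose trajectory spans [0, 4].
expert_position: Callable[[State], float] = lambda s: s[0] / 4.0

agent_trajectory = [[0.0], [1.0], [2.0]]

good_cycle = cycle_consistency_loss(agent_trajectory, agent_to_expert, expert_to_agent)
bad_cycle = cycle_consistency_loss(agent_trajectory, agent_to_expert, identity)
position_gap = position_consistency_loss(agent_trajectory, agent_to_expert, expert_position)
```

With the correct inverse mapping the cycle loss vanishes, while the mismatched `identity` inverse yields a positive loss. In a learned setting, both terms would be minimized jointly over the parameters of the mappings.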
Related papers
- xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing [21.37585797507323]
Cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning.
We propose the Cross-Domain Trajectory EDiting framework that employs a specially designed diffusion model for cross-domain trajectory adaptation.
Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data.
arXiv Detail & Related papers (2024-09-13T10:07:28Z)
- Cross-Domain Policy Transfer by Representation Alignment via Multi-Domain Behavioral Cloning [13.674493608667627]
We present a simple approach for cross-domain policy transfer that learns a shared latent representation across domains and a common abstract policy on top of it.
Our approach leverages multi-domain behavioral cloning on unaligned trajectories of proxy tasks and employs maximum mean discrepancy (MMD) as a regularization term to encourage cross-domain alignment.
arXiv Detail & Related papers (2024-07-24T00:13:00Z)
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch [53.087413751430255]
It is vital to learn effective policies that can be transferred to different domains with dynamics discrepancies in reinforcement learning (RL).
In this paper, we consider dynamics adaptation settings where there exists dynamics mismatch between the source domain and the target domain.
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain.
arXiv Detail & Related papers (2024-05-24T09:06:12Z)
- Cross Domain Policy Transfer with Effect Cycle-Consistency [3.3213136251955815]
Training a robotic policy from scratch using deep reinforcement learning methods can be prohibitively expensive due to sample inefficiency.
We propose a novel approach for learning the mapping functions between state and action spaces across domains using unpaired data.
Our approach has been tested on three locomotion tasks and two robotic manipulation tasks.
arXiv Detail & Related papers (2024-03-04T13:20:07Z)
- Context-aware Domain Adaptation for Time Series Anomaly Detection [69.3488037353497]
Time series anomaly detection is a challenging task with a wide range of real-world applications.
Recent efforts have been devoted to time series domain adaptation to leverage knowledge from similar domains.
We propose a framework that combines context sampling and anomaly detection into a joint learning procedure.
arXiv Detail & Related papers (2023-04-15T02:28:58Z)
- SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
- Learn what matters: cross-domain imitation learning with task-relevant embeddings [77.34726150561087]
We study how an autonomous agent learns to perform a task from demonstrations in a different domain, such as a different environment or different agent.
We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrations or further domain knowledge.
arXiv Detail & Related papers (2022-09-24T21:56:58Z)
- Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations [2.0625936401496237]
We present a reward shaping method based on source embedding similarity that is applicable to domains with both discrete and continuous action spaces.
The efficacy of our approach is evaluated on transfer to restricted action spaces in the Acrobot-v1 and Pendulum-v0 domains.
arXiv Detail & Related papers (2022-02-05T00:14:05Z)
- Improving Transferability of Domain Adaptation Networks Through Domain Alignment Layers [1.3766148734487902]
Multi-source unsupervised domain adaptation (MSDA) aims at learning a predictor for an unlabeled domain by assigning weak knowledge from a bag of source models.
We propose to embed Multi-Source version of DomaIn Alignment Layers (MS-DIAL) at different levels of the predictor.
Our approach can improve state-of-the-art MSDA methods, yielding relative gains of up to +30.64% on their classification accuracies.
arXiv Detail & Related papers (2021-09-06T18:41:19Z)
- Continuous Domain Adaptation with Variational Domain-Agnostic Feature Replay [78.7472257594881]
Learning in non-stationary environments is one of the biggest challenges in machine learning.
Non-stationarity can be caused by either task drift or domain drift.
We propose variational domain-agnostic feature replay, an approach that is composed of three components.
arXiv Detail & Related papers (2020-03-09T19:50:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.