Transfer Reinforcement Learning for Differing Action Spaces via
Q-Network Representations
- URL: http://arxiv.org/abs/2202.02442v1
- Date: Sat, 5 Feb 2022 00:14:05 GMT
- Title: Transfer Reinforcement Learning for Differing Action Spaces via
Q-Network Representations
- Authors: Nathan Beck, Abhiramon Rajasekharan, Trung Hieu Tran
- Abstract summary: We present a reward shaping method based on source embedding similarity that is applicable to domains with both discrete and continuous action spaces.
The efficacy of our approach is evaluated on transfer to restricted action spaces in the Acrobot-v1 and Pendulum-v0 domains.
- Score: 2.0625936401496237
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning approaches in reinforcement learning aim to assist agents
in learning their target domains by leveraging the knowledge learned from other
agents that have been trained on similar source domains. Recent research in
this space has focused on knowledge transfer between tasks that have different
transition dynamics and reward functions; however, little attention has been
paid to transfer between tasks that have different action spaces. In this
paper, we approach the task of transfer
learning between domains that differ in action spaces. We present a reward
shaping method based on source embedding similarity that is applicable to
domains with both discrete and continuous action spaces. The efficacy of our
approach is evaluated on transfer to restricted action spaces in the Acrobot-v1
and Pendulum-v0 domains (Brockman et al. 2016). A comparison with two baselines
shows that our method does not outperform these baselines in the continuous
action space setting (Pendulum-v0) but does show an improvement in the discrete
action space setting (Acrobot-v1). We
conclude our analysis with future directions for this work.
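The sketch below is a rough illustration of how a shaping bonus based on source embedding similarity might be computed from Q-network representations; it is not the authors' exact formulation. It assumes the source and target tasks share a state space (as in transfer to a restricted action space), that the penultimate layer of a frozen source Q-network serves as the source embedding, and that the bonus is a cosine similarity scaled by a weight `beta`. All class and function names here are hypothetical.

```python
# Hypothetical sketch of similarity-based reward shaping for transfer across
# action spaces. The shaping rule and all names are assumptions, not the
# paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Simple Q-network whose penultimate layer doubles as a state embedding."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_actions)

    def embed(self, state):
        return self.body(state)              # penultimate-layer state embedding

    def forward(self, state):
        return self.head(self.body(state))   # Q-values, one per action

def shaped_reward(env_reward, state, source_q, target_q, beta=0.1):
    """Add a bonus proportional to how similar the target agent's embedding of
    the current state is to the frozen source Q-network's embedding of it."""
    with torch.no_grad():
        src = source_q.embed(state)          # source network stays frozen
    tgt = target_q.embed(state)
    bonus = F.cosine_similarity(src, tgt, dim=-1)
    return env_reward + beta * float(bonus)
```

In the restricted-action-space evaluation described above, the source network would be trained on the full Acrobot-v1 or Pendulum-v0 action set and then frozen, while the target agent learns with the restricted action set and receives the shaped reward.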
Related papers
- Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces [52.649077293256795]
Continual offline reinforcement learning (CORL) has shown impressive ability in diffusion-based lifelong learning systems.
We propose Vector-Quantized Continual diffuser, named VQ-CD, to break the barrier of different spaces between various tasks.
arXiv Detail & Related papers (2024-10-21T07:13:45Z)
- Cross Domain Policy Transfer with Effect Cycle-Consistency [3.3213136251955815]
Training a robotic policy from scratch using deep reinforcement learning methods can be prohibitively expensive due to sample inefficiency.
We propose a novel approach for learning the mapping functions between state and action spaces across domains using unpaired data.
Our approach has been tested on three locomotion tasks and two robotic manipulation tasks.
arXiv Detail & Related papers (2024-03-04T13:20:07Z)
- A Recent Survey of Heterogeneous Transfer Learning [15.830786437956144]
Heterogeneous transfer learning (HTL) has become a vital strategy in various tasks.
We offer an extensive review of over 60 HTL methods, covering both data-based and model-based approaches.
We explore applications in natural language processing, computer vision, multimodal learning, and biomedicine.
arXiv Detail & Related papers (2023-10-12T16:19:58Z)
- From Patches to Objects: Exploiting Spatial Reasoning for Better Visual Representations [2.363388546004777]
We propose a novel auxiliary pretraining method that is based on spatial reasoning.
Our proposed method takes advantage of a more flexible formulation of contrastive learning by introducing spatial reasoning as an auxiliary task for discriminative self-supervised methods.
arXiv Detail & Related papers (2023-05-21T07:46:46Z)
- Transfer RL via the Undo Maps Formalism [29.798971172941627]
Transferring knowledge across domains is one of the most fundamental problems in machine learning.
We propose TvD: transfer via distribution matching, a framework to transfer knowledge across interactive domains.
We show this objective leads to a policy update scheme reminiscent of imitation learning, and derive an efficient algorithm to implement it.
arXiv Detail & Related papers (2022-11-26T03:44:28Z)
- Learn what matters: cross-domain imitation learning with task-relevant embeddings [77.34726150561087]
We study how an autonomous agent learns to perform a task from demonstrations in a different domain, such as a different environment or different agent.
We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrations or further domain knowledge.
arXiv Detail & Related papers (2022-09-24T21:56:58Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Disentangling Transfer and Interference in Multi-Domain Learning [53.34444188552444]
We study the conditions where interference and knowledge transfer occur in multi-domain learning.
We propose new metrics disentangling interference and transfer and set up experimental protocols.
We demonstrate our findings on the CIFAR-100, MiniPlaces, and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2021-07-02T01:30:36Z)
- Cross-domain Imitation from Observations [50.669343548588294]
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there exist discrepancies between the expert and agent MDP.
We present a novel framework to learn correspondences across such domains.
arXiv Detail & Related papers (2021-05-20T21:08:25Z)
- Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers [138.68213707587822]
We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning.
We show that we can achieve this goal by compensating for the difference in dynamics by modifying the reward function.
Our approach is applicable to domains with continuous states and actions and does not require learning an explicit model of the dynamics.
arXiv Detail & Related papers (2020-06-24T17:47:37Z)
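The reward-modification idea in the entry above can be made concrete with a small sketch: if two binary classifiers estimate whether a transition (or a state-action pair) came from the target domain, their log-odds difference gives a correction added to the environment reward. The function below is one illustrative form of such a correction; the name, interface, and use of pre-trained classifier outputs are assumptions rather than details taken from that paper.

```python
# Illustrative sketch of a classifier-based reward correction for
# off-dynamics transfer; names and interface are assumptions.
import numpy as np

def dynamics_reward_correction(p_target_sas, p_target_sa):
    """Correction added to the source-domain reward so a policy trained there
    better matches the target domain's dynamics.

    p_target_sas: classifier estimate of P(target | s, a, s')
    p_target_sa : classifier estimate of P(target | s, a)
    Both are assumed to come from separately trained binary domain classifiers.
    """
    eps = 1e-6
    p_sas = np.clip(p_target_sas, eps, 1 - eps)
    p_sa = np.clip(p_target_sa, eps, 1 - eps)
    # log-odds of the transition classifier minus log-odds of the
    # state-action classifier approximates the dynamics mismatch term
    return (np.log(p_sas) - np.log(1 - p_sas)) - (np.log(p_sa) - np.log(1 - p_sa))
```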