Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse
Skills
- URL: http://arxiv.org/abs/2207.00978v1
- Date: Sun, 3 Jul 2022 08:15:41 GMT
- Title: Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse
Skills
- Authors: Julia Tan, Ransalu Senanayake, Fabio Ramos
- Abstract summary: We propose a post-hoc technique for policy fusion using Optimal Transport theory.
This provides an improved weights initialisation of the neural network policy for learning new tasks.
Our results show that specialised knowledge can be unified into a "Renaissance agent", allowing for quicker learning of new skills.
- Score: 28.39150937658635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (RL) is a promising approach to solving complex
robotics problems. However, the process of learning through trial-and-error
interactions is often highly time-consuming, despite recent advancements in RL
algorithms. Additionally, the success of RL is critically dependent on how well
the reward-shaping function suits the task, which is also time-consuming to
design. As agents trained on a variety of robotics problems continue to
proliferate, the ability to reuse their valuable learning for new domains
becomes increasingly significant. In this paper, we propose a post-hoc
technique for policy fusion using Optimal Transport theory as a robust means of
consolidating the knowledge of multiple agents that have been trained on
distinct scenarios. We further demonstrate that this provides an improved
weights initialisation of the neural network policy for learning new tasks,
requiring less time and computational resources than either retraining the
parent policies or training a new policy from scratch. Ultimately, our results
on diverse agents commonly used in deep RL show that specialised knowledge can
be unified into a "Renaissance agent", allowing for quicker learning of new
skills.
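The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the cost function, the solver, or how layers are processed. The sketch assumes two parent policies with one same-shaped weight matrix each, a squared Euclidean cost between neurons, and an entropic (Sinkhorn) solver with uniform marginals. The names `sinkhorn`, `fuse_layer`, and `reg` are hypothetical.

```python
import numpy as np


def sinkhorn(cost, reg=0.05, n_iters=200):
    """Entropic optimal transport plan between two uniform distributions.

    cost: (n, m) matrix of pairwise costs. Returns an (n, m) transport plan
    whose rows sum to 1/n and whose columns sum (approximately) to 1/m.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass on the neurons of policy A
    b = np.full(m, 1.0 / m)  # uniform mass on the neurons of policy B
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    return u[:, None] * kernel * v[None, :]


def fuse_layer(w_a, w_b, reg=0.05):
    """Align the neurons (rows) of w_b to those of w_a, then average.

    Assumption: both layers have shape (n_neurons, n_inputs) and share
    the same input ordering.
    """
    # Squared Euclidean distance between every pair of neuron weight vectors.
    cost = ((w_a[:, None, :] - w_b[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / max(cost.max(), 1e-12)  # normalise so reg has a stable scale
    plan = sinkhorn(cost, reg)
    # Barycentric projection: each plan row sums to 1/n, so rescale by n.
    w_b_aligned = (plan * w_a.shape[0]) @ w_b
    fused = 0.5 * (w_a + w_b_aligned)
    return fused, plan
```

Calling `fuse_layer(w_a, w_b)` returns the averaged weights and the transport plan. In a multi-layer network, the input columns of the following layer would also need to be transported with the same plan before averaging, and the fused weights would then serve only as an initialisation for further training on the new task.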
Related papers
- ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI [44.77897322913095]
We present ReLIC, a new approach for in-context reinforcement learning for embodied agents.
With ReLIC, agents are capable of adapting to new environments using 64,000 steps of in-context experience.
We find that ReLIC is capable of few-shot imitation learning despite never being trained with expert demonstrations.
arXiv Detail & Related papers (2024-10-03T17:58:11Z)
- Hybrid Inverse Reinforcement Learning [34.793570631021005]
The inverse reinforcement learning approach to imitation learning is a double-edged sword.
We propose using hybrid RL -- training on a mixture of online and expert data -- to curtail unnecessary exploration.
We derive both model-free and model-based hybrid inverse RL algorithms with strong policy performance guarantees.
arXiv Detail & Related papers (2024-02-13T23:29:09Z)
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Reinforcement Learning for UAV control with Policy and Reward Shaping [0.7127008801193563]
This study teaches an RL agent to control a drone using reward-shaping and policy-shaping techniques simultaneously.
The results show that an agent trained simultaneously with both techniques obtains a lower reward than an agent trained using only a policy-based approach.
arXiv Detail & Related papers (2022-12-06T14:46:13Z)
- Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning [78.31888150539258]
Reinforcement learning (RL) agents have long sought to approach the efficiency of human learning.
Prior studies in RL have incorporated external knowledge policies to help agents improve sample efficiency.
We present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility.
arXiv Detail & Related papers (2022-10-07T17:56:57Z)
- Learning state correspondence of reinforcement learning tasks for knowledge transfer [0.0]
Generalizing and reusing knowledge are the fundamental requirements for creating a truly intelligent agent.
This work proposes a general method for one-to-one transfer learning based on a generative adversarial network model tailored to RL tasks.
arXiv Detail & Related papers (2022-09-14T12:42:59Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of an entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z)