Deep Reactive Planning in Dynamic Environments
- URL: http://arxiv.org/abs/2011.00155v2
- Date: Thu, 5 Nov 2020 21:31:44 GMT
- Title: Deep Reactive Planning in Dynamic Environments
- Authors: Kei Ota, Devesh K. Jha, Tadashi Onishi, Asako Kanezaki, Yusuke
Yoshiyasu, Yoko Sasaki, Toshisada Mariyama, Daniel Nikovski
- Abstract summary: A robot can learn an end-to-end policy which can adapt to changes in the environment during execution.
We present a method that can achieve such behavior by combining traditional kinematic planning, deep learning, and deep reinforcement learning.
We demonstrate the proposed approach for several reaching and pick-and-place tasks in simulation, as well as on a real system of a 6-DoF industrial manipulator.
- Score: 20.319894237644558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main novelty of the proposed approach is that it allows a robot to learn
an end-to-end policy which can adapt to changes in the environment during
execution. While goal conditioning of policies has been studied in the RL
literature, such approaches are not easily extended to cases where the robot's
goal can change during execution. This is something that humans are naturally
able to do. However, it is difficult for robots to learn such reflexes (i.e.,
to naturally respond to dynamic environments), especially when the goal
location is not explicitly provided to the robot, and instead needs to be
perceived through a vision sensor. In the current work, we present a method
that can achieve such behavior by combining traditional kinematic planning,
deep learning, and deep reinforcement learning in a synergistic fashion to
generalize to arbitrary environments. We demonstrate the proposed approach for
several reaching and pick-and-place tasks in simulation, as well as on a real
system of a 6-DoF industrial manipulator. A video describing our work can be
found at https://youtu.be/hE-Ew59GRPQ.
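The core mechanism, re-estimating the goal from vision at every control step and conditioning the policy on that estimate, can be illustrated with a minimal closed loop. The snippet below is a sketch, not the authors' implementation: the drifting target stands in for the learned perception module, and a proportional rule stands in for the RL policy.
```python
import numpy as np

def perceive_goal(t):
    # Stand-in for the paper's vision module, which regresses the goal
    # location from a camera image; a slowly drifting target emulates
    # a dynamic environment.
    return np.array([0.5 + 0.1 * np.sin(0.05 * t), 0.0, 0.3])

def goal_conditioned_policy(ee_pos, goal, gain=2.0):
    # Stand-in for the learned goal-conditioned policy: a proportional
    # rule toward the most recently perceived goal.
    return gain * (goal - ee_pos)

# Closed-loop execution: because the goal is re-perceived at every
# step, the commanded motion adapts if the target moves mid-execution.
ee, dt = np.zeros(3), 0.02
for t in range(500):
    goal = perceive_goal(t)
    ee = ee + dt * goal_conditioned_policy(ee, goal)
print("final tracking error:", np.linalg.norm(ee - perceive_goal(499)))
```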
Related papers
- Grounding Robot Policies with Visuomotor Language Guidance [15.774237279917594]
We propose an agent-based framework for grounding robot policies to the current context.
The proposed framework is composed of a set of conversational agents designed for specific roles.
We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates.
arXiv Detail & Related papers (2024-10-09T02:00:37Z)
- Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation [65.46610405509338]
We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation.
Our framework, Track2Act, predicts tracks of how points in an image should move in future time-steps based on a goal.
We show that this approach of combining scalably learned track prediction with a residual policy enables diverse generalizable robot manipulation.
arXiv Detail & Related papers (2024-05-02T17:56:55Z)
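The mechanism summarized above, an open-loop plan derived from predicted point tracks plus a learned closed-loop residual, can be sketched roughly as follows. The linear track predictor, mean-displacement plan, and zero residual are placeholder assumptions for the learned components.
```python
import numpy as np

def predict_tracks(points, goal_points, horizon=10):
    # Placeholder for the learned track predictor: points interpolate
    # linearly toward their locations in the goal image.
    alphas = np.linspace(0.0, 1.0, horizon)[:, None, None]
    return (1 - alphas) * points[None] + alphas * goal_points[None]

def base_plan(tracks):
    # Coarse open-loop motion from predicted tracks: the mean per-step
    # displacement across all tracked points.
    return (tracks[1:] - tracks[:-1]).mean(axis=1)

def residual_policy(observation, planned_step):
    # Placeholder for the learned residual that corrects the
    # track-derived plan online (identically zero here).
    return np.zeros_like(planned_step)

points = np.random.rand(8, 2)            # tracked points in the image
goals = points + np.array([0.1, -0.05])  # their goal-image locations
for step in base_plan(predict_tracks(points, goals)):
    action = step + residual_policy(points, step)
```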
- Learning Vision-based Pursuit-Evasion Robot Policies [54.52536214251999]
We develop a fully-observable robot policy that generates supervision for a partially-observable one.
We deploy our policy on a physical quadruped robot with an RGB-D camera in pursuit-evasion interactions in the wild.
arXiv Detail & Related papers (2023-08-30T17:59:05Z)
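The supervision scheme above, a fully-observable teacher labeling actions for a partially-observable student, reduces to a simple distillation loop. Everything below is an illustrative assumption: a linear teacher with privileged state, a noisy partial observation, and a least-squares student.
```python
import numpy as np

def teacher_policy(full_state):
    # Privileged teacher: acts on the full state (e.g. the opponent's
    # true relative position); a linear rule stands in for it here.
    return -0.5 * full_state[:2]

def partial_obs(full_state, rng):
    # The student only gets a noisy, truncated view, akin to what an
    # onboard RGB-D camera could provide.
    return full_state[:2] + rng.normal(scale=0.05, size=2)

rng = np.random.default_rng(0)
pairs = []
for _ in range(1000):
    s = rng.uniform(-1.0, 1.0, size=4)
    # Supervision: the teacher's action paired with the student's
    # partial observation of the same underlying state.
    pairs.append((partial_obs(s, rng), teacher_policy(s)))

# Train the student by regression on (observation, action) pairs; a
# linear least-squares fit stands in for the learned policy.
X = np.stack([o for o, _ in pairs])
Y = np.stack([a for _, a in pairs])
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
```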
- Learning Video-Conditioned Policies for Unseen Manipulation Tasks [83.2240629060453]
Video-conditioned policy learning maps human demonstrations of previously unseen tasks to robot manipulation skills.
We train our policy to generate appropriate actions given current scene observations and a video of the target task.
We validate our approach on a set of challenging multi-task robot manipulation environments and outperform the state of the art.
arXiv Detail & Related papers (2023-05-10T16:25:42Z)
- Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning [4.260312058817663]
We build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot.
This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour.
RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning.
arXiv Detail & Related papers (2023-04-24T13:24:00Z)
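The reset-free loop described for RF-QD can be sketched in a few lines: after each on-robot evaluation, a recovery policy drives the robot back into a safe zone instead of waiting for a manual reset. The 2-D random-walk stand-in for the robot and the straight-line recovery rule are assumptions for illustration only.
```python
import numpy as np

SAFE_RADIUS = 1.0

def in_safe_zone(pos):
    return np.linalg.norm(pos) < SAFE_RADIUS

def recovery_step(pos, step=0.1):
    # Illustrative recovery policy: walk straight back toward the
    # center of the safe zone.
    return pos - step * pos / np.linalg.norm(pos)

def evaluate_candidate(pos, rng):
    # Stand-in for executing one quality-diversity candidate on the
    # robot; RF-QD additionally uses a learned dynamics model to
    # pre-select which candidates to try.
    return pos + rng.normal(scale=0.3, size=2)

rng = np.random.default_rng(1)
pos = np.zeros(2)
for trial in range(20):
    pos = evaluate_candidate(pos, rng)
    # Reset-free learning: recover autonomously, then keep evaluating.
    while not in_safe_zone(pos):
        pos = recovery_step(pos)
```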
- Error-Aware Policy Learning: Zero-Shot Generalization in Partially Observable Dynamic Environments [18.8481771211768]
We introduce a novel approach to the sim-to-real problem by developing policies capable of adapting to new environments.
Key to our approach is an error-aware policy (EAP) that is explicitly made aware of the effect of unobservable factors during training.
We show that a trained EAP for a hip-torque assistive device can be transferred to different human agents with unseen biomechanical characteristics.
arXiv Detail & Related papers (2021-03-13T15:36:44Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate adaptation to a new platform as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
- ViNG: Learning Open-World Navigation with Visual Goals [82.84193221280216]
We propose a learning-based navigation system for reaching visually indicated goals.
We show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning.
We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
arXiv Detail & Related papers (2020-12-17T18:22:32Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
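"Predictions in trajectory distribution space" means the network outputs parameters of a dynamical system, such as a dynamic movement primitive, which is then unrolled into a smooth trajectory rather than emitting raw per-step actions. The 1-D DMP below, with made-up gains and weights, is a generic illustration rather than the paper's exact parameterization.
```python
import numpy as np

def rollout_dmp(goal, weights, y0=0.0, steps=100, dt=0.01,
                alpha=25.0, beta=6.25):
    # Unroll a 1-D dynamic movement primitive: a critically damped
    # spring toward `goal`, shaped by a forcing term built from
    # radial basis functions over a decaying phase variable x.
    centers = np.linspace(0.0, 1.0, len(weights))
    y, dy, x = y0, 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = np.exp(-50.0 * (x - centers) ** 2)
        forcing = (psi @ weights) / (psi.sum() + 1e-8) * x
        ddy = alpha * (beta * (goal - y) - dy) + forcing
        dy += ddy * dt
        y += dy * dt
        x += -2.0 * x * dt  # canonical system: phase decays to 0
        traj.append(y)
    return np.array(traj)

# In an NDP, a network would predict `goal` and `weights` from the
# observation; fixed values stand in for that prediction here.
trajectory = rollout_dmp(goal=1.0, weights=np.array([20.0, -10.0, 5.0, 0.0, 0.0]))
```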
This list is automatically generated from the titles and abstracts of the papers on this site.