Robotic Surgery With Lean Reinforcement Learning
- URL: http://arxiv.org/abs/2105.01006v1
- Date: Mon, 3 May 2021 16:52:26 GMT
- Title: Robotic Surgery With Lean Reinforcement Learning
- Authors: Yotam Barnoy, Molly O'Brien, Will Wang, Gregory Hager
- Abstract summary: We describe adding reinforcement learning support to the da Vinci Skill Simulator.
We teach an RL-based agent to perform sub-tasks in the simulator environment, using either image or state data.
We tackle the sample inefficiency of RL using a simple-to-implement system which we term hybrid-batch learning (HBL).
- Score: 0.8258451067861933
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As surgical robots become more common, automating away some of the burden of
complex direct human operation becomes ever more feasible. Model-free
reinforcement learning (RL) is a promising direction toward generalizable
automated surgical performance, but progress has been slowed by the lack of
efficient and realistic learning environments. In this paper, we describe
adding reinforcement learning support to the da Vinci Skill Simulator, a
training simulation used around the world to allow surgeons to learn and
rehearse technical skills. We successfully teach an RL-based agent to perform
sub-tasks in the simulator environment, using either image or state data. As
far as we know, this is the first time an RL-based agent is taught from visual
data in a surgical robotics environment. Additionally, we tackle the sample
inefficiency of RL using a simple-to-implement system which we term
hybrid-batch learning (HBL), effectively adding a second, long-term replay
buffer to the Q-learning process. Moreover, HBL allows us to bootstrap
image-based learning from the data collected during the easier task of learning
from state. We show that HBL significantly decreases learning times.
Related papers
- Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning [47.785786984974855]
We present a human-in-the-loop vision-based RL system that demonstrates impressive performance on a diverse set of dexterous manipulation tasks.
Our approach integrates demonstrations and human corrections, efficient RL algorithms, and other system-level design choices to learn policies.
We show that our method significantly outperforms imitation learning baselines and prior RL approaches, with an average 2x improvement in success rate and 1.8x faster execution.
arXiv Detail & Related papers (2024-10-29T08:12:20Z)
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Real World Offline Reinforcement Learning with Realistic Data Source [33.7474988142367]
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.
Current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories.
In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning.
arXiv Detail & Related papers (2022-10-12T17:57:05Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that this simple change to the action interface substantially improves both learning efficiency and task performance (see the primitive-wrapper sketch after this list).
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning [78.76052604441519]
SurRoL is an RL-centered simulation platform for surgical robot learning compatible with the da Vinci Research Kit (dVRK).
Ten learning-based surgical tasks, common in real autonomous surgical execution, are built into the platform.
We evaluate SurRoL using RL algorithms in simulation, provide in-depth analysis, deploy the trained policies on the real dVRK, and show that SurRoL achieves better transferability to the real world (see the usage sketch after this list).
arXiv Detail & Related papers (2021-08-30T07:43:47Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks [70.56451186797436]
We study how to use meta-reinforcement learning to solve the bulk of the problem in simulation.
We demonstrate our approach by training an agent to successfully perform challenging real-world insertion tasks.
arXiv Detail & Related papers (2020-04-29T18:00:22Z)
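As a companion to the parameterized-action-primitives entry above, here is a minimal sketch of that action interface: the policy outputs a primitive index plus an argument vector, and a wrapper rolls the chosen primitive out as a short sequence of low-level actions. The primitive library, argument layout, and environment interface are illustrative assumptions, not the RAPS implementation.

```python
import numpy as np

# Hypothetical primitive library: each entry is (name, function, n_args).
# A primitive expands its learned arguments into low-level actions
# of the form [dx, dy, dz, gripper].
def reach(args):
    return [np.append(args, 0.0) for _ in range(5)]            # move, gripper open

def grasp(args):
    return [np.array([0.0, 0.0, 0.0, 1.0]) for _ in range(3)]  # close gripper

PRIMITIVES = [("reach", reach, 3), ("grasp", grasp, 0)]

class PrimitiveActionWrapper:
    """Exposes a (primitive index, argument vector) action space and
    executes the chosen primitive in the underlying environment."""

    def __init__(self, env):
        self.env = env
        # The flat policy action is [index, arg_1, ..., arg_max].
        self.max_args = max(n for _, _, n in PRIMITIVES)

    def step(self, action):
        index, args = int(action[0]), np.asarray(action[1:])
        _, primitive, n_args = PRIMITIVES[index]
        total_reward, done, info = 0.0, False, {}
        for low_level_action in primitive(args[:n_args]):
            obs, reward, done, info = self.env.step(low_level_action)
            total_reward += reward
            if done:
                break
        # The agent sees one step per primitive, not per low-level action.
        return obs, total_reward, done, info
```

Because the agent makes one decision per primitive rather than per control step, the effective horizon shrinks, which is one plausible reading of the improved learning efficiency that entry reports.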
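For the SurRoL entry, a minimal interaction loop might look like the following, assuming the platform registers its tasks as Gym environments as the project describes; the task id and the registration import are assumptions and may differ across SurRoL versions.

```python
import gym
import surrol.gym  # assumed to register the SurRoL task environments

# Task id is illustrative; SurRoL ships ten surgical tasks.
env = gym.make("NeedleReach-v0")

obs = env.reset()
for _ in range(100):
    action = env.action_space.sample()  # placeholder for a trained RL policy
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```

A policy trained in such a loop is what the entry reports deploying on the real dVRK.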