Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning
- URL: http://arxiv.org/abs/2406.12499v1
- Date: Tue, 18 Jun 2024 11:00:55 GMT
- Title: Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning
- Authors: Harry Robertshaw, Lennart Karstensen, Benjamin Jackson, Alejandro Granados, Thomas C. Booth
- Abstract summary: Autonomous navigation of catheters and guidewires can enhance endovascular surgery safety and efficacy, reducing procedure times and operator radiation exposure.
This study explores the viability of autonomous navigation in MT vasculature using inverse RL (IRL) to leverage expert demonstrations.
- Score: 39.70065117918227
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:
Purpose: Autonomous navigation of catheters and guidewires can enhance endovascular surgery safety and efficacy, reducing procedure times and operator radiation exposure. Integrating tele-operated robotics could widen access to time-sensitive emergency procedures like mechanical thrombectomy (MT). Reinforcement learning (RL) shows potential in endovascular navigation, yet its application encounters challenges without a reward signal. This study explores the viability of autonomous navigation in MT vasculature using inverse RL (IRL) to leverage expert demonstrations.
Methods: This study established a simulation-based training and evaluation environment for MT navigation. We used IRL to infer reward functions from expert behaviour when navigating a guidewire and catheter. We utilized soft actor-critic (SAC) to train models with various reward functions and compared their performance in silico.
Results: We demonstrated the feasibility of navigation using IRL. When evaluating single- versus dual-device (i.e. guidewire versus catheter and guidewire) tracking, both methods achieved high success rates of 95% and 96%, respectively. Dual-tracking, however, utilized both devices, mimicking an expert. A success rate of 100% and a procedure time of 22.6 s were obtained when training with a reward function obtained through reward shaping. This outperformed a dense reward function (96%, 24.9 s) and an IRL-derived reward function (48%, 59.2 s).
Conclusions: We have contributed to the advancement of autonomous endovascular intervention navigation, particularly MT, by employing IRL. The results underscore the potential of using reward shaping to train models, offering a promising avenue for enhancing the accessibility and precision of MT. We envisage that future research can extend our methodology to diverse anatomical structures to enhance generalizability.
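The abstract compares three reward designs (IRL-derived, dense, and shaped) trained with soft actor-critic. Below is a minimal sketch of what those three styles typically look like; the paper does not publish its exact formulations, so the distance-based terms, feature vectors, and coefficients here are illustrative stand-ins only.

```python
import numpy as np

def dense_reward(tip_pos, target_pos):
    """Dense reward: negative Euclidean distance from device tip to target."""
    return -np.linalg.norm(tip_pos - target_pos)

def shaped_reward(tip_pos, prev_tip_pos, target_pos, gamma=0.99):
    """Potential-based reward shaping (Ng et al., 1999): reward the *progress*
    made toward the target. This form provably preserves the optimal policy."""
    phi_prev = -np.linalg.norm(prev_tip_pos - target_pos)
    phi_now = -np.linalg.norm(tip_pos - target_pos)
    return gamma * phi_now - phi_prev

def irl_reward(features, weights):
    """IRL-derived reward: a linear function of state features whose weights
    were inferred from expert demonstrations (see the IRL sketch further down)."""
    return float(np.dot(weights, features))
```

In practice each of these would be computed inside the simulation environment's step function and consumed unchanged by an off-the-shelf SAC implementation; the appeal of the potential-based shaped form is that it densifies the learning signal without changing which policies are optimal.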
Related papers
- A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire Navigation [0.0]
The treatment of cardiovascular diseases requires complex and challenging navigation of a guidewire and catheter.
This often leads to lengthy interventions during which the patient and clinician are exposed to X-ray radiation.
Deep Reinforcement Learning approaches have shown promise in learning this task and may be the key to automating catheter navigation during robotized interventions.
arXiv Detail & Related papers (2024-03-05T08:46:54Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
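RLIF's central idea, using intervention signals themselves as rewards, admits a very small sketch. The rollout format and helper names below are hypothetical; only the reward rule (penalize steps where the human intervenes) comes from the summary above.

```python
def rlif_reward(intervened: bool) -> float:
    """Intervention-as-reward: a penalty exactly when the human takes over,
    no task reward otherwise, so off-policy RL learns to avoid those states."""
    return -1.0 if intervened else 0.0

def label_transitions(trajectory):
    """trajectory: list of (state, action, next_state, intervened) tuples.
    Returns transitions relabelled with the intervention-based reward, ready
    for an off-policy replay buffer."""
    return [(s, a, rlif_reward(flag), s2) for (s, a, s2, flag) in trajectory]
```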
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
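A minimal sketch of planning with multistep look-ahead in the spirit of TRAVL; `candidate_trajectories` and `value_fn` are hypothetical stand-ins for the paper's trajectory proposals and learned trajectory value, not its published interfaces.

```python
def plan(state, candidate_trajectories, value_fn):
    """Score each candidate multistep trajectory rooted at `state` with the
    learned value function and execute the first action of the best one."""
    best = max(candidate_trajectories(state), key=value_fn)
    return best[0]  # first action of the highest-value trajectory
```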
- survAIval: Survival Analysis with the Eyes of AI [0.6445605125467573]
We propose a novel approach to enrich the training data for automated driving by using a self-designed driving simulator and two human drivers.
Our results show that incorporating these corner cases during training improves the recognition of corner cases during testing.
arXiv Detail & Related papers (2023-05-23T15:20:31Z)
- Demonstration-Guided Reinforcement Learning with Efficient Exploration for Task Automation of Surgical Robot [54.80144694888735]
We introduce Demonstration-guided EXploration (DEX), an efficient reinforcement learning algorithm.
Our method estimates expert-like behaviors with higher values to facilitate productive interactions.
Experiments on 10 surgical manipulation tasks from SurRoL, a comprehensive surgical simulation platform, demonstrate significant improvements.
arXiv Detail & Related papers (2023-02-20T05:38:54Z)
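A hedged sketch of the idea behind DEX as described above: actions resembling expert demonstrations receive a value bonus, biasing exploration toward productive, expert-like interactions. The kernel and bonus weight are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def demo_guided_value(q_value, action, demo_actions, bonus_weight=0.5, bandwidth=0.1):
    """Add a similarity bonus to the critic's Q-value for actions close to the
    nearest demonstrated action (Gaussian kernel over Euclidean distance)."""
    dists = [np.linalg.norm(action - a_d) for a_d in demo_actions]
    similarity = np.exp(-min(dists) ** 2 / (2 * bandwidth ** 2))
    return q_value + bonus_weight * similarity
```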
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
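The IRL problem stated above has a classic linear feature-matching formulation, sketched below: adjust reward weights until the learner's feature expectations match the expert's. `solve_policy` is a hypothetical stand-in for the inner policy-optimization step, and the gradient is the standard maximum-entropy IRL form, not this particular paper's algorithm.

```python
import numpy as np

def irl_step(weights, expert_features, learner_features, lr=0.01):
    """One gradient step on the reward weights: move the reward so that
    expert behaviour scores higher than the current learner's behaviour."""
    grad = expert_features - learner_features  # max-ent IRL gradient
    return weights + lr * grad

def infer_reward(expert_features, solve_policy, n_iters=100):
    """expert_features: expected feature counts under expert demonstrations.
    solve_policy(weights) must return the learner's expected feature counts
    under the policy that is optimal for the current reward weights."""
    weights = np.zeros_like(expert_features)
    for _ in range(n_iters):
        learner_features = solve_policy(weights)
        weights = irl_step(weights, expert_features, learner_features)
    return weights
```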
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
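A hedged sketch of what an energy-based transition model over a latent action space can look like, in standard PyTorch; the architecture, sizes, and any training objective are assumptions, not TRAIL's published design.

```python
import torch
import torch.nn as nn

class EnergyTransitionModel(nn.Module):
    """Scores (state, latent_action, next_state) triples; low energy should
    mean a plausible transition under the learned latent action space."""
    def __init__(self, state_dim, latent_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, z, s_next):
        # Concatenate along the feature dimension and emit a scalar energy.
        return self.net(torch.cat([s, z, s_next], dim=-1)).squeeze(-1)
```

Such a model is typically trained contrastively (low energy on observed transitions, high energy on negatives), after which imitation learning proceeds in the latent action space `z` rather than the raw action space.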
- Deep Reinforcement Learning for Continuous Docking Control of Autonomous Underwater Vehicles: A Benchmarking Study [1.7403133838762446]
This work explores the application of state-of-the-art model-free deep reinforcement learning approaches to the task of AUV docking in the continuous domain.
We provide a detailed formulation of the reward function used to successfully dock the AUV onto a fixed docking platform.
arXiv Detail & Related papers (2021-08-05T14:58:05Z)
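The AUV docking paper provides a detailed reward formulation, but the summary above does not reproduce its terms, so the following is an illustrative guess at a docking-style reward combining distance, heading alignment, and a success bonus; all coefficients and tolerances are assumptions.

```python
import numpy as np

def docking_reward(auv_pos, auv_heading, dock_pos, dock_heading,
                   dist_tol=0.5, w_dist=1.0, w_align=0.2, success_bonus=10.0):
    """Penalize distance to the dock and heading misalignment; pay a one-off
    bonus when both are within tolerance (i.e. the AUV is docked)."""
    dist = np.linalg.norm(auv_pos - dock_pos)
    # Wrap the heading difference to [-pi, pi] before taking its magnitude.
    misalign = abs(np.arctan2(np.sin(auv_heading - dock_heading),
                              np.cos(auv_heading - dock_heading)))
    reward = -w_dist * dist - w_align * misalign
    if dist < dist_tol and misalign < 0.1:
        reward += success_bonus
    return reward
```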
- Deep Reinforcement Learning with a Stage Incentive Mechanism of Dense Reward for Robotic Trajectory Planning [3.0242753679068466]
We present three dense reward functions to improve the efficiency of DRL-based methods for robot manipulator trajectory planning.
A posture reward function is proposed to speed up the learning process with a more reasonable trajectory.
A stride reward function is proposed to improve the stability of the learning process.
arXiv Detail & Related papers (2020-09-25T07:36:32Z)
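A hedged sketch of the two reward terms named in this entry; the summary does not give the paper's equations, so both formulations below are assumptions: a posture term rewarding alignment of the end-effector with the goal direction, and a stride term clipping per-step progress so that no single step dominates the return.

```python
import numpy as np

def posture_reward(ee_dir, goal_dir):
    """Reward alignment between the end-effector direction and the goal
    direction via cosine similarity, in [-1, 1]."""
    return np.dot(ee_dir, goal_dir) / (
        np.linalg.norm(ee_dir) * np.linalg.norm(goal_dir))

def stride_reward(progress, max_stride=0.05):
    """Reward progress toward the goal, clipped so an over-large step cannot
    destabilize training; normalized to [-1, 1]."""
    return float(np.clip(progress, -max_stride, max_stride)) / max_stride
```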
This list is automatically generated from the titles and abstracts of the papers on this site.