Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
- URL: http://arxiv.org/abs/2409.15688v1
- Date: Tue, 24 Sep 2024 03:01:30 GMT
- Title: Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
- Authors: Min Tan, Yushun Tao, Boyun Zheng, GaoSheng Xie, Lijuan Feng, Zeyang Xia, Jing Xiong
- Abstract summary: We propose a Human Intervention (HI)-based Proximal Policy Optimization framework, dubbed HI-PPO, to enhance RDE's safety.
We introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO.
We also introduce a reward-penalty adjustment (RPA) to penalize unsafe actions during initial interventions.
- Score: 5.520042381826271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing application of automated robotic digestive endoscopy (RDE), ensuring safe and efficient navigation in the unstructured and narrow digestive tract has become a critical challenge. Existing automated reinforcement learning navigation algorithms often result in potentially risky collisions due to the absence of essential human intervention, which significantly limits the safety and effectiveness of RDE in actual clinical practice. To address this limitation, we propose a Human Intervention (HI)-based Proximal Policy Optimization (PPO) framework, dubbed HI-PPO, which incorporates expert knowledge to enhance RDE's safety. Specifically, we introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO. Additionally, a reward-penalty adjustment (RPA) is implemented to penalize unsafe actions during initial interventions. Furthermore, Behavior Cloning Similarity (BCS) is included as an auxiliary objective to ensure the agent emulates expert actions. Comparative experiments conducted on a simulated platform across various anatomical colon segments demonstrate that our model effectively and safely guides RDE.
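The abstract names three ingredients layered on PPO: a penalty on rewards when the expert intervenes (RPA), an auxiliary imitation term (BCS), and an exploration mechanism (EEM). The abstract does not give their formulas, so the following is only a minimal sketch of how the first two could be combined with the standard clipped PPO surrogate. The fixed penalty value, the use of cosine similarity for the imitation term, the weight `bc_weight`, and all function and field names are assumptions made for illustration, not the authors' definitions. EEM is not modeled here.

```python
import math


def ppo_clip_term(logp_new, logp_old, advantage, clip_eps=0.2):
    """Standard clipped PPO surrogate for one sample (a quantity to maximize)."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def adjusted_reward(reward, intervened, penalty=1.0):
    """Assumed reward-penalty adjustment: subtract a fixed penalty
    on steps where the human expert had to intervene."""
    return reward - penalty if intervened else reward


def cosine_similarity(a, b):
    """Cosine similarity between two action vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_loss(samples, clip_eps=0.2, bc_weight=0.5):
    """Mean loss over samples: negative clipped surrogate, plus an assumed
    imitation term bc_weight * (1 - cosine similarity) on steps that carry
    an expert action. Each sample is a dict with keys logp_new, logp_old,
    advantage, agent_action, and expert_action (None if no intervention)."""
    total = 0.0
    for s in samples:
        loss = -ppo_clip_term(s["logp_new"], s["logp_old"], s["advantage"], clip_eps)
        if s["expert_action"] is not None:
            sim = cosine_similarity(s["agent_action"], s["expert_action"])
            loss += bc_weight * (1.0 - sim)
        total += loss
    return total / len(samples)


if __name__ == "__main__":
    batch = [
        {"logp_new": 0.0, "logp_old": 0.0, "advantage": 1.0,
         "agent_action": [1.0, 0.0], "expert_action": [0.0, 1.0]},
        {"logp_new": 0.0, "logp_old": 0.0, "advantage": 1.0,
         "agent_action": [1.0, 0.0], "expert_action": None},
    ]
    print(adjusted_reward(1.0, intervened=True, penalty=0.5))
    print(combined_loss(batch))
```

In this sketch the imitation term only applies on intervened steps, so the policy is pulled toward the expert exactly where the expert judged the agent unsafe. Whether the paper applies BCS on all steps or only on intervened ones is not stated in the abstract.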
Related papers
- ADAPT: A Game-Theoretic and Neuro-Symbolic Framework for Automated Distributed Adaptive Penetration Testing [13.101825065498552]
The integration of AI into modern critical infrastructure systems, such as healthcare, has introduced new vulnerabilities.
ADAPT is a game-theoretic and neuro-symbolic framework for automated distributed adaptive penetration testing.
arXiv Detail & Related papers (2024-10-31T21:32:17Z) - Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning [0.4218593777811082]
This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework.
In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance.
arXiv Detail & Related papers (2024-09-16T18:49:38Z) - RAISE -- Radiology AI Safety, an End-to-end lifecycle approach [5.829180249228172]
The integration of AI into radiology introduces opportunities for improved clinical care provision and efficiency.
The focus should be on ensuring models meet the highest standards of safety, effectiveness and efficacy.
The roadmap presented herein aims to expedite the achievement of deployable, reliable, and safe AI in radiology.
arXiv Detail & Related papers (2023-11-24T15:59:14Z) - Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
arXiv Detail & Related papers (2023-07-06T08:14:54Z) - Safe Deep RL for Intraoperative Planning of Pedicle Screw Placement [61.28459114068828]
We propose an intraoperative planning approach for robotic spine surgery that leverages real-time observation for drill path planning based on Safe Deep Reinforcement Learning (DRL).
Our approach was capable of achieving 90% bone penetration with respect to the gold standard (GS) drill planning.
arXiv Detail & Related papers (2023-05-09T11:42:53Z) - Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider a value-based and policy-gradient Deep Reinforcement Learning (DRL)
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z) - Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - A Novel Sample-efficient Deep Reinforcement Learning with Episodic Policy Transfer for PID-Based Control in Cardiac Catheterization Robots [2.3939470784308914]
The model was validated for axial motion control of a robotic system designed for intravascular catheterization.
A performance comparison with conventional methods, averaged over 10 trials, shows that the agent tunes the gain better, with an error of 0.003 mm.
arXiv Detail & Related papers (2021-10-28T08:18:01Z) - Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
arXiv Detail & Related papers (2020-09-12T02:02:52Z) - Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints [84.09488581365484]
Phase I dose-finding trials are increasingly challenging as the relationship between efficacy and toxicity of new compounds becomes more complex.
Most commonly used methods in practice focus on identifying a Maximum Tolerated Dose (MTD) by learning only from toxicity events.
We present a novel adaptive clinical trial methodology that aims at maximizing the cumulative efficacies while satisfying the toxicity safety constraint with high probability.
arXiv Detail & Related papers (2020-06-09T03:06:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.