Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
- URL: http://arxiv.org/abs/2409.15688v1
- Date: Tue, 24 Sep 2024 03:01:30 GMT
- Title: Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
- Authors: Min Tan, Yushun Tao, Boyun Zheng, GaoSheng Xie, Lijuan Feng, Zeyang Xia, Jing Xiong
- Abstract summary: We propose a Human Intervention (HI)-based Proximal Policy Optimization framework, dubbed HI-PPO, to enhance RDE's safety.
We introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO.
We also introduce a reward-penalty adjustment (RPA) to penalize unsafe actions during initial interventions.
- Score: 5.520042381826271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing application of automated robotic digestive endoscopy (RDE), ensuring safe and efficient navigation in the unstructured and narrow digestive tract has become a critical challenge. Existing automated reinforcement learning navigation algorithms often result in potentially risky collisions due to the absence of essential human intervention, which significantly limits the safety and effectiveness of RDE in actual clinical practice. To address this limitation, we propose a Human Intervention (HI)-based Proximal Policy Optimization (PPO) framework, dubbed HI-PPO, which incorporates expert knowledge to enhance RDE's safety. Specifically, we introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO. Additionally, a reward-penalty adjustment (RPA) is implemented to penalize unsafe actions during initial interventions. Furthermore, Behavior Cloning Similarity (BCS) is included as an auxiliary objective to ensure the agent emulates expert actions. Comparative experiments conducted on a simulation platform across various anatomical colon segments demonstrate that our model effectively and safely guides RDE.
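The abstract names the three components (EEM, RPA, BCS) without giving their exact form, so the sketch below is a hypothetical illustration, not the authors' implementation, of how a PPO-style update could combine them: an entropy bonus standing in for the exploration mechanism, a penalty subtracted from the advantage of steps where the expert intervened for the reward-penalty adjustment, and a cross-entropy term toward expert actions for the behavior-cloning similarity objective. The function and parameter names (hi_ppo_loss, penalty, bcs_coef) are assumptions, as is the discrete action space.

```python
# Minimal sketch (assumed form, not the authors' code) of a PPO-style loss that
# folds in a human-intervention penalty and a behavior-cloning similarity term.
import torch
import torch.nn.functional as F

def hi_ppo_loss(new_logits, old_log_probs, actions, advantages,
                expert_actions, intervened, clip_eps=0.2,
                penalty=1.0, bcs_coef=0.5, entropy_coef=0.01):
    """Combined surrogate loss for one batch of transitions.

    new_logits:     (B, A) action logits from the current policy
    old_log_probs:  (B,)   log-probs of `actions` under the behavior policy
    actions:        (B,)   actions actually executed
    advantages:     (B,)   GAE or Monte Carlo advantages
    expert_actions: (B,)   expert actions on intervened steps (elsewhere they can
                           simply copy `actions`; the term is masked out anyway)
    intervened:     (B,)   1.0 where the human expert overrode the agent, else 0.0
    """
    dist = torch.distributions.Categorical(logits=new_logits)
    new_log_probs = dist.log_prob(actions)

    # Assumed RPA form: subtract a penalty from the advantage of actions that
    # triggered a human intervention, discouraging the unsafe behavior.
    adjusted_adv = advantages - penalty * intervened

    # Standard PPO clipped surrogate on the adjusted advantages.
    ratio = torch.exp(new_log_probs - old_log_probs)
    surr = torch.min(ratio * adjusted_adv,
                     torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * adjusted_adv)

    # Assumed BCS form: cross-entropy toward the expert's action, applied only
    # on intervened steps.
    bc_loss = F.cross_entropy(new_logits, expert_actions, reduction="none") * intervened

    # Entropy bonus as a stand-in for the exploration mechanism.
    entropy = dist.entropy()

    return (-surr + bcs_coef * bc_loss - entropy_coef * entropy).mean()
```

In practice, the relative weights of the intervention penalty and the behavior-cloning term would need to be tuned so that expert corrections shape the policy without overwhelming the on-policy PPO signal.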
Related papers
- ETSM: Automating Dissection Trajectory Suggestion and Confidence Map-Based Safety Margin Prediction for Robot-assisted Endoscopic Submucosal Dissection [10.2380174289706]
We create the ESD Trajectory and Confidence Map-based Safety (ETSM) dataset with 1,849 short clips, focusing on submucosal dissection with a dual-arm robotic system.
We also introduce a framework that combines optimal dissection trajectory prediction with a confidence map-based safety margin.
Our approach bridges gaps in current research by improving prediction accuracy and enhancing the safety of the dissection process.
arXiv Detail & Related papers (2024-11-28T03:19:18Z) - Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction [71.81851971324187]
This work introduces Hierarchical Preference Optimization (HPO), a novel approach to hierarchical reinforcement learning (HRL).
HPO addresses non-stationarity and infeasible subgoal generation issues when solving complex robotic control tasks.
Experiments on challenging robotic navigation and manipulation tasks demonstrate impressive performance of HPO, where it shows an improvement of up to 35% over the baselines.
arXiv Detail & Related papers (2024-11-01T04:58:40Z) - ADAPT: A Game-Theoretic and Neuro-Symbolic Framework for Automated Distributed Adaptive Penetration Testing [13.101825065498552]
The integration of AI into modern critical infrastructure systems, such as healthcare, has introduced new vulnerabilities.
ADAPT is a game-theoretic and neuro-symbolic framework for automated distributed adaptive penetration testing.
arXiv Detail & Related papers (2024-10-31T21:32:17Z) - Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning [0.4218593777811082]
This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework.
In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance.
arXiv Detail & Related papers (2024-09-16T18:49:38Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - RAISE -- Radiology AI Safety, an End-to-end lifecycle approach [5.829180249228172]
The integration of AI into radiology introduces opportunities for improved clinical care provision and efficiency.
The focus should be on ensuring models meet the highest standards of safety, effectiveness and efficacy.
The roadmap presented herein aims to expedite the achievement of deployable, reliable, and safe AI in radiology.
arXiv Detail & Related papers (2023-11-24T15:59:14Z) - Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
arXiv Detail & Related papers (2023-07-06T08:14:54Z) - Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation [72.24964965882783]
Reinforcement learning (RL) is a promising approach for robotic navigation, allowing robots to learn through trial and error.
Real-world robotic tasks often suffer from sparse rewards, leading to inefficient exploration and suboptimal policies.
We introduce Confidence-Controlled Exploration (CCE), a novel method that improves sample efficiency in RL-based robotic navigation without modifying the reward function.
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - Merging Deep Learning with Expert Knowledge for Seizure Onset Zone localization from rs-fMRI in Pediatric Pharmaco Resistant Epilepsy [7.087237546722617]
Surgical resection of Seizure Onset Zones (SOZs) at an early age is an effective treatment for Pharmaco-Resistant Epilepsy (PRE).
Pre-surgical localization of SOZs with intra-cranial EEG (iEEG) requires safe and effective depth electrode placement.
DeepXSOZ is an expert-in-the-loop IC sorting technique that (a) can be configured either to significantly reduce the expert sorting workload or to operate with high sensitivity, depending on the expertise of the surgical team, and (b) can potentially enable the use of rs-fMRI as a low-cost outpatient pre-surgical screening tool.
arXiv Detail & Related papers (2023-06-08T22:07:48Z) - Safe Deep RL for Intraoperative Planning of Pedicle Screw Placement [61.28459114068828]
We propose an intraoperative planning approach for robotic spine surgery that leverages real-time observation for drill path planning based on Safe Deep Reinforcement Learning (DRL).
Our approach was capable of achieving 90% bone penetration with respect to the gold standard (GS) drill planning.
arXiv Detail & Related papers (2023-05-09T11:42:53Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - A Diver Attention Estimation Framework for Effective Underwater Human-Robot Interaction [14.267807345588581]
Recent advancements in vision-based underwater HRI methods allow AUVs to interact with their human partners without requiring assistance from a topside operator.
In these methods, the AUV assumes that the diver is ready for interaction, while in reality, the diver may be distracted.
This paper presents a diver attention estimation framework for AUVs to autonomously determine the attentiveness of a diver.
arXiv Detail & Related papers (2022-09-28T22:08:41Z) - Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider both value-based and policy-gradient Deep Reinforcement Learning (DRL) approaches.
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z) - Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - A Novel Sample-efficient Deep Reinforcement Learning with Episodic Policy Transfer for PID-Based Control in Cardiac Catheterization Robots [2.3939470784308914]
The model was validated for axial motion control of a robotic system designed for intravascular catheterization.
A performance comparison with conventional methods, averaged over 10 trials, shows that the agent tunes the gains better, achieving an error of 0.003 mm.
arXiv Detail & Related papers (2021-10-28T08:18:01Z) - Risk-Sensitive Sequential Action Control with Multi-Modal Human
Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
arXiv Detail & Related papers (2020-09-12T02:02:52Z) - BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates [13.330256356398243]
This paper introduces the software package BoXHED for nonparametrically estimating hazard functions via gradient boosting.
BoXHED is the first publicly available software implementation of the hazard estimator of Lee, Chen, and Ishwaran.
arXiv Detail & Related papers (2020-06-25T07:32:14Z) - Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints [84.09488581365484]
Phase I dose-finding trials are increasingly challenging as the relationship between efficacy and toxicity of new compounds becomes more complex.
Most commonly used methods in practice focus on identifying a Maximum Tolerated Dose (MTD) by learning only from toxicity events.
We present a novel adaptive clinical trial methodology that aims at maximizing the cumulative efficacies while satisfying the toxicity safety constraint with high probability.
arXiv Detail & Related papers (2020-06-09T03:06:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.