SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning
- URL: http://arxiv.org/abs/2407.17460v2
- Date: Thu, 06 Feb 2025 18:55:45 GMT
- Title: SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning
- Authors: Jianpeng Yao, Xiaopan Zhang, Yu Xia, Zejin Wang, Amit K. Roy-Chowdhury, Jiachen Li
- Abstract summary: SoNIC is the first algorithm that integrates adaptive conformal inference and constrained reinforcement learning.
Our method achieves a success rate of 96.93%, which is 11.67% higher than the previous state-of-the-art RL method.
Our experiments demonstrate that the system can generate robust and socially polite decision-making when interacting with both sparse and dense crowds.
- Score: 26.554847852013737
- License:
- Abstract: Reinforcement learning (RL) enables social robots to generate trajectories without relying on human-designed rules or interventions, making it generally more effective than rule-based systems in adapting to complex, dynamic real-world scenarios. However, social navigation is a safety-critical task that requires robots to avoid collisions with pedestrians, whereas existing RL-based solutions often fall short of ensuring safety in complex environments. In this paper, we propose SoNIC, which to the best of our knowledge is the first algorithm that integrates adaptive conformal inference (ACI) with constrained reinforcement learning (CRL) to enable safe policy learning for social navigation. Specifically, our method not only augments RL observations with ACI-generated nonconformity scores, which inform the agent of the quantified uncertainty, but also employs these uncertainty estimates to effectively guide the behaviors of RL agents by using constrained reinforcement learning. This integration regulates the behaviors of RL agents and enables them to handle safety-critical situations. On the standard CrowdNav benchmark, our method achieves a success rate of 96.93%, which is 11.67% higher than the previous state-of-the-art RL method and results in 4.5 times fewer collisions and 2.8 times fewer intrusions into ground-truth human future trajectories, as well as enhanced robustness in out-of-distribution scenarios. To further validate our approach, we deploy our algorithm on a real robot by developing a ROS2-based navigation system. Our experiments demonstrate that the system can generate robust and socially polite decision-making when interacting with both sparse and dense crowds. The video demos can be found on our project website: https://sonic-social-nav.github.io/.
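The abstract describes the recipe only at a high level: an ACI-calibrated nonconformity quantile quantifies prediction uncertainty, augments the RL observation, and drives a cost signal that a constrained-RL update would penalize. The following is a minimal Python sketch of that recipe under stated assumptions; the function names, the Euclidean nonconformity score, and the intrusion-style cost are illustrative choices, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): how ACI-style uncertainty can feed a
# constrained-RL objective. All names (aci_update, constraint_cost, etc.) are
# hypothetical illustrations of the general recipe described in the abstract.
import numpy as np

def aci_update(alpha_t, target_miscoverage, step_size, covered):
    """Adaptive conformal inference: raise the miscoverage level alpha_t when the
    last prediction region covered the pedestrian's true position, lower it when
    the region missed (online update in the style of adaptive conformal inference)."""
    return alpha_t + step_size * (target_miscoverage - (0.0 if covered else 1.0))

def nonconformity_score(pred_pos, true_pos):
    """Euclidean prediction error used as the nonconformity score."""
    return float(np.linalg.norm(np.asarray(pred_pos) - np.asarray(true_pos)))

def conformal_radius(past_scores, alpha_t):
    """Empirical (1 - alpha_t)-quantile of past scores: radius of the region the
    policy should keep clear of around each predicted pedestrian position."""
    q = min(max(1.0 - alpha_t, 0.0), 1.0)
    return float(np.quantile(past_scores, q)) if past_scores else 0.0

def constraint_cost(robot_pos, pred_pos, radius, robot_radius=0.3):
    """Cost signal for constrained RL: positive when the robot intrudes into the
    uncertainty-inflated region around a predicted pedestrian position."""
    dist = float(np.linalg.norm(np.asarray(robot_pos) - np.asarray(pred_pos)))
    return max(0.0, (radius + robot_radius) - dist)

# Toy per-step computation: the score history and adapted alpha shape both the
# observation augmentation and the cost passed to the constrained-RL update.
scores, alpha = [], 0.1
pred, true, robot = (1.0, 0.0), (1.2, 0.1), (1.5, 0.0)
s = nonconformity_score(pred, true)
r = conformal_radius(scores, alpha)          # region size from past errors
alpha = aci_update(alpha, 0.1, 0.05, covered=(s <= r))
scores.append(s)
cost = constraint_cost(robot, pred, r)       # fed to the constrained-RL update
print(f"score={s:.3f}  radius={r:.3f}  alpha={alpha:.3f}  cost={cost:.3f}")
```

In a full training loop, such per-step costs would be accumulated per episode and kept below a budget via a Lagrange multiplier or a similar CRL mechanism; only the per-step quantities are shown here.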
Related papers
- Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation [1.74102619371866]
We propose modifications to the learning process that encourage RL policies to maintain additional caution in unfamiliar situations.
Compared to the original SARL policy, our modified policy maintains similar navigation times and path lengths, while reducing the number of collisions by 82%.
We also describe how to apply these modifications to other RL policies and demonstrate that some key high-level behaviors of our approach transfer to a physical robot.
arXiv Detail & Related papers (2024-07-08T15:58:33Z)
- REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world.
Recent methods aim to mitigate misalignment by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic (a minimal sketch of this structure appears after this list).
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
- Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning [3.923354711049903]
We propose a distributional reinforcement learning framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility.
We show that our algorithm learns policies that can still drive reliably when the perception noise is two times higher than in the training configuration for automated merging and crossing at occluded intersections.
arXiv Detail & Related papers (2021-07-15T13:36:55Z)
- Rule-Based Reinforcement Learning for Efficient Robot Navigation with Space Reduction [8.279526727422288]
In this paper, we focus on efficient navigation with the reinforcement learning (RL) technique.
We employ a reduction rule to shrink the trajectory, which in turn effectively reduces the redundant exploration space.
Experiments conducted on real robot navigation problems in hex-grid environments demonstrate that the proposed rule-based RL (RuRL) approach achieves improved navigation performance.
arXiv Detail & Related papers (2021-04-15T07:40:27Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
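The multiplicative value function summarized in the 2023-03-07 entry above combines two critics by multiplication: a safety critic that estimates the probability of remaining violation-free scales a reward critic that estimates constraint-free return. Below is a minimal PyTorch sketch of that structure under stated assumptions; the class name, layer sizes, and observation dimension are illustrative, not the paper's architecture.

```python
# Minimal sketch (assumed structure, not the paper's code) of a multiplicative
# value function: a safety critic estimating the probability of staying
# constraint-free scales a reward critic estimating constraint-free return.
import torch
import torch.nn as nn

class MultiplicativeValue(nn.Module):
    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        # Reward critic: expected return assuming no constraint violation.
        self.reward_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        # Safety critic: probability of remaining violation-free (sigmoid output).
        self.safety_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Value used for policy improvement: return discounted by safety probability.
        return self.safety_critic(obs) * self.reward_critic(obs)

value_fn = MultiplicativeValue(obs_dim=8)
print(value_fn(torch.randn(4, 8)).shape)  # torch.Size([4, 1])
```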