Mixed Strategy Nash Equilibrium for Crowd Navigation
- URL: http://arxiv.org/abs/2403.01537v6
- Date: Mon, 25 Nov 2024 22:58:40 GMT
- Title: Mixed Strategy Nash Equilibrium for Crowd Navigation
- Authors: Max Muchen Sun, Francesca Baldini, Katie Hughes, Peter Trautman, Todd Murphey,
- Abstract summary: Game theory provides a framework for the robot to reason about potential cooperation from humans for collision avoidance during path planning.
Mixed strategy Nash equilibrium captures the negotiation behavior under uncertainty, making it well suited for crowd navigation.
We name our algorithm Bayesian Recursive Nash Equilibrium (BRNE) and develop a real-time model prediction crowd navigation framework.
- Score: 0.41942958779358663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robots navigating in crowded areas should negotiate free space with humans rather than fully controlling collision avoidance, as this can lead to freezing behavior. Game theory provides a framework for the robot to reason about potential cooperation from humans for collision avoidance during path planning. In particular, the mixed strategy Nash equilibrium captures the negotiation behavior under uncertainty, making it well suited for crowd navigation. However, computing the mixed strategy Nash equilibrium is often prohibitively expensive for real-time decision-making. In this paper, we propose an iterative Bayesian update scheme over probability distributions of trajectories. The algorithm simultaneously generates a stochastic plan for the robot and probabilistic predictions of other pedestrians' paths. We prove that the proposed algorithm is equivalent to solving a mixed strategy game for crowd navigation, and the algorithm guarantees the recovery of the global Nash equilibrium of the game. We name our algorithm Bayesian Recursive Nash Equilibrium (BRNE) and develop a real-time model prediction crowd navigation framework. Since BRNE is not solving a general-purpose mixed strategy Nash equilibrium but a tailored formula specifically for crowd navigation, it can compute the solution in real-time on a low-power embedded computer. We evaluate BRNE in both simulated environments and real-world pedestrian datasets. BRNE consistently outperforms non-learning and learning-based methods regarding safety and navigation efficiency. It also reaches human-level crowd navigation performance in the pedestrian dataset benchmark. Lastly, we demonstrate the practicality of our algorithm with real humans on an untethered quadruped robot with fully onboard perception and computation.
Related papers
- From Obstacles to Etiquette: Robot Social Navigation with VLM-Informed Path Selection [57.74400052368147]
This paper presents a social robot navigation framework that integrates geometric planning with contextual social reasoning.<n>The system first extracts obstacles and human dynamics to generate geometrically feasible candidate paths, then leverages a fine-tuned vision-language model (VLM) to evaluate these paths.<n>Experiments in four social navigation contexts demonstrate that our method achieves the best overall performance with the lowest personal space violation duration, the minimal pedestrian-facing time, and no social zone intrusions.
arXiv Detail & Related papers (2026-02-09T18:46:12Z) - Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees [91.88803125231189]
Multi-step Preference Optimization (MPO) is built upon the natural actor-critic frameworkciteprakhlin2013online,joulani17a.
We show that OMPO requires $mathcalO(epsilon-1)$ policy updates to converge to an $epsilon$-approximate Nash equilibrium.
We also validate the effectiveness of our method on multi-turn conversations dataset and math reasoning dataset.
arXiv Detail & Related papers (2025-02-18T09:33:48Z) - Generalizability of Graph Neural Networks for Decentralized Unlabeled Motion Planning [72.86540018081531]
Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance.
This problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation.
We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets.
arXiv Detail & Related papers (2024-09-29T23:57:25Z) - Uncertainty-Aware DRL for Autonomous Vehicle Crowd Navigation in Shared Space [3.487370856323828]
This work introduces an integrated prediction and planning approach that incorporates the uncertainties of predicted pedestrian states in the training of a model-free DRL algorithm.
A novel reward function encourages the AV to respect pedestrians' personal space, decrease speed during close approaches, and minimize the collision probability with their predicted paths.
Results show a 40% decrease in collision rate and a 15% increase in minimum distance to pedestrians compared to the state of the art model that does not account for prediction uncertainty.
arXiv Detail & Related papers (2024-05-22T20:09:21Z) - Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning.
We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures.
We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z) - REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world.
Recent methods aim to mitigate misalignment by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z) - Decentralized Social Navigation with Non-Cooperative Robots via Bi-Level
Optimization [11.638394339813154]
This paper presents a fully decentralized approach for realtime non-cooperative multi-robot navigation in social mini-games.
Our contribution is a new realtime bi-level optimization algorithm, in which the top-level optimization consists of computing a fair and collision-free ordering.
We successfully deploy the proposed algorithm in the real world using F$1/10$ robots, a Clearpath Jackal, and a Boston Dynamics Spot.
arXiv Detail & Related papers (2023-06-15T02:18:21Z) - Finding mixed-strategy equilibria of continuous-action games without
gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - SocialVAE: Human Trajectory Prediction using Timewise Latents [4.640835690336652]
SocialVAE is a timewise variational autoencoder architecture that exploits posterior neural networks to perform prediction.
We show that SocialVAE improves current state-of-the-art pedestrian trajectory prediction benchmarks.
arXiv Detail & Related papers (2022-03-15T19:14:33Z) - Efficient Model-based Multi-agent Reinforcement Learning via Optimistic
Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z) - SABER: Data-Driven Motion Planner for Autonomously Navigating
Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
recurrent neural networks are used to provide a quick estimate of future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
arXiv Detail & Related papers (2021-08-03T02:56:21Z) - XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision
Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z) - Towards a Systematic Computational Framework for Modeling Multi-Agent
Decision-Making at Micro Level for Smart Vehicles in a Smart World [8.899670429041453]
We propose a multi-agent based computational framework for modeling decision-making and strategic interaction at micro level for smart vehicles.
Our aim is to make the framework conceptually sound and practical for a range of realistic applications, including micro path planning for autonomous vehicles.
arXiv Detail & Related papers (2020-09-25T13:05:28Z) - Spatial-Temporal Block and LSTM Network for Pedestrian Trajectories
Prediction [0.0]
In this paper, we propose a novel LSTM-based algorithm for trajectory prediction.
We tackle the problem by considering the static scene and pedestrian.
It is LSTM that encodes the relationship so that our model predicts nodes trajectories in crowd scenarios simultaneously.
arXiv Detail & Related papers (2020-09-22T11:43:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.