Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning based Local Motion Planners
- URL: http://arxiv.org/abs/2410.12232v1
- Date: Wed, 16 Oct 2024 04:46:21 GMT
- Title: Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning based Local Motion Planners
- Authors: Wen Zheng Terence Ng, Jianda Chen, Sinno Jialin Pan, Tianwei Zhang
- Abstract summary: Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements.
We introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective.
In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors.
- Score: 36.684452789236914
- Abstract: Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and could suffer from the over-fitting issue. Alternatively, framing the collision avoidance problem as a multi-agent framework, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians due to their homogeneity. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
Related papers
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z) - Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers [23.15190337027283]
We propose Robust Multi-Agent Coordination via Generation of Auxiliary Adversarial Attackers (ROMANCE).
ROMANCE enables the trained policy to encounter diversified and strong auxiliary adversarial attacks during training, thus achieving high robustness under various policy perturbations.
The goal of quality is to minimize the ego-system coordination effect, and a novel diversity regularizer is applied to diversify the behaviors among attackers.
arXiv Detail & Related papers (2023-05-10T05:29:47Z) - Robust and Versatile Bipedal Jumping Control through Reinforcement Learning [141.56016556936865]
This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.
We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions.
We develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history.
arXiv Detail & Related papers (2023-02-19T01:06:09Z) - ForceFormer: Exploring Social Force and Transformer for Pedestrian Trajectory Prediction [3.5163219821672618]
We propose a new goal-based trajectory predictor called ForceFormer.
We leverage the driving force from the destination to efficiently simulate the guidance of a target on a pedestrian.
Our proposed method achieves on-par performance measured by distance errors with the state-of-the-art models.
arXiv Detail & Related papers (2023-02-15T10:54:14Z) - An Energy-aware and Fault-tolerant Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems [0.5008597638379226]
We propose an approach based on model-free, deep multi-agent reinforcement learning.
Agents are trained to patrol an environment with various unknown dynamics and factors.
They can automatically recharge themselves to support continuous collective patrolling.
This architecture provides a patrolling system that can tolerate agent failures and allow supplementary agents to be added to replace failed agents or to increase the overall patrol performance.
arXiv Detail & Related papers (2022-12-16T01:38:35Z) - Enhanced method for reinforcement learning based dynamic obstacle avoidance by assessment of collision risk [0.0]
This paper proposes a general training environment where we gain control over the difficulty of the obstacle avoidance task.
We found that shifting the training towards a greater task difficulty can massively increase the final performance.
arXiv Detail & Related papers (2022-12-08T07:46:42Z) - Influencing Towards Stable Multi-Agent Interactions [12.477674452685756]
Learning in multi-agent environments is difficult due to the non-stationarity introduced by an opponent's or partner's changing behaviors.
We propose an algorithm to proactively influence the other agent's strategy to stabilize.
We demonstrate the effectiveness of stabilizing in improving efficiency of maximizing the task reward in a variety of simulated environments.
arXiv Detail & Related papers (2021-10-05T16:46:04Z) - Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation.
Domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics.
We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
arXiv Detail & Related papers (2021-03-26T07:14:01Z) - Learning Latent Representations to Influence Multi-Agent Interaction [65.44092264843538]
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy.
We show that our approach outperforms the alternatives and learns to influence the other agent.
arXiv Detail & Related papers (2020-11-12T19:04:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.