Reward Function Design for Crowd Simulation via Reinforcement Learning
- URL: http://arxiv.org/abs/2309.12841v1
- Date: Fri, 22 Sep 2023 12:55:30 GMT
- Title: Reward Function Design for Crowd Simulation via Reinforcement Learning
- Authors: Ariel Kwiatkowski, Vicky Kalogeiton, Julien Pettré, Marie-Paule Cani
- Abstract summary: Reinforcement learning has shown great potential in simulating virtual crowds, but the design of the reward function is critical to achieving effective and efficient results.
We provide theoretical insights on the validity of certain reward functions according to their analytical properties, and evaluate them empirically using a range of scenarios.
Our findings can inform the development of new crowd simulation techniques, and contribute to the wider study of human-like navigation.
- Score: 12.449513548800466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crowd simulation is important for video-game design, since it makes it
possible to populate virtual worlds with autonomous avatars that navigate in a human-like
manner. Reinforcement learning has shown great potential in simulating virtual
crowds, but the design of the reward function is critical to achieving
effective and efficient results. In this work, we explore the design of reward
functions for reinforcement learning-based crowd simulation. We provide
theoretical insights on the validity of certain reward functions according to
their analytical properties, and evaluate them empirically across a range of
scenarios, using energy efficiency as the metric. Our experiments show that
directly minimizing energy usage is a viable strategy as long as it is
paired with an appropriately scaled guiding potential, and enable us to study
the impact of the different reward components on the behavior of the simulated
crowd. Our findings can inform the development of new crowd simulation
techniques, and contribute to the wider study of human-like navigation.
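As a rough illustration of the reward structure described above (a sketch in the spirit of the abstract, not the authors' exact formulation), a per-step reward might combine a locomotion energy cost with a potential-based guiding term. The energy constants, function names, and scale parameter alpha below are assumptions.

```python
import numpy as np

# Illustrative constants for an energy model of the form e_s + e_w * |v|^2;
# the values are assumptions, not taken from the paper.
E_S = 2.23  # baseline metabolic cost per step
E_W = 1.26  # velocity-dependent cost coefficient

def energy_cost(velocity):
    """Per-step locomotion energy for a 2D velocity vector."""
    return E_S + E_W * float(np.dot(velocity, velocity))

def potential(position, goal):
    """Guiding potential: negative Euclidean distance to the goal."""
    return -float(np.linalg.norm(goal - position))

def step_reward(pos, next_pos, velocity, goal, alpha=1.0, gamma=0.99):
    """Negative energy plus a scaled potential-based shaping term
    F = gamma * phi(s') - phi(s), where alpha sets the shaping scale."""
    shaping = gamma * potential(next_pos, goal) - potential(pos, goal)
    return -energy_cost(velocity) + alpha * shaping
```

In this potential-based form, the shaping term leaves the optimal policies unchanged, so the scale alpha mainly affects how strongly the agent is guided during learning.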
Related papers
- Human Simulacra: Benchmarking the Personification of Large Language Models [38.21708264569801]
Large language models (LLMs) are recognized as systems that closely mimic aspects of human intelligence.
This paper introduces a framework for constructing virtual characters' life stories from the ground up.
Experimental results demonstrate that our constructed simulacra can produce personified responses that align with their target characters.
arXiv Detail & Related papers (2024-02-28T09:11:14Z)
- User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the behaviors simulated by our method closely match those of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z)
- Understanding reinforcement learned crowds [9.358303424584902]
Reinforcement Learning methods are used to animate virtual agents, but doing so involves many arbitrary design choices.
It is not obvious what their real impact is, or how they affect the results.
We analyze some of these choices in terms of their impact on learning performance.
arXiv Detail & Related papers (2022-09-19T20:47:49Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observations of its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers [52.30336730712544]
We introduce a deep reinforcement learning architecture whose purpose is to increase sample efficiency without sacrificing performance.
We propose a visually attentive model that uses transformers to learn a self-attention mechanism over the feature maps of the state representation; a minimal sketch of this idea appears after this entry.
We demonstrate empirically that this architecture improves sample efficiency in several Atari environments, while also achieving better performance in some of the games.
arXiv Detail & Related papers (2022-02-01T19:03:03Z)
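The following is a minimal, hypothetical Python (PyTorch) sketch of the idea in the entry above: treating each spatial cell of a convolutional feature map as a token and applying self-attention over those tokens. The class name, head count, and layer arrangement are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FeatureMapSelfAttention(nn.Module):
    """Hypothetical sketch: multi-head self-attention over the spatial
    cells of a CNN feature map (channels must divide by num_heads)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # feature_map: (batch, channels, height, width)
        b, c, h, w = feature_map.shape
        tokens = feature_map.flatten(2).transpose(1, 2)  # (b, h*w, c)
        attended, _ = self.attn(tokens, tokens, tokens)  # self-attention
        tokens = self.norm(tokens + attended)            # residual + norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

# Example: attend over an 11x11 feature map with 64 channels.
x = torch.randn(8, 64, 11, 11)
y = FeatureMapSelfAttention(64)(x)  # same shape as x
```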
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
- Intuitive Physics Guided Exploration for Sample Efficient Sim2real Transfer [42.23861067181556]
This paper focuses on learning task-specific estimates of latent factors which allow approximation of real world trajectories in an ideal simulation environment.
We first introduce intuitive action groupings based on human physics knowledge and experience, which are then used to design novel strategies for interacting with the real environment.
We demonstrate our approach on a range of physics-based tasks, and show that it achieves superior performance relative to other baselines, using only a limited number of real-world interactions.
arXiv Detail & Related papers (2021-04-18T10:03:26Z)
- Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) provides powerful tools for solving complex robotic tasks.
However, RL policies trained in simulation often do not transfer directly to the real world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.