Generating Personas for Games with Multimodal Adversarial Imitation
Learning
- URL: http://arxiv.org/abs/2308.07598v1
- Date: Tue, 15 Aug 2023 06:58:19 GMT
- Title: Generating Personas for Games with Multimodal Adversarial Imitation
Learning
- Authors: William Ahlberg, Alessandro Sestini, Konrad Tollmar, Linus Gisslén
- Abstract summary: Reinforcement learning has been widely successful in producing agents capable of playing games at a human level.
Going beyond reinforcement learning is necessary to model a wide range of human playstyles.
This paper presents a novel imitation learning approach to generate multiple persona policies for playtesting.
- Score: 47.70823327747952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning has been widely successful in producing agents capable
of playing games at a human level. However, this requires complex reward
engineering, and the agent's resulting policy is often unpredictable. Going
beyond reinforcement learning is necessary to model a wide range of human
playstyles, which can be difficult to represent with a reward function. This
paper presents a novel imitation learning approach to generate multiple persona
policies for playtesting. Multimodal Generative Adversarial Imitation Learning
(MultiGAIL) uses an auxiliary input parameter to learn distinct personas using
a single-agent model. MultiGAIL is based on generative adversarial imitation
learning and uses multiple discriminators as reward models, inferring the
environment reward by comparing the agent and distinct expert policies. The
reward from each discriminator is weighted according to the auxiliary input.
Our experimental analysis demonstrates the effectiveness of our technique in
two environments with continuous and discrete action spaces.
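The reward weighting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gail_reward` and `blended_reward` are hypothetical, the reward form -log(1 - D(s, a)) is one common GAIL convention assumed here (with D giving the probability that a state-action pair came from the expert), and normalizing the auxiliary weights so they sum to one is also an assumption.

```python
import math


def gail_reward(d_prob, eps=1e-8):
    """Reward from one discriminator output d_prob = D(s, a) in (0, 1).

    Assumed convention: D estimates the probability that (s, a) came from
    the expert, so the reward grows as the agent looks more expert-like.
    """
    return -math.log(max(1.0 - d_prob, eps))


def blended_reward(d_probs, weights):
    """Combine per-persona discriminator rewards using the auxiliary input.

    d_probs: one discriminator output per expert persona.
    weights: the auxiliary input, one non-negative weight per persona.
    Normalizing the weights is an assumption made for this sketch.
    """
    if len(d_probs) != len(weights):
        raise ValueError("need one weight per discriminator")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum((w / total) * gail_reward(p) for p, w in zip(d_probs, weights))


# The same state-action pair scored under three auxiliary inputs.
d_probs = [0.9, 0.1]  # looks like persona A, unlike persona B
for weights in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):
    print(weights, round(blended_reward(d_probs, weights), 4))
```

In this sketch, shifting the auxiliary weights from the first persona to the second lowers the reward for behavior that only the first discriminator rates as expert-like, which is how a single policy conditioned on the auxiliary input could be steered between personas.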
Related papers
- Multi-Agent Training for Pommerman: Curriculum Learning and Population-based Self-Play Approach [11.740631954398292]
Pommerman is an ideal benchmark for multi-agent training, providing a battleground for two teams with communication capabilities among allied agents.
This study introduces a system designed to train multi-agent systems to play Pommerman using a combination of curriculum learning and population-based self-play.
arXiv Detail & Related papers (2024-06-30T11:14:29Z)
- MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning [14.06682547001011]
State-of-the-art methods typically focus on learning a single reward model.
We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms.
Our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies.
arXiv Detail & Related papers (2021-12-30T19:21:03Z)
- Independent Learning in Stochastic Games [16.505046191280634]
We present the model of games for multi-agent learning in dynamic environments.
We focus on the development of simple and independent learning dynamics for games.
We present our recently proposed simple and independent learning dynamics that guarantee convergence in zero-sum games.
arXiv Detail & Related papers (2021-11-23T09:27:20Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Policy Fusion for Adaptive and Customizable Reinforcement Learning Agents [137.86426963572214]
We show how to combine distinct behavioral policies to obtain a meaningful "fusion" policy.
We propose four different policy fusion methods for combining pre-trained policies.
We provide several practical examples and use-cases for how these methods are indeed useful for video game production and designers.
arXiv Detail & Related papers (2021-04-21T16:08:44Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi \Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games [5.0238343960165155]
It is essential for an agent to learn about the behaviour of other agents in the system.
We present the first study of the effects of such opponent modelling on multi-objective multi-agent interactions with non-linear utilities.
arXiv Detail & Related papers (2020-11-14T12:35:32Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all the solutions for the famous MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Never Give Up: Learning Directed Exploration Strategies [63.19616370038824]
We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies.
We construct an episodic memory-based intrinsic reward using k-nearest neighbors over the agent's recent experience to train the directed exploratory policies.
A self-supervised inverse dynamics model is used to train the embeddings of the nearest neighbour lookup, biasing the novelty signal towards what the agent can control.
arXiv Detail & Related papers (2020-02-14T13:57:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.