Optimizing Mario Adventures in a Constrained Environment
- URL: http://arxiv.org/abs/2312.14963v1
- Date: Thu, 14 Dec 2023 08:45:26 GMT
- Title: Optimizing Mario Adventures in a Constrained Environment
- Authors: Sanyam Jain
- Abstract summary: We learn to play Super Mario Bros. (SMB) using Genetic Algorithm (MarioGA) and NeuroEvolution (MarioNE) techniques.
We formalise the SMB agent to maximise the total value of collected coins (reward) and the total distance travelled (reward).
We provide a fivefold comparative analysis by plotting fitness plots, ability to finish different levels of World 1, and domain adaptation (transfer learning) of the trained models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This project proposes and compares two approaches to optimising the Super Mario Bros. (SMB) environment: a Genetic Algorithm (MarioGA) and NeuroEvolution (MarioNE). We not only learn to play SMB with these techniques but also optimise play under constraints on coin collection and level completion. First, for both algorithms, we formalise the SMB agent to maximise the total value of collected coins (reward) and the total distance travelled (reward) while finishing the level faster (time penalty). Second, we study MarioGA and its evaluation function (fitness criteria), including its representation methods, crossover and mutation operators, selection method, main GA loop, and a few other parameters. Third, MarioNE is applied to SMB: a population of ANNs with random weights is generated, and these networks control Mario's actions in the game. Fourth, SMB is further constrained so that the agent must complete the task within a specified time, keep rebirths (deaths) within a limit, and act within the maximum allowed number of moves, all while seeking to maximise the total coin value collected; this encourages efficient completion of SMB levels. Finally, we provide a fivefold comparative analysis by plotting fitness curves, the ability to finish the different levels of World 1, and domain adaptation (transfer learning) of the trained models.
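To make the objective and constraints described in the abstract concrete, below is a minimal sketch of how a constrained fitness function and a MarioGA-style loop could be written. The reward weights, constraint limits, action set, and emulator interface are illustrative assumptions, not the paper's actual implementation.

```python
import random

# Illustrative reward weights and constraint limits; placeholders, not the paper's values.
COIN_WEIGHT = 10.0        # reward per unit of coin value collected
DIST_WEIGHT = 1.0         # reward per unit of distance travelled
TIME_PENALTY = 0.5        # penalty per in-game time step, encouraging faster finishes
MAX_TIME = 400            # assumed time limit per level
MAX_DEATHS = 3            # assumed limit on rebirths (deaths)
MAX_MOVES = 2000          # assumed cap on actions per episode
ACTIONS = list(range(7))  # assumed discrete action set (button combinations)


def fitness(episode):
    """Constrained fitness for one play-through.

    `episode` is assumed to be a summary dict such as
    {"coins": 12, "distance": 1530, "time": 310, "deaths": 1, "moves": 900}.
    Runs that violate any constraint are rejected outright.
    """
    if (episode["time"] > MAX_TIME or episode["deaths"] > MAX_DEATHS
            or episode["moves"] > MAX_MOVES):
        return float("-inf")
    return (COIN_WEIGHT * episode["coins"]
            + DIST_WEIGHT * episode["distance"]
            - TIME_PENALTY * episode["time"])


def evolve(population, evaluate, generations=50, tournament_k=3, mutation_rate=0.05):
    """Generic GA loop in the spirit of MarioGA.

    `population` is a list of fixed-length action sequences, and
    `evaluate(genome)` plays the genome in the emulator and returns the
    episode summary dict used by `fitness` (both are assumed interfaces).
    """
    for _ in range(generations):
        scored = [(fitness(evaluate(g)), g) for g in population]
        next_gen = []
        while len(next_gen) < len(population):
            # Tournament selection of two parents by fitness.
            p1 = max(random.sample(scored, tournament_k))[1]
            p2 = max(random.sample(scored, tournament_k))[1]
            # One-point crossover followed by per-gene mutation.
            cut = random.randrange(1, len(p1))
            child = p1[:cut] + p2[cut:]
            child = [random.choice(ACTIONS) if random.random() < mutation_rate else a
                     for a in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=lambda g: fitness(evaluate(g)))
```

MarioNE would reuse the same constrained fitness but evolve genomes encoding ANN weight vectors that map the game state to actions, rather than fixed action sequences.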
Related papers
- Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning [105.7510838453122]
We examine online safe multi-agent reinforcement learning using constrained Markov games.
We develop an upper confidence reinforcement learning algorithm to solve the resulting Lagrangian problem.
Our algorithm updates the minimax decision primal variables via online mirror descent and the dual variable via a projected gradient step.
arXiv Detail & Related papers (2023-05-31T22:09:24Z)
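As a rough illustration of the primal-dual scheme that summary describes (primal policy updated by online mirror descent, dual multiplier by a projected gradient step), here is a generic sketch of one constrained-RL update for a single player's policy at one state. The value estimates, step sizes, and multiplier bound are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def primal_dual_step(policy, lam, q_reward, q_cost, budget,
                     eta_pi=0.1, eta_lam=0.05, lam_max=10.0):
    """One generic primal-dual update for Lagrangian constrained RL.

    policy   : action probabilities for one state (primal variable)
    lam      : Lagrange multiplier for the cost constraint (dual variable)
    q_reward : estimated action values for the reward
    q_cost   : estimated action values for the constrained cost
    budget   : allowed expected cost
    All quantities and step sizes here are placeholders for illustration.
    """
    # Primal: online mirror descent with the entropic mirror map
    # (a multiplicative-weights / softmax update) on q_reward - lam * q_cost.
    logits = np.log(policy) + eta_pi * (q_reward - lam * q_cost)
    new_policy = np.exp(logits - logits.max())
    new_policy /= new_policy.sum()

    # Dual: projected gradient ascent on the multiplier, kept in [0, lam_max].
    expected_cost = float(new_policy @ q_cost)
    new_lam = float(np.clip(lam + eta_lam * (expected_cost - budget), 0.0, lam_max))
    return new_policy, new_lam
```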
- SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, to read the game's original academic paper and use the knowledge learned to reason and play the game through a large language model (LLM).
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
- RAMario: Experimental Approach to Reptile Algorithm -- Reinforcement Learning for Mario [0.0]
We implement the Reptile algorithm in Python using the Super Mario Bros library, creating a neural network model.
We train the model over multiple tasks and episodes: actions are chosen with the current network, executed in the environment, and the model is updated using the Reptile algorithm.
Our results demonstrate that the Reptile algorithm provides a promising approach to few-shot learning in video game AI, with comparable or even better performance than the other two algorithms.
arXiv Detail & Related papers (2023-05-16T17:54:14Z)
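The training loop in the RAMario summary follows the standard Reptile meta-update: move the meta-parameters a small step toward parameters adapted on each sampled task. A minimal sketch of that outer loop is given below; the task sampling, inner-loop training, and step size are illustrative assumptions rather than RAMario's exact setup.

```python
import numpy as np

def reptile(meta_weights, sample_task, inner_train, meta_iters=1000, epsilon=0.1):
    """Generic Reptile outer loop (illustrative, not RAMario's exact code).

    meta_weights : 1-D array of network parameters shared across tasks
    sample_task  : () -> task, e.g. a particular SMB level or level segment
    inner_train  : (weights, task) -> weights adapted to that task over a few episodes
    """
    for _ in range(meta_iters):
        task = sample_task()
        adapted = inner_train(meta_weights.copy(), task)
        # Reptile meta-update: step the meta-parameters toward the adapted ones.
        meta_weights = meta_weights + epsilon * (adapted - meta_weights)
    return meta_weights
```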
- MarioGPT: Open-Ended Text2Level Generation through Large Language Models [20.264940262622282]
Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way.
Here, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels.
arXiv Detail & Related papers (2023-02-12T19:12:24Z)
- ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets.
We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile.
arXiv Detail & Related papers (2023-01-20T23:55:30Z)
- Improving Deep Localized Level Analysis: How Game Logs Can Help [0.9645196221785693]
We present novel improvements to affect prediction by using a deep convolutional neural network (CNN) to predict player experience.
We test our approach on levels based on Super Mario Bros. (Infinite Mario Bros.) and Super Mario Bros.: The Lost Levels (Gwario).
arXiv Detail & Related papers (2022-12-07T00:05:16Z)
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z)
- Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network [11.055580854275474]
We show how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics.
An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.
arXiv Detail & Related papers (2020-07-11T03:38:06Z)
- Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [66.94149388181343]
We present a new version of a popular $Q$-learning algorithm for MARL.
We show that it can recover the optimal policy even with access to $Q^*$.
We also demonstrate improved performance on predator-prey and challenging multi-agent StarCraft benchmark tasks.
arXiv Detail & Related papers (2020-06-18T18:34:50Z)
- Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
arXiv Detail & Related papers (2020-05-28T13:47:09Z)
- A Game Theoretic Framework for Model Based Reinforcement Learning [39.45066100705418]
Model-based reinforcement learning (MBRL) has recently gained immense interest due to its potential for sample efficiency and ability to incorporate off-policy data.
We develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player.
Our framework is consistent with and provides a clear basis for gradients known to be important in practice from prior works.
arXiv Detail & Related papers (2020-04-16T17:51:45Z)
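To make the two-player framing in that summary concrete, here is a hedged sketch of the alternating loop it describes: a model player fitting the data collected in the real environment, and a policy player improving against the learned model. The interfaces and update order are illustrative assumptions, not the paper's algorithm.

```python
def mbrl_game(collect_data, fit_model, improve_policy, policy, model, iters=100):
    """Alternating loop for the two-player MBRL game (illustrative sketch only).

    collect_data(policy)          -> list of real-environment transitions
    fit_model(model, data)        -> model player's move: fit dynamics to the data
    improve_policy(policy, model) -> policy player's move: maximise reward under
                                     the learned model (planner or policy gradient)
    """
    data = []
    for _ in range(iters):
        data += collect_data(policy)             # real-world data from the current policy
        model = fit_model(model, data)           # model player fits the collected data
        policy = improve_policy(policy, model)   # policy player responds to the model
    return policy, model
```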