Towards Objective Metrics for Procedurally Generated Video Game Levels
- URL: http://arxiv.org/abs/2201.10334v1
- Date: Tue, 25 Jan 2022 14:13:50 GMT
- Title: Towards Objective Metrics for Procedurally Generated Video Game Levels
- Authors: Michael Beukman, Steven James and Christopher Cleghorn
- Abstract summary: We introduce two simulation-based evaluation metrics to measure the diversity and difficulty of generated levels.
We demonstrate that our diversity metric is more robust to changes in level size and representation than current methods.
The difficulty metric shows promise, as it correlates with existing estimates of difficulty in one of the tested domains, but it does face some challenges in the other domain.
- Score: 2.320417845168326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With increasing interest in procedural content generation by academia and
game developers alike, it is vital that different approaches can be compared
fairly. However, evaluating procedurally generated video game levels is often
difficult, due to the lack of standardised, game-independent metrics. In this
paper, we introduce two simulation-based evaluation metrics that involve
analysing the behaviour of an A* agent to measure the diversity and difficulty
of generated levels in a general, game-independent manner. Diversity is
calculated by comparing action trajectories from different levels using the
edit distance, and difficulty is measured as how much exploration and expansion
of the A* search tree is necessary before the agent can solve the level. We
demonstrate that our diversity metric is more robust to changes in level size
and representation than current methods and additionally measures factors that
directly affect playability, instead of focusing on visual information. The
difficulty metric shows promise, as it correlates with existing estimates of
difficulty in one of the tested domains, but it does face some challenges in
the other domain. Finally, to promote reproducibility, we publicly release our
evaluation framework.
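As a rough illustration of the two ideas in the abstract, the sketch below computes an edit-distance diversity score over action trajectories and counts A* node expansions as a difficulty proxy. It is not the authors' released framework. The 4-connected grid world, the Manhattan heuristic, the normalisation by the longer trajectory, the use of a mean over all pairs, and every function name are assumptions made for this example.

```python
import heapq
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two action sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def diversity(trajectories):
    """Mean pairwise edit distance, each divided by the longer length.

    The normalisation and the mean are assumptions for this sketch.
    """
    pairs = list(combinations(trajectories, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        longest = max(len(a), len(b))
        total += edit_distance(a, b) / longest if longest else 0.0
    return total / len(pairs)


# Hypothetical action set for a toy grid world: (row delta, column delta).
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def astar(grid, start, goal):
    """A* on a 4-connected grid where '#' marks a wall.

    Returns (actions, expansions); actions is None if no path exists.
    The expansion count serves as a simple difficulty proxy here.
    """
    def h(p):
        # Manhattan distance, admissible for unit-cost 4-connected moves
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start, "")]
    best_cost = {start: 0}
    expansions = 0
    while frontier:
        _, g, pos, path = heapq.heappop(frontier)
        if g > best_cost.get(pos, float("inf")):
            continue  # stale entry superseded by a cheaper route
        if pos == goal:
            return path, expansions
        expansions += 1
        for action, (dr, dc) in MOVES.items():
            r, c = pos[0] + dr, pos[1] + dc
            in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[r])
            if in_bounds and grid[r][c] != "#":
                new_g = g + 1
                if new_g < best_cost.get((r, c), float("inf")):
                    best_cost[(r, c)] = new_g
                    heapq.heappush(
                        frontier, (new_g + h((r, c)), new_g, (r, c), path + action)
                    )
    return None, expansions
```

A possible use is to run `astar` on each generated level, pass the resulting action strings to `diversity`, and compare expansion counts across levels. Because both measures depend only on the agent's behaviour, the same code applies to any game that exposes a discrete action set.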
Related papers
- Perceptual Similarity for Measuring Decision-Making Style and Policy Diversity in Games [28.289135305943056]
Defining and measuring decision-making styles, also known as playstyles, is crucial in gaming.
We introduce three enhancements to increase accuracy: multiscale analysis with varied state granularity, a perceptual kernel rooted in psychology, and the utilization of the intersection-over-union method for efficient evaluation.
Our findings improve the measurement of end-to-end game analysis and the evolution of artificial intelligence for diverse playstyles.
arXiv Detail & Related papers (2024-08-12T10:55:42Z) - POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basis of the primary evaluation indicators.
The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
arXiv Detail & Related papers (2024-07-20T16:37:21Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Preference-conditioned Pixel-based AI Agent For Game Testing [1.5059676044537105]
Game-testing AI agents that learn by interaction with the environment have the potential to mitigate these challenges.
This paper proposes an agent design that mainly depends on pixel-based state observations while exploring the environment conditioned on a user's preference.
Our agent significantly outperforms state-of-the-art pixel-based game testing agents over exploration coverage and test execution quality when evaluated on a complex open-world environment resembling many aspects of real AAA games.
arXiv Detail & Related papers (2023-08-18T04:19:36Z) - Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z) - Ordinal Regression for Difficulty Estimation of StepMania Levels [18.944506234623862]
We formalize and analyze the difficulty prediction task on StepMania levels as an ordinal regression (OR) task.
We evaluate many competitive OR and non-OR models, demonstrating that neural network-based models significantly outperform the state of the art.
We conclude with a user experiment showing our trained models' superiority over human labeling.
arXiv Detail & Related papers (2023-01-23T15:30:01Z) - Generating Game Levels of Diverse Behaviour Engagement [2.5739833468005595]
Experimental studies on Super Mario Bros. indicate that using the same evaluation metrics but agents with different personas can generate levels for a particular persona.
It implies that, for simple games, using a game-playing agent of a specific player archetype as a level tester is probably all we need to generate levels of diverse behaviour engagement.
arXiv Detail & Related papers (2022-07-05T15:08:12Z) - Modeling Content Creator Incentives on Algorithm-Curated Platforms [76.53541575455978]
We study how algorithmic choices affect the existence and character of (Nash) equilibria in exposure games.
We propose tools for numerically finding equilibria in exposure games, and illustrate results of an audit on the MovieLens and LastFM datasets.
arXiv Detail & Related papers (2022-06-27T08:16:59Z) - Procedural Content Generation using Neuroevolution and Novelty Search for Diverse Video Game Levels [2.320417845168326]
Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios.
However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content.
We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate diverse levels in real time.
arXiv Detail & Related papers (2022-04-14T12:54:32Z) - Uncertainty-aware Score Distribution Learning for Action Quality Assessment [91.05846506274881]
We propose an uncertainty-aware score distribution learning (USDL) approach for action quality assessment (AQA).
Specifically, we regard an action as an instance associated with a score distribution, which describes the probability of different evaluated scores.
Under the circumstance where fine-grained score labels are available, we devise a multi-path uncertainty-aware score distributions learning (MUSDL) method to explore the disentangled components of a score.
arXiv Detail & Related papers (2020-06-13T15:41:29Z) - Towards Universal Representation Learning for Deep Face Recognition [106.21744671876704]
We propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge.
Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace.
arXiv Detail & Related papers (2020-02-26T23:29:57Z)