Related papers: BiRating -- Iterative averaging on a bipartite graph of Beat Saber scores, player skills, and map difficulties

URL: http://arxiv.org/abs/2502.19742v1
Date: Thu, 27 Feb 2025 04:07:53 GMT
Title: BiRating -- Iterative averaging on a bipartite graph of Beat Saber scores, player skills, and map difficulties
Authors: Juan Casanova
Abstract summary: Difficulty estimation of Beat Saber maps is an interesting data analysis problem and valuable to the Beat Saber competitive scene. We present a simple algorithm that iteratively averages player skill and map difficulty estimations in a bipartite graph of players and maps, connected by scores, using scores only as input.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Difficulty estimation of Beat Saber maps is an interesting data analysis problem and valuable to the Beat Saber competitive scene. We present a simple algorithm that iteratively averages player skill and map difficulty estimations in a bipartite graph of players and maps, connected by scores, using scores only as input. This approach simultaneously estimates player skills and map difficulties, using each to improve the estimation of the other and exploiting the relation between multiple scores by different players on the same map, or on different maps by the same player. While we have been unable to prove or characterize theoretical convergence, the implementation exhibits convergent behaviour to low estimation error in all instances, producing accurate results. An informal qualitative evaluation involving experienced Beat Saber community members was carried out, comparing the difficulty estimations output by our algorithm with their personal perspectives on the difficulties of different maps. There was a significant alignment with players' perceptions of difficulty and with other existing methods for estimating difficulty. Our approach showed significant improvement over existing methods in certain known problematic maps that are not typically accurately estimated, but also produces problematic estimations for certain families of maps where the assumptions on the meaning of scores were inadequate (e.g. not enough scores, or scores over-optimized by players). The algorithm has important limitations, related to data quality and meaningfulness, assumptions on the domain problem, and theoretical convergence of the algorithm. Future work would significantly benefit from a better understanding of adequate ways to quantify map difficulty in Beat Saber, including multidimensionality of skill and difficulty, and the systematic biases present in score data.
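The alternating scheme the abstract describes can be illustrated with a minimal sketch. The abstract does not give the paper's update rule, so everything specific here is an assumption made for illustration: the additive model score = skill - difficulty, the re-centering of difficulties to mean zero (needed only to pin down the otherwise arbitrary offset), the fixed iteration count, and the hypothetical function name birating_sketch. It is not the author's implementation.

```python
from collections import defaultdict


def birating_sketch(scores, iterations=200):
    """Alternately average skill and difficulty estimates over a bipartite graph.

    scores: dict mapping (player, map) -> numeric score.
    ASSUMPTION (not from the paper): score is modelled as skill - difficulty.
    Returns (skill, difficulty) as two dicts.
    """
    if not scores:
        return {}, {}

    # Adjacency lists for both sides of the bipartite graph.
    by_player = defaultdict(list)
    by_map = defaultdict(list)
    for (player, map_id), score in scores.items():
        by_player[player].append((map_id, score))
        by_map[map_id].append((player, score))

    difficulty = {map_id: 0.0 for map_id in by_map}
    skill = {player: 0.0 for player in by_player}

    for _ in range(iterations):
        # A player's skill is the average of (score + difficulty) over their maps.
        skill = {
            player: sum(score + difficulty[map_id] for map_id, score in edges) / len(edges)
            for player, edges in by_player.items()
        }
        # A map's difficulty is the average of (skill - score) over its players.
        difficulty = {
            map_id: sum(skill[player] - score for player, score in edges) / len(edges)
            for map_id, edges in by_map.items()
        }
        # ASSUMPTION: fix the free offset by centering difficulties at zero.
        mean_difficulty = sum(difficulty.values()) / len(difficulty)
        difficulty = {m: d - mean_difficulty for m, d in difficulty.items()}

    return skill, difficulty
```

Calling birating_sketch with a dict such as {("a", "x"): 1.5, ("a", "y"): 0.5} returns one estimate per player and per map. Under the assumed additive model, noise-free scores on a connected graph are recovered up to the chosen offset; real score data would not satisfy this model exactly, which is consistent with the limitations the abstract lists.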

Related papers

ProgRoCC: A Progressive Approach to Rough Crowd Counting [66.09510514180593]
We label Rough Crowd Counting that delivers better accuracy on the basis of training data that is easier to acquire. We propose an approach to the rough crowd counting problem based on CLIP, termed ProgRoCC. Specifically, we introduce a progressive estimation learning strategy that determines the object count through a coarse-to-fine approach.
arXiv Detail & Related papers (2025-04-18T01:57:42Z)
Are You Doubtful? Oh, It Might Be Difficult Then! Exploring the Use of Model Uncertainty for Question Difficulty Estimation [12.638577140117702]
We show that uncertainty features contribute substantially to difficulty prediction, where difficulty is inversely proportional to the number of students who can correctly answer a question. In addition to showing the value of our approach, we also observe that our model achieves state-of-the-art results on the BEA publicly available dataset.
arXiv Detail & Related papers (2024-12-16T14:55:09Z)
SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation [75.56845750400116]
Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems. We develop SureMap that has high estimation accuracy for both multi-task and single-task disaggregated evaluations of blackbox models. Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior together with cross-validation-free tuning via Stein's unbiased risk estimate (SURE).
arXiv Detail & Related papers (2024-11-14T17:53:35Z)
Question Difficulty Ranking for Multiple-Choice Reading Comprehension [3.273958158967657]
Multiple-choice (MC) tests are an efficient method to assess English learners. It is useful for test creators to rank candidate MC questions by difficulty during exam curation. We explore automated approaches to rank MC questions by difficulty.
arXiv Detail & Related papers (2024-04-16T16:23:10Z)
A Gold Standard Dataset for the Reviewer Assignment Problem [70.45113777449373]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper. A key challenge in comparing existing algorithms and developing better algorithms is the lack of publicly available gold-standard data. We collect a novel dataset of similarity scores that we release to the research community.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games. We show that no algorithm attains no-regret in general-sum games when executed independently by all players. We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
Plug and Play Active Learning for Object Detection [12.50247484568549]
We introduce Plug and Play Active Learning (PPAL) for object detection. PPAL is a two-stage method comprising uncertainty-based and diversity-based sampling phases. We benchmark PPAL on the MS-COCO and Pascal VOC datasets using different detector architectures.
arXiv Detail & Related papers (2022-11-21T16:13:23Z)
Personalized Game Difficulty Prediction Using Factorization Machines [0.9558392439655011]
We contribute a new approach for personalized difficulty estimation of game levels, borrowing methods from content recommendation. We are able to predict difficulty as the number of attempts a player requires to pass future game levels, based on observed attempt counts from earlier levels and levels played by others. Our results suggest that FMs are a promising tool enabling game designers to both optimize player experience and learn more about their players and the game.
arXiv Detail & Related papers (2022-09-06T08:03:46Z)
Robust estimation algorithms don't need to know the corruption level [50.31562134370949]
Robust estimation algorithms can perform well even when part of the data is corrupt. Their vast majority approach optimal accuracy only when given a tight upper bound on the fraction of corrupt data. This brief note abstracts the complex and pervasive robustness problem into a simple geometric puzzle. It applies the puzzle's solution to derive a universal meta technique.
arXiv Detail & Related papers (2022-02-11T05:18:28Z)
Are Missing Links Predictable? An Inferential Benchmark for Knowledge Graph Completion [79.07695173192472]
InferWiki improves upon existing benchmarks in inferential ability, assumptions, and patterns. Each testing sample is predictable with supportive data in the training set. In experiments, we curate two settings of InferWiki varying in sizes and structures, and apply the construction process on CoDEx as comparative datasets.
arXiv Detail & Related papers (2021-08-03T09:51:15Z)
Statistical Modelling of Level Difficulty in Puzzle Games [0.0]
We formalise a model of level difficulty for puzzle games that goes beyond the classical probability of success. The model is fitted and evaluated on a dataset collected from the game Lily's Garden by Tactile Games.
arXiv Detail & Related papers (2021-07-05T13:47:28Z)
Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning. We study whether models trained with uncertainty-based ALs are fairer in their decisions with respect to a protected class. We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
Finding Game Levels with the Right Difficulty in a Few Trials through Intelligent Trial-and-Error [16.297059109611798]
Methods for dynamic difficulty adjustment allow games to be tailored to particular players to maximize their engagement. Current methods often only modify a limited set of game features such as the difficulty of the opponents, or the availability of resources. This paper presents a method that can generate and search for complete levels with a specific target difficulty in only a few trials.
arXiv Detail & Related papers (2020-05-15T17:48:18Z)
CNN-based Density Estimation and Crowd Counting: A Survey [65.06491415951193]
This paper comprehensively studies the crowd counting models, mainly CNN-based density map estimation methods. According to the evaluation metrics, we select the top three performers on their crowd counting datasets. We expect to make reasonable inference and prediction for the future development of crowd counting.
arXiv Detail & Related papers (2020-03-28T13:17:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.