Related papers: Estimation of Skill Distributions

Estimation of Skill Distributions

URL: http://arxiv.org/abs/2006.08189v1
Date: Mon, 15 Jun 2020 07:35:37 GMT
Title: Estimation of Skill Distributions
Authors: Ali Jadbabaie and Anuran Makur and Devavrat Shah
Abstract summary: We study the problem of learning the skill distribution of a population of agents from observations of pairwise games in a tournament. We propose a simple and tractable algorithm that learns the skill density with near-optimal minimax mean squared error scaling. Our results shed light on the abundance of low quality funds prior to the Great Recession of 2008, and the domination of the industry by more skilled funds after the financial crisis.
Score: 39.29885444997579
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we study the problem of learning the skill distribution of a population of agents from observations of pairwise games in a tournament. These games are played among randomly drawn agents from the population. The agents in our model can be individuals, sports teams, or Wall Street fund managers. Formally, we postulate that the likelihoods of game outcomes are governed by the Bradley-Terry-Luce (or multinomial logit) model, where the probability of an agent beating another is the ratio between its skill level and the pairwise sum of skill levels, and the skill parameters are drawn from an unknown skill density of interest. The problem is, in essence, to learn a distribution from noisy, quantized observations. We propose a simple and tractable algorithm that learns the skill density with near-optimal minimax mean squared error scaling as $n^{-1+\varepsilon}$, for any $\varepsilon>0$, when the density is smooth. Our approach brings together prior work on learning skill parameters from pairwise comparisons with kernel density estimation from non-parametric statistics. Furthermore, we prove minimax lower bounds which establish minimax optimality of the skill parameter estimation technique used in our algorithm. These bounds utilize a continuum version of Fano's method along with a covering argument. We apply our algorithm to various soccer leagues and world cups, cricket world cups, and mutual funds. We find that the entropy of a learnt distribution provides a quantitative measure of skill, which provides rigorous explanations for popular beliefs about perceived qualities of sporting events, e.g., soccer league rankings. Finally, we apply our method to assess the skill distributions of mutual funds. Our results shed light on the abundance of low quality funds prior to the Great Recession of 2008, and the domination of the industry by more skilled funds after the financial crisis.

Related papers

Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game. Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games. We show that no algorithm attains no-regret in general-sum games when executed independently by all players. We show that our lower bounds hold even for seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
Decentralized model-free reinforcement learning in stochastic games with average-reward objective [1.9852463786440127]
We show that our algorithm achieves both sublinear high probability regret of order $T3/4$ and sublinear expected regret of order $T2/3$. Our algorithm enjoys a low computational complexity and low memory space requirement compared to the previous works.
arXiv Detail & Related papers (2023-01-13T15:59:53Z)
Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria. We show that they can be efficiently learnt in emphall games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z)
Reinforcement Learning for Mean Field Games, with Applications to Economics [0.0]
Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents. We present a two timescale approach with RL for MFG and MFC, which relies on a unified Q-learning algorithm.
arXiv Detail & Related papers (2021-06-25T16:45:04Z)
Simplified Kalman filter for online rating: one-fits-all approach [4.010371060637208]
We deal with the problem of rating in sports, where the skills of the players/teams are inferred from the observed outcomes of the games. Our focus is on the online rating algorithms which estimate the skills after each new game by exploiting the probabilistic models of the relationship between the skills and the game outcome.
arXiv Detail & Related papers (2021-04-28T20:44:10Z)
Estimating $\alpha$-Rank by Maximizing Information Gain [26.440923373794444]
Game theory has been increasingly applied in settings where the game is not known outright, but has to be estimated by sampling. In this paper, we focus on $alpha$-rank, a popular game-theoretic solution concept designed to perform well in such scenarios. We show the benefits of using information gain as compared to the confidence interval criterion of ResponseGraphUCB.
arXiv Detail & Related papers (2021-01-22T15:46:35Z)
Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games [123.76716667704625]
We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. We provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
arXiv Detail & Related papers (2020-09-21T17:51:57Z)
Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks. Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target. Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.