Related papers: PGD: A Large-scale Professional Go Dataset for Data-driven Analytics

URL: http://arxiv.org/abs/2205.00254v1
Date: Sat, 30 Apr 2022 12:53:04 GMT
Title: PGD: A Large-scale Professional Go Dataset for Data-driven Analytics
Authors: Yifan Gao
Abstract summary: This paper creates the Professional Go dataset, containing 98,043 games played by 2,148 professional players from 1950 to 2021. The dataset includes analysis results for each move in the match evaluated by advanced AlphaZero-based AI. With the help of complete meta-information and constructed in-game features, our results prediction system achieves an accuracy of 75.30%.
Score: 3.747666374070152
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Lee Sedol is on a winning streak--does this legend rise again after the competition with AlphaGo? Ke Jie is invincible in the world championship--can he still win the title this time? Go is one of the most popular board games in East Asia, with a stable professional sports system that has lasted for decades in China, Japan, and Korea. There are mature data-driven analysis technologies for many sports, such as soccer, basketball, and esports. However, developing such technology for Go remains nontrivial and challenging due to the lack of datasets, meta-information, and in-game statistics. This paper creates the Professional Go Dataset (PGD), containing 98,043 games played by 2,148 professional players from 1950 to 2021. After manual cleaning and labeling, we provide detailed meta-information for each player, game, and tournament. Moreover, the dataset includes analysis results for each move in the match evaluated by advanced AlphaZero-based AI. To establish a benchmark for PGD, we further analyze the data and extract meaningful in-game features based on prior knowledge related to Go that can indicate the game status. With the help of complete meta-information and constructed in-game features, our results prediction system achieves an accuracy of 75.30%, much higher than several state-of-the-art approaches (64%-65%). As far as we know, PGD is the first dataset for data-driven analytics in Go and even in board games. Beyond this promising result, we provide more examples of tasks that benefit from our dataset. The ultimate goal of this paper is to bridge this ancient game and the modern data science community. It will advance research on Go-related analytics to enhance the fan experience, help players improve their ability, and facilitate other promising aspects. The dataset will be made publicly available.

Related papers

The Leaderboard Illusion [61.27964089648608]
Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. We identify systematic issues that have resulted in a distorted playing field.
arXiv Detail & Related papers (2025-04-29T15:48:49Z)
A Framework for Spatio-Temporal Graph Analytics In Field Sports [43.148818844265236]
We present an approach to construct Time-Window Spatial Activity Graphs (TWGs) for field sports. Using GPS data obtained from Gaelic Football matches we demonstrate how our approach can be utilised.
arXiv Detail & Related papers (2024-05-31T15:28:03Z)
Impact of a Batter in ODI Cricket Implementing Regression Models from Match Commentary [0.0]
This paper seeks to understand the conundrum behind this impactful performance by determining how much control a player has over the circumstances. We collected data for the entire One Day International career of 3 prominent cricket players: Rohit G Sharma, David A Warner, and Kane S Williamson. We used Multiple Linear Regression (MLR), Polynomial Regression, Support Vector Regression (SVR), Decision Tree Regression, and Random Forest Regression on each player's data individually to train them and predict the Impact the player will have on the game.
arXiv Detail & Related papers (2023-02-22T06:42:20Z)
The ProfessionAl Go annotation datasEt (PAGE) [3.1723119892509573]
We present the ProfessionAl Go annotation datasEt (PAGE), containing 98,525 games played by 2,007 professional players and spanning over 70 years. The dataset includes rich AI analysis results for each move. Moreover, PAGE provides detailed metadata for every player and game after manual cleaning and labeling.
arXiv Detail & Related papers (2022-11-03T02:41:41Z)
DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has always been a hot topic in artificial intelligence research. In this paper, we develop an AI program for a more complex card game, GuanDan. We propose DanZero, the first AI program for GuanDan, built using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z)
ESTA: An Esports Trajectory and Action Dataset [0.0]
We use esports data to develop machine learning models for win prediction. Awpy is an open-source library that can extract player trajectories and actions from game logs. ESTA is one of the largest and most granular publicly available sports data sets to date.
arXiv Detail & Related papers (2022-09-20T17:13:50Z)
GCN-WP -- Semi-Supervised Graph Convolutional Networks for Win Prediction in Esports [84.55775845090542]
We propose a semi-supervised win prediction model for esports based on graph convolutional networks. GCN-WP integrates over 30 features about the match and players and employs graph convolution to classify games based on their neighborhood. Our model achieves state-of-the-art prediction accuracy when compared to machine learning or skill rating models for LoL.
arXiv Detail & Related papers (2022-07-26T21:38:07Z)
SC2EGSet: StarCraft II Esport Replay and Game-state Dataset [0.0]
This work aims to open esports to a broader scientific community by supplying raw and pre-processed files from StarCraft II esports tournaments. We collected publicly available game-engine generated "replays" of tournament matches and performed data extraction using a low-level application programming interface (API) library. Our dataset contains replays from major and premiere StarCraft II tournaments since 2016.
arXiv Detail & Related papers (2022-07-07T16:52:53Z)
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification [126.85096257968414]
We construct benchmarks that test the abilities of modern natural language understanding models. In this work, we propose gamification as a framework for data construction.
arXiv Detail & Related papers (2022-01-14T06:49:15Z)
Game Plan: What AI can do for Football, and What Football can do for AI [83.79507996785838]
Predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. We illustrate that football analytics is a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI.
arXiv Detail & Related papers (2020-11-18T10:26:02Z)
Collection and Validation of Psychophysiological Data from Professional and Amateur Players: a Multimodal eSports Dataset [7.135992354416602]
We present a dataset collected from professional and amateur teams in the League of Legends video game, with more than 40 hours of recordings. Recordings include the players' physiological activity, movements, pulse, and saccades, obtained from various sensors. An important feature of the dataset is simultaneous data collection from five players, which facilitates the analysis of sensor data on a team level.
arXiv Detail & Related papers (2020-11-02T13:25:11Z)
Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport [51.20042288437171]
We propose a Two-Stage Spatial-Temporal Network (TSSTN) that can provide accurate real-time win predictions. Experiment results and applications in real-world live streaming scenarios showed that the proposed TSSTN model is effective both in prediction accuracy and interpretability.
arXiv Detail & Related papers (2020-08-14T12:00:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.