The ProfessionAl Go annotation datasEt (PAGE)
- URL: http://arxiv.org/abs/2211.01559v1
- Date: Thu, 3 Nov 2022 02:41:41 GMT
- Title: The ProfessionAl Go annotation datasEt (PAGE)
- Authors: Yifan Gao, Danni Zhang and Haoyue Li
- Abstract summary: We present the ProfessionAl Go annotation datasEt (PAGE), containing 98,525 games played by 2,007 professional players and spanning over 70 years.
The dataset includes rich AI analysis results for each move. Moreover, PAGE provides detailed metadata for every player and game after manual cleaning and labeling.
- Score: 3.1723119892509573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The game of Go has been highly under-researched due to the lack of game
records and analysis tools. In recent years, the increasing number of
professional competitions and the advent of AlphaZero-based algorithms provide
an excellent opportunity for analyzing human Go games on a large scale. In this
paper, we present the ProfessionAl Go annotation datasEt (PAGE), containing
98,525 games played by 2,007 professional players and spanning over 70 years. The
dataset includes rich AI analysis results for each move. Moreover, PAGE
provides detailed metadata for every player and game after manual cleaning and
labeling. Beyond the preliminary analysis of the dataset, we provide sample
tasks that benefit from our dataset to demonstrate the potential application of
PAGE in multiple research directions. To the best of our knowledge, PAGE is the
first dataset with extensive annotation in the game of Go. This work is an
extended version of [1], in which we provide a more detailed description, analysis,
and application.
Related papers
- BookWorm: A Dataset for Character Description and Analysis [59.186325346763184]
We define two tasks: character description, which generates a brief factual profile, and character analysis, which offers an in-depth interpretation.
We introduce the BookWorm dataset, pairing books from the Gutenberg Project with human-written descriptions and analyses.
Our findings show that retrieval-based approaches outperform hierarchical ones in both tasks.
arXiv Detail & Related papers (2024-10-14T10:55:58Z)
- LOGO: A Long-Form Video Dataset for Group Action Quality Assessment [63.53109605625047]
We construct a new multi-person long-form video dataset for action quality assessment named LOGO.
Our dataset contains 200 videos from 26 artistic swimming events, with 8 athletes in each sample and an average duration of 204.2 seconds.
As for richness in annotations, LOGO includes formation labels to depict group information of multiple athletes and detailed annotations on action procedures.
arXiv Detail & Related papers (2024-04-07T17:51:53Z)
- PlayMyData: a curated dataset of multi-platform video games [2.7498981662768536]
PlayMyData is a curated dataset composed of 99,864 multi-platform games gathered from the IGDB website.
By exploiting a dedicated API, we collect relevant metadata for each game, e.g., description, genre, rating, gameplay video URLs, and screenshots.
PlayMyData can be used to foster cross-domain investigations built on top of the provided multimedia data.
arXiv Detail & Related papers (2024-01-16T18:45:38Z)
- Graph Encoding and Neural Network Approaches for Volleyball Analytics: From Game Outcome to Individual Play Predictions [5.399740513992854]
We introduce a specialized graph encoding technique to add contact-by-contact volleyball context to an already available volleyball dataset.
We demonstrate the potential benefits of using graph neural networks (GNNs) on this enriched dataset for three different volleyball prediction tasks.
Our results show that the use of GNNs with our graph encoding yields a much more advanced analysis of the data.
arXiv Detail & Related papers (2023-08-22T02:51:42Z)
- GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration [97.68234051078997]
We discuss how Pyserini can be integrated with the Hugging Face ecosystem of open-source AI libraries and artifacts.
We include a Jupyter Notebook-based walk through the core interoperability features, available on GitHub.
We present GAIA Search - a search engine built following previously laid out principles, giving access to four popular large-scale text collections.
arXiv Detail & Related papers (2023-06-02T12:09:59Z)
- Sports Video Analysis on Large-Scale Data [10.24207108909385]
This paper investigates the modeling of automated machine description on sports video.
We propose a novel large-scale NBA dataset for Sports Video Analysis (NSVA) with a focus on captioning.
arXiv Detail & Related papers (2022-08-09T16:59:24Z)
- SC2EGSet: StarCraft II Esport Replay and Game-state Dataset [0.0]
This work aims to open esports to a broader scientific community by supplying raw and pre-processed files from StarCraft II esports tournaments.
We gathered publicly available game-engine generated "replays" of tournament matches and performed data extraction using a low-level application programming interface (API) library.
Our dataset contains replays from major and premier StarCraft II tournaments since 2016.
arXiv Detail & Related papers (2022-07-07T16:52:53Z)
- PGD: A Large-scale Professional Go Dataset for Data-driven Analytics [3.747666374070152]
This paper creates the Professional Go dataset, containing 98,043 games played by 2,148 professional players from 1950 to 2021.
The dataset includes analysis results for each move in the match evaluated by advanced AlphaZero-based AI.
With the help of complete meta-information and constructed in-game features, our results prediction system achieves an accuracy of 75.30%.
arXiv Detail & Related papers (2022-04-30T12:53:04Z)
- Few-Shot Learning on Graphs: A Survey [92.47605211946149]
Graph representation learning has attracted tremendous attention due to its remarkable performance in many real-world applications.
Semi-supervised graph representation learning models for specific tasks often suffer from label sparsity.
Few-shot learning on graphs (FSLG) has been proposed to tackle the performance degradation that occurs when annotated data is limited.
arXiv Detail & Related papers (2022-03-17T13:21:11Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
- Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named "COIN" for COmprehensive INstructional video analysis.
The COIN dataset contains 11,827 videos of 180 tasks in 12 domains related to our daily life.
With a newly developed toolbox, all the videos are annotated efficiently with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.