SC2EGSet: StarCraft II Esport Replay and Game-state Dataset
- URL: http://arxiv.org/abs/2207.03428v1
- Date: Thu, 7 Jul 2022 16:52:53 GMT
- Title: SC2EGSet: StarCraft II Esport Replay and Game-state Dataset
- Authors: Andrzej Białecki, Natalia Jakubowska, Paweł Dobrowolski, Piotr
Białecki, Leszek Krupiński, Andrzej Szczap, Robert Białecki, Jan
Gajewski
- Abstract summary: This work aims to open esports to a broader scientific community by supplying raw and pre-processed files from StarCraft II esports tournaments.
We have gathered publicly available game-engine generated "replays" of tournament matches and performed data extraction and cleanup using a low-level application programming interface (API) parser library.
Our dataset contains replays from major and premier StarCraft II tournaments since 2016.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As a relatively new form of sport, esports offers unparalleled data
availability. Despite the vast amounts of data that are generated by game
engines, it can be challenging to extract them and verify their integrity for
the purposes of practical and scientific use.
Our work aims to open esports to a broader scientific community by supplying
raw and pre-processed files from StarCraft II esports tournaments. These files
can be used in statistical and machine learning modeling tasks and related to
various laboratory-based measurements (e.g., behavioral tests, brain imaging).
We have gathered publicly available game-engine generated "replays" of
tournament matches and performed data extraction and cleanup using a low-level
application programming interface (API) parser library.
Additionally, we open-sourced and published all the custom tools that were
developed in the process of creating our dataset. These tools include PyTorch
and PyTorch Lightning API abstractions to load and model the data.
Our dataset contains replays from major and premier StarCraft II tournaments
since 2016. To prepare the dataset, we processed 55 tournament "replaypacks"
that contained 17930 files with game-state information. Based on initial
investigation of available StarCraft II datasets, we observed that our dataset
is the largest publicly available source of StarCraft II esports data upon its
publication.
Analysis of the extracted data holds promise for further Artificial
Intelligence (AI), Machine Learning (ML), psychological, Human-Computer
Interaction (HCI), and sports-related studies in a variety of supervised and
self-supervised tasks.
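
The abstract mentions PyTorch and PyTorch Lightning abstractions for loading the data but does not show them. The following is only a minimal sketch of the general idea, not the authors' tooling: it assumes, hypothetically, that each pre-processed replay is stored as one JSON file somewhere under a root directory. The class name `ReplayJsonDataset` and the directory layout are illustrative assumptions. The class implements `__len__` and `__getitem__`, the two methods that PyTorch's map-style dataset protocol relies on, using only the standard library.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


class ReplayJsonDataset:
    """Map-style dataset over a directory of pre-processed replay files.

    Assumption (not from the paper): every replay was exported as one
    JSON file under `root`, possibly nested in per-tournament folders.
    """

    def __init__(self, root: str) -> None:
        # Sort the paths so that an index refers to the same file on every run.
        self.files: List[Path] = sorted(Path(root).rglob("*.json"))

    def __len__(self) -> int:
        return len(self.files)

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # Parse lazily: only the requested file is read from disk.
        with self.files[index].open(encoding="utf-8") as handle:
            return json.load(handle)
```

An object like this could be passed to `torch.utils.data.DataLoader`, most likely with a custom `collate_fn`, since default collation may not handle arbitrarily nested game-state records.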
Related papers
- PlayMyData: a curated dataset of multi-platform video games [2.7498981662768536]
PlayMyData is a curated dataset composed of 99,864 multi-platform games gathered from the IGDB website.
By exploiting a dedicated API, we collect relevant metadata for each game, e.g., description, genre, rating, gameplay video URLs, and screenshots.
PlayMyData can be used to foster cross-domain investigations built on top of the provided multimedia data.
arXiv Detail & Related papers (2024-01-16T18:45:38Z)
- Technical Challenges of Deploying Reinforcement Learning Agents for Game Testing in AAA Games [58.720142291102135]
We describe an effort to add an experimental reinforcement learning system to an existing automated game testing solution based on scripted bots.
We show a use-case of leveraging reinforcement learning in game production and cover some of the largest time sinks anyone who wants to make the same journey for their game may encounter.
We propose a few research directions that we believe will be valuable and necessary for making machine learning, and especially reinforcement learning, an effective tool in game production.
arXiv Detail & Related papers (2023-07-19T18:19:23Z)
- Commentary Generation from Data Records of Multiplayer Strategy Esports Game [21.133690853111133]
We build large-scale datasets that pair structured data and commentaries from a popular esports game, League of Legends.
We then evaluate Transformer-based models to generate game commentaries from structured data records.
We will release our dataset to boost potential research in the data-to-text generation community.
arXiv Detail & Related papers (2022-12-21T11:23:31Z)
- ESTA: An Esports Trajectory and Action Dataset [0.0]
We use esports data to develop machine learning models for win prediction.
Awpy is an open-source library that can extract player trajectories and actions from game logs.
ESTA is one of the largest and most granular publicly available sports data sets to date.
arXiv Detail & Related papers (2022-09-20T17:13:50Z)
- Kubric: A scalable dataset generator [73.78485189435729]
Kubric is a Python framework that interfaces with PyBullet and Blender to generate photo-realistic scenes, with rich annotations, and seamlessly scales to large jobs distributed over thousands of machines.
We demonstrate the effectiveness of Kubric by presenting a series of 13 different generated datasets for tasks ranging from studying 3D NeRF models to optical flow estimation.
arXiv Detail & Related papers (2022-03-07T18:13:59Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in the form of both 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
- AI-enabled Prediction of eSports Player Performance Using the Data from Heterogeneous Sensors [12.071865017583502]
We report on an Artificial Intelligence (AI) enabled solution for predicting eSports players' in-game performance using exclusively the data from sensors.
The player performance is assessed from the game logs in a multiplayer game for each moment of time using a recurrent neural network.
The proposed solution has a number of promising applications for Pro eSports teams as well as a learning tool for amateur players.
arXiv Detail & Related papers (2020-12-07T07:31:53Z)
- Collection and Validation of Psychophysiological Data from Professional and Amateur Players: a Multimodal eSports Dataset [7.135992354416602]
We present a dataset collected from professional and amateur teams in the League of Legends video game with more than 40 hours of recordings.
Recordings include the players' physiological activity, movements, pulse, saccades, obtained from various sensors.
An important feature of the dataset is simultaneous data collection from five players, which facilitates the analysis of sensor data on a team level.
arXiv Detail & Related papers (2020-11-02T13:25:11Z)
- LID 2020: The Learning from Imperfect Data Challenge Results [242.86700551532272]
The Learning from Imperfect Data workshop aims to inspire and facilitate research into developing novel approaches.
We organize three challenges to find the state-of-the-art approaches in the weakly supervised learning setting.
This technical report summarizes the highlights from the challenge.
arXiv Detail & Related papers (2020-10-17T13:06:12Z)
- Design and Implementation of TAG: A Tabletop Games Framework [59.60094442546867]
This document describes the design and implementation of the Tabletop Games framework (TAG).
TAG is a Java-based benchmark for developing modern board games for AI research.
arXiv Detail & Related papers (2020-09-25T07:27:30Z)
- MSC: A Dataset for Macro-Management in StarCraft II [52.52008929278214]
We release a new macro-management dataset based on the platform SC2LE.
MSC consists of well-designed feature vectors, pre-defined high-level actions and the final result of each match.
Besides the dataset, we propose a baseline model and present initial baseline results for global state evaluation and build order prediction.
arXiv Detail & Related papers (2017-10-09T14:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.