Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks
- URL: http://arxiv.org/abs/2408.10556v2
- Date: Fri, 22 Nov 2024 02:46:30 GMT
- Title: Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks
- Authors: Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji
- Abstract summary: We propose Hokoff, a comprehensive set of pre-collected datasets that covers offline RL and offline MARL.
This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game.
We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game.
- Score: 59.50879251101105
- License:
- Abstract: The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short in their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehensive set of pre-collected datasets that covers both offline RL and offline MARL, accompanied by a robust framework, to facilitate further research. This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game known for its intricate nature, closely resembling real-life situations. Utilizing this framework, we benchmark a variety of offline RL and offline MARL algorithms. We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game. We reveal the incompetency of current offline RL approaches in handling task complexity, generalization and multi-task learning.
Related papers
- ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories [27.5648276335047]
Training autonomous agents with sparse rewards is a long-standing problem in online reinforcement learning (RL).
We propose a novel approach that leverages offline data to learn a generative diffusion model, called Adaptive Trajectory Diffuser (ATraDiff).
ATraDiff consistently achieves state-of-the-art performance across a variety of environments, with particularly pronounced improvements in complicated settings.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- Offline Fictitious Self-Play for Competitive Games [34.445740191223614]
This paper introduces Off-FSP, the first practical model-free offline RL algorithm for competitive games.
arXiv Detail & Related papers (2024-02-29T11:36:48Z)
- H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps [31.608209251850553]
We develop a new algorithm, called H2O+, which offers great flexibility to bridge various choices of offline and online learning methods.
We demonstrate superior performance and flexibility over advanced cross-domain online and offline RL algorithms.
arXiv Detail & Related papers (2023-09-22T08:58:22Z)
- CLUE: Calibrated Latent Guidance for Offline Reinforcement Learning [31.49713012907863]
We introduce Calibrated Latent gUidancE (CLUE), which utilizes a conditional variational auto-encoder to learn a latent space.
We instantiate the expert-driven intrinsic rewards in sparse-reward offline RL tasks, offline imitation learning (IL) tasks, and unsupervised offline RL tasks.
arXiv Detail & Related papers (2023-06-23T09:57:50Z)
- Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning [93.99377042564919]
This paper tries to build more flexible constraints for value estimation without impeding the exploration of potential advantages.
The key idea is to leverage off-the-shelf RL simulators, which can be easily interacted with in an online manner, as the "test bed" for offline policies.
We introduce CoWorld, a model-based RL approach that mitigates cross-domain discrepancies in state and reward spaces.
arXiv Detail & Related papers (2023-05-24T15:45:35Z)
- Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning [66.43003402281659]
A central question boils down to how to efficiently utilize online data collection to strengthen and complement the offline dataset.
We design a three-stage hybrid RL algorithm that beats the best of both worlds -- pure offline RL and pure online RL.
The proposed algorithm does not require any reward information during data collection.
arXiv Detail & Related papers (2023-05-17T15:17:23Z)
- Offline Equilibrium Finding [40.08360411502593]
We aim to generalize Offline RL to a multi-agent or multiplayer-game setting.
Very little research has been done in this area, as the progress is hindered by the lack of standardized datasets and meaningful benchmarks.
Our two model-based algorithms -- OEF-PSRO and OEF-CFR -- are adaptations of the widely-used equilibrium finding algorithms Deep CFR and PSRO in the context of offline learning.
arXiv Detail & Related papers (2022-07-12T03:41:06Z)
- Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm that learns policies from data using a form of critic-regularized regression (CRR).
We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
arXiv Detail & Related papers (2020-06-26T17:50:26Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.