OpenGuanDan: A Large-Scale Imperfect Information Game Benchmark
- URL: http://arxiv.org/abs/2602.00676v1
- Date: Sat, 31 Jan 2026 11:46:29 GMT
- Title: OpenGuanDan: A Large-Scale Imperfect Information Game Benchmark
- Authors: Chao Li, Shangdong Yang, Chiheng Zhan, Zhenxing Ge, Yujing Hu, Bingkun Bao, Xingguo Chen, Yang Gao
- Abstract summary: OpenGuanDan is a novel benchmark that enables both efficient simulation of GuanDan and comprehensive evaluation of both learning-based and rule-based AI agents. OpenGuanDan poses a suite of nontrivial challenges, including imperfect information, large-scale information set and action spaces, a mixed learning objective involving cooperation and competition, long-horizon decision-making, variable action spaces, and dynamic team composition. We conduct two types of evaluations: (1) pairwise competitions among all GuanDan AI agents, and (2) human-AI matchups.
- Score: 31.554414017099102
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The advancement of data-driven artificial intelligence (AI), particularly machine learning, heavily depends on large-scale benchmarks. Despite remarkable progress across domains ranging from pattern recognition to intelligent decision-making in recent decades, exemplified by breakthroughs in board games, card games, and electronic sports games, there remains a pressing need for more challenging benchmarks to drive further research. To this end, this paper proposes OpenGuanDan, a novel benchmark that enables both efficient simulation of GuanDan (a popular four-player, multi-round Chinese card game) and comprehensive evaluation of both learning-based and rule-based GuanDan AI agents. OpenGuanDan poses a suite of nontrivial challenges, including imperfect information, large-scale information set and action spaces, a mixed learning objective involving cooperation and competition, long-horizon decision-making, variable action spaces, and dynamic team composition. These characteristics make it a demanding testbed for existing intelligent decision-making methods. Moreover, the independent API for each player allows human-AI interactions and supports integration with large language models. Empirically, we conduct two types of evaluations: (1) pairwise competitions among all GuanDan AI agents, and (2) human-AI matchups. Experimental results demonstrate that while current learning-based agents substantially outperform rule-based counterparts, they still fall short of achieving superhuman performance, underscoring the need for continued research in the multi-agent intelligent decision-making domain. The project is publicly available at https://github.com/GameAI-NJUPT/OpenGuanDan.
Related papers
- Decision Making under Imperfect Recall: Algorithms and Benchmarks [77.12503122836422]
We introduce the first benchmark suite for imperfect-recall decision problems. Our benchmarks capture a variety of problem types, including ones concerning privacy in AI systems. We evaluate the performance of different algorithms for finding first-order optimal strategies in such problems.
arXiv Detail & Related papers (2026-02-16T23:19:01Z)
- Ad-Hoc Human-AI Coordination Challenge [6.933020756939985]
We introduce the Ad-Hoc Human-AI Coordination Challenge (AH2AC2) to overcome the constraints of costly and difficult-to-reproduce human evaluations. We develop human proxy agents on a large-scale human dataset that serve as robust, cheap, and reproducible human-like evaluation partners.
arXiv Detail & Related papers (2025-06-26T17:19:52Z)
- AGI Is Coming... Right After AI Learns to Play Wordle [4.2909314120969855]
Multimodal agents, in particular OpenAI's Computer-User Agent (CUA), are trained to control and complete tasks through a standard computer interface, similar to humans. We evaluated the agent's performance on the New York Times Wordle game to elicit model behaviors and identify shortcomings.
arXiv Detail & Related papers (2025-04-21T20:58:58Z)
- DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
To further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has always been a hot topic in the research of artificial intelligence.
In this paper, we are devoted to developing an AI program for a more complex card game, GuanDan.
We propose DanZero, the first AI program for GuanDan, using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z)
- DIAMBRA Arena: a New Reinforcement Learning Platform for Research and Experimentation [91.3755431537592]
This work presents DIAMBRA Arena, a new platform for reinforcement learning research and experimentation.
It features a collection of high-quality environments exposing a Python API fully compliant with the OpenAI Gym standard.
The environments are episodic tasks with discrete actions and observations composed of raw pixels plus additional numerical values.
arXiv Detail & Related papers (2022-10-19T14:39:10Z)
- TiKick: Toward Playing Multi-agent Football Full Games from Single-agent Demonstrations [31.596018856092513]
To the best of our knowledge, TiKick is the first learning-based AI system that can take over the multi-agent Google Research Football full game.
arXiv Detail & Related papers (2021-10-09T08:34:58Z)
- OpenHoldem: An Open Toolkit for Large-Scale Imperfect-Information Game Research [82.09426894653237]
OpenHoldem is an integrated toolkit for large-scale imperfect-information game research using NLTH.
OpenHoldem makes three main contributions to this research direction: 1) a standardized evaluation protocol for thoroughly evaluating different NLTH AIs, 2) three publicly available strong baselines for NLTH AI, and 3) an online testing platform with easy-to-use APIs for public NLTH AI evaluation.
arXiv Detail & Related papers (2020-12-11T07:24:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.