Mastering Chinese Chess AI (Xiangqi) Without Search
- URL: http://arxiv.org/abs/2410.04865v1
- Date: Mon, 7 Oct 2024 09:27:51 GMT
- Title: Mastering Chinese Chess AI (Xiangqi) Without Search
- Authors: Yu Chen, Juntong Lin, Zhichao Shu,
- Abstract summary: We have developed a high-performance Chinese Chess AI that operates without reliance on search algorithms.
This AI has demonstrated the capability to compete at a level commensurate with the top 0.1% of human players.
- Score: 2.309569018066392
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We have developed a high-performance Chinese Chess AI that operates without reliance on search algorithms. This AI has demonstrated the capability to compete at a level commensurate with the top 0.1% of human players. By eliminating the search process typically associated with such systems, this AI achieves a Queries Per Second (QPS) rate that exceeds those of systems based on the Monte Carlo Tree Search (MCTS) algorithm by over a thousandfold and surpasses those based on the AlphaBeta pruning algorithm by more than a hundredfold. The AI training system consists of two parts: supervised learning and reinforcement learning. Supervised learning provides an initial human-like Chinese chess AI, while reinforcement learning, building on the supervised model, elevates the strength of the entire AI to a new level. Using this training system, we carried out extensive ablation experiments and found that: 1. At the same parameter count, the Transformer architecture outperforms the CNN on Chinese chess; 2. Encoding the possible moves of both sides as input features greatly improves the training process; 3. A selective opponent pool, compared to pure self-play training, yields a faster improvement curve and a higher strength ceiling; 4. Value Estimation with Cutoff (VECT) improves the original PPO training process, and we explain why.
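The search-free design described in the abstract can be illustrated with a minimal sketch: instead of running MCTS rollouts, the trained policy network maps a board encoding directly to move scores, so choosing a move costs a single forward pass. Everything here is an assumption for illustration, the paper does not specify its network, feature layout, or move-table size; `POLICY_DIM`, the random weight matrix, and the 90-dimensional feature vector (one value per point on the 9x10 Xiangqi board) are all hypothetical stand-ins.

```python
import numpy as np

# Hypothetical sketch of search-free move selection. POLICY_DIM is an
# assumed size of a fixed move table (one logit per candidate move);
# the real network architecture and dimensions are not given here.
POLICY_DIM = 2086

rng = np.random.default_rng(0)
# Stand-in for a trained policy network: a single linear layer.
W = rng.standard_normal((90, POLICY_DIM)) * 0.01

def select_move(board_features: np.ndarray, legal_mask: np.ndarray) -> int:
    """Pick the highest-scoring legal move with one forward pass (no search)."""
    logits = board_features @ W        # single inference step, no rollouts
    logits[~legal_mask] = -np.inf      # forbid illegal moves
    return int(np.argmax(logits))

# Usage: a dummy 90-point board encoding and a mask with three legal moves.
features = rng.standard_normal(90)
mask = np.zeros(POLICY_DIM, dtype=bool)
mask[[3, 100, 777]] = True
move = select_move(features, mask)
assert mask[move]  # the chosen move is always legal
```

Because inference is one matrix product rather than thousands of simulated playouts, this shape of pipeline is what makes the reported thousandfold QPS advantage over MCTS-based systems plausible.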
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that spire outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z) - Maia-2: A Unified Model for Human-AI Alignment in Chess [10.577896749797485]
We propose a unified modeling approach for human-AI alignment in chess.
We introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions.
Our results pave the way for deeper insights into human decision-making and AI-guided teaching tools.
arXiv Detail & Related papers (2024-09-30T17:54:23Z) - Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z) - DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
To further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z) - Double A3C: Deep Reinforcement Learning on OpenAI Gym Games [0.0]
Reinforcement Learning (RL) is an area of machine learning concerned with how agents take actions in an unknown environment to maximize their rewards.
We propose and implement an improved Double A3C algorithm that combines the strengths of both underlying algorithms to play OpenAI Gym Atari 2600 games and beat their benchmarks.
arXiv Detail & Related papers (2023-03-04T00:06:27Z) - Instructive artificial intelligence (AI) for human training, assistance, and explainability [0.24629531282150877]
We show how a neural network might instruct human trainees as an alternative to traditional approaches to explainable AI (XAI)
An AI examines human actions and calculates variations on the human strategy that lead to better performance.
Results will be presented on AI instruction's ability to improve human decision-making and human-AI teaming in Hanabi.
arXiv Detail & Related papers (2021-11-02T16:46:46Z) - Method for making multi-attribute decisions in wargames by combining intuitionistic fuzzy numbers with reinforcement learning [18.04026817707759]
The article proposes an algorithm that combines the multi-attribute management and reinforcement learning methods.
It solves the problem of the agent's low win rate against specific rule-based opponents and its inability to converge quickly during intelligent wargame training.
It is the first time in this field that an algorithm design for intelligent wargaming combines multi-attribute decision making with reinforcement learning.
arXiv Detail & Related papers (2021-09-06T10:45:52Z) - The MineRL BASALT Competition on Learning from Human Feedback [58.17897225617566]
The MineRL BASALT competition aims to spur forward research on this important class of techniques.
We design a suite of four tasks in Minecraft for which we expect it to be hard to write hardcoded reward functions.
We provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline.
arXiv Detail & Related papers (2021-07-05T12:18:17Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - AutoML-Zero: Evolving Machine Learning Algorithms From Scratch [76.83052807776276]
We show that it is possible to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks.
We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space.
We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction in the field.
arXiv Detail & Related papers (2020-03-06T19:00:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.