Checkmating One, by Using Many: Combining Mixture of Experts with MCTS
to Improve in Chess
- URL: http://arxiv.org/abs/2401.16852v2
- Date: Sat, 10 Feb 2024 09:37:50 GMT
- Title: Checkmating One, by Using Many: Combining Mixture of Experts with MCTS
to Improve in Chess
- Authors: Felix Helfenstein, Jannis Bl\"uml, Johannes Czech and Kristian
Kersting
- Abstract summary: This paper presents a new approach that integrates deep learning with computational chess, using both the Mixture of Experts (MoE) method and Monte-Carlo Tree Search (MCTS)
Our framework combines the MoE method with MCTS, in order to align it with the strategic phases of chess, thus departing from the conventional one-for-all'' model.
Our empirical research shows a substantial improvement in playing strength, surpassing the traditional single-model framework.
- Score: 20.043363738256176
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents a new approach that integrates deep learning with
computational chess, using both the Mixture of Experts (MoE) method and
Monte-Carlo Tree Search (MCTS). Our methodology employs a suite of specialized
models, each designed to respond to specific changes in the game's input data.
This results in a framework with sparsely activated models, which provides
significant computational benefits. Our framework combines the MoE method with
MCTS, in order to align it with the strategic phases of chess, thus departing
from the conventional ``one-for-all'' model. Instead, we utilize distinct game
phase definitions to effectively distribute computational tasks across multiple
expert neural networks. Our empirical research shows a substantial improvement
in playing strength, surpassing the traditional single-model framework. This
validates the efficacy of our integrated approach and highlights the potential
of incorporating expert knowledge and strategic principles into neural network
design. The fusion of MoE and MCTS offers a promising avenue for advancing
machine learning architectures.
Related papers
- Predicting Chess Puzzle Difficulty with Transformers [0.0]
We present GlickFormer, a novel transformer-based architecture that predicts chess puzzle difficulty by approximating the Glicko-2 rating system.<n>The proposed model utilizes a modified ChessFormer backbone for spatial feature extraction and incorporates temporal information via factorized transformer techniques.<n>Results demonstrate GlickFormer's superior performance compared to the state-of-the-art ChessFormer baseline across multiple metrics.
arXiv Detail & Related papers (2024-10-14T20:39:02Z) - High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z) - Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts [75.85448576746373]
We propose a method of grouping and pruning similar experts to improve the model's parameter efficiency.
We validate the effectiveness of our method by pruning three state-of-the-art MoE architectures.
The evaluation shows that our method outperforms other model pruning methods on a range of natural language tasks.
arXiv Detail & Related papers (2024-07-12T17:25:02Z) - Noise-powered Multi-modal Knowledge Graph Representation Framework [52.95468915728721]
The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph representation learning framework.
We propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking.
Our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility.
arXiv Detail & Related papers (2024-03-11T15:48:43Z) - Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z) - All by Myself: Learning Individualized Competitive Behaviour with a
Contrastive Reinforcement Learning optimization [57.615269148301515]
In a competitive game scenario, a set of agents have to learn decisions that maximize their goals and minimize their adversaries' goals at the same time.
We propose a novel model composed of three neural layers that learn a representation of a competitive game, learn how to map the strategy of specific opponents, and how to disrupt them.
Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times.
arXiv Detail & Related papers (2023-10-02T08:11:07Z) - Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent
Models in Pommerman [14.668309037894586]
In combination with Reinforcement Learning, Monte-Carlo Tree Search has shown to outperform human grandmasters in games such as Chess, Shogi and Go.
We investigate techniques that transform general-sum multiplayer games into single-player and two-player games.
arXiv Detail & Related papers (2023-05-22T16:39:20Z) - Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z) - Exploring Adaptive MCTS with TD Learning in miniXCOM [0.0]
In this work, we explore on-line adaptivity in Monte Carlo tree search (MCTS) without requiring pre-training.
We present MCTS-TD, an adaptive MCTS algorithm improved with temporal difference learning.
We demonstrate our new approach on the game miniXCOM, a popular commercial franchise consisting of several turn-based tactical games.
arXiv Detail & Related papers (2022-10-10T21:04:25Z) - No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
arXiv Detail & Related papers (2022-01-30T06:10:04Z) - An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
arXiv Detail & Related papers (2021-11-19T12:58:59Z) - An Approach for Combining Multimodal Fusion and Neural Architecture
Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model based optimization approach that combines multimodal fusion and neural architecture search within one framework.
We evaluate our methods on two public real datasets showing the discovered model is able to achieve superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z) - Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games [31.97631243571394]
We introduce a framework, LMAC, that automates the discovery of the update rule without explicit human design.
Surprisingly, even without human design, the discovered MARL algorithms achieve competitive or even better performance.
We show that LMAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO.
arXiv Detail & Related papers (2021-06-04T22:30:25Z) - Mixture of ELM based experts with trainable gating network [2.320417845168326]
We propose an ensemble learning method based on mixture of experts.
The structure of ME consists of multi layer perceptrons (MLPs) as base experts and gating network.
In the proposed method a trainable gating network is applied to aggregate the outputs of the experts.
arXiv Detail & Related papers (2021-05-25T07:13:35Z) - Dual Monte Carlo Tree Search [0.0]
We show that Dual MCTS performs better than one of the most widely used neural MCTS algorithms, AlphaZero, for various symmetric and asymmetric games.
Dual MCTS uses two different search trees, a single deep neural network, and a new update technique for the search trees using a combination of the PUCB, a sliding-window, and the epsilon-greedy algorithm.
arXiv Detail & Related papers (2021-03-21T23:34:11Z) - Model-Based Machine Learning for Communications [110.47840878388453]
We review existing strategies for combining model-based algorithms and machine learning from a high level perspective.
We focus on symbol detection, which is one of the fundamental tasks of communication receivers.
arXiv Detail & Related papers (2021-01-12T19:55:34Z) - Nested Mixture of Experts: Cooperative and Competitive Learning of
Hybrid Dynamical System [2.055949720959582]
We propose a nested mixture of experts (NMOE) for representing and learning hybrid dynamical systems.
An NMOE combines both white-box and black-box models while optimizing bias-variance trade-off.
An NMOE provides a structured method for incorporating various types of prior knowledge by training the associative experts cooperatively or competitively.
arXiv Detail & Related papers (2020-11-20T19:36:45Z) - Reinforcement Learning for Variable Selection in a Branch and Bound
Algorithm [0.10499611180329801]
We leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem.
We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task.
arXiv Detail & Related papers (2020-05-20T13:15:48Z) - Neural MMO v1.3: A Massively Multiagent Game Environment for Training
and Evaluating Neural Networks [48.5733173329785]
We present Neural MMO, a massively multiagent game environment inspired by MMOs.
We discuss our progress on two more general challenges in multiagent systems engineering for AI research: distributed infrastructure and game IO.
arXiv Detail & Related papers (2020-01-31T18:50:02Z) - Smooth markets: A basic mechanism for organizing gradient-based learners [47.34060971879986]
We introduce smooth markets (SM-games), a class of n-player games with pairwise zero sum interactions.
SM-games codify a common design pattern in machine learning that includes (some) GANs, adversarial training, and other recent algorithms.
We show that SM-games are amenable to analysis and optimization using first-order methods.
arXiv Detail & Related papers (2020-01-14T09:19:39Z) - Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.