Checkmating One, by Using Many: Combining Mixture of Experts with MCTS
to Improve in Chess
- URL: http://arxiv.org/abs/2401.16852v2
- Date: Sat, 10 Feb 2024 09:37:50 GMT
- Title: Checkmating One, by Using Many: Combining Mixture of Experts with MCTS
to Improve in Chess
- Authors: Felix Helfenstein, Jannis Blüml, Johannes Czech and Kristian Kersting
- Abstract summary: This paper presents a new approach that integrates deep learning with computational chess, using both the Mixture of Experts (MoE) method and Monte-Carlo Tree Search (MCTS).
Our framework combines the MoE method with MCTS to align it with the strategic phases of chess, thus departing from the conventional "one-for-all" model.
Our empirical research shows a substantial improvement in playing strength, surpassing the traditional single-model framework.
- Score: 20.043363738256176
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents a new approach that integrates deep learning with
computational chess, using both the Mixture of Experts (MoE) method and
Monte-Carlo Tree Search (MCTS). Our methodology employs a suite of specialized
models, each designed to respond to specific changes in the game's input data.
This results in a framework with sparsely activated models, which provides
significant computational benefits. Our framework combines the MoE method with
MCTS, in order to align it with the strategic phases of chess, thus departing
from the conventional "one-for-all" model. Instead, we utilize distinct game
phase definitions to effectively distribute computational tasks across multiple
expert neural networks. Our empirical research shows a substantial improvement
in playing strength, surpassing the traditional single-model framework. This
validates the efficacy of our integrated approach and highlights the potential
of incorporating expert knowledge and strategic principles into neural network
design. The fusion of MoE and MCTS offers a promising avenue for advancing
machine learning architectures.
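The abstract does not spell out the concrete phase definitions or model interfaces, so the following is only a minimal sketch of the idea as stated: a router assigns each position to one of several phase-specific experts, and only that expert is evaluated at an MCTS leaf. All class names, the move-number/material phase rule, and its thresholds are illustrative assumptions rather than the authors' implementation.
```python
# Illustrative sketch only, not the authors' implementation: route each position
# to a phase-specific expert during MCTS leaf evaluation instead of querying a
# single "one-for-all" network. The phase rule, thresholds, and class names
# below are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Position = Dict  # stand-in for an engine's board representation

@dataclass
class Expert:
    """One specialized policy/value network (stubbed here)."""
    name: str
    evaluate: Callable[[Position], Tuple[Dict[str, float], float]]  # -> (policy, value)

def game_phase(position: Position) -> str:
    """Hypothetical phase rule based on move number and remaining material."""
    if position.get("move_number", 1) <= 10:
        return "opening"
    return "endgame" if position.get("non_pawn_material", 0) <= 13 else "middlegame"

class PhaseRoutedEvaluator:
    """Sparse MoE-style dispatch: exactly one expert runs per evaluated node."""
    def __init__(self, experts: Dict[str, Expert]):
        self.experts = experts

    def evaluate(self, position: Position) -> Tuple[Dict[str, float], float]:
        return self.experts[game_phase(position)].evaluate(position)

# Usage with stub experts; a real engine would plug in trained networks and call
# evaluator.evaluate() at every MCTS leaf expansion.
def make_stub(tag: str) -> Expert:
    return Expert(tag, lambda pos: ({"e2e4": 1.0}, 0.0))

evaluator = PhaseRoutedEvaluator({p: make_stub(p) for p in ("opening", "middlegame", "endgame")})
policy, value = evaluator.evaluate({"move_number": 24, "non_pawn_material": 9})
```
Because only one expert network is evaluated per node, the per-node cost stays close to that of a single-model engine, which is the sparse-activation benefit the abstract refers to.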
Related papers
- High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z)
- Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts [75.85448576746373]
We propose a method of grouping and pruning similar experts to improve the model's parameter efficiency.
We validate the effectiveness of our method by pruning three state-of-the-art MoE architectures.
The evaluation shows that our method outperforms other model pruning methods on a range of natural language tasks.
arXiv Detail & Related papers (2024-07-12T17:25:02Z)
- Noise-powered Multi-modal Knowledge Graph Representation Framework [52.95468915728721]
The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph representation learning framework.
We propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking.
Our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility.
arXiv Detail & Related papers (2024-03-11T15:48:43Z)
- Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z)
- An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well-known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
arXiv Detail & Related papers (2021-11-19T12:58:59Z)
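As a rough sketch of the correspondence described in this summary (with notation introduced here: server parameters \phi, client parameters \theta_k, client data D_k, K clients, prior variance \sigma^2), combining an isotropic Gaussian prior centered at the server model with hard EM gives:
```latex
% Hard E-step: each client computes a MAP estimate under the server-provided prior
\theta_k^{\star} = \arg\max_{\theta} \Big[ \log p(D_k \mid \theta) + \log \mathcal{N}(\theta \mid \phi, \sigma^2 I) \Big]
% M-step: the server update maximizes the prior's likelihood over the client solutions
\phi \leftarrow \arg\max_{\phi} \sum_{k=1}^{K} \log \mathcal{N}(\theta_k^{\star} \mid \phi, \sigma^2 I)
            = \frac{1}{K} \sum_{k=1}^{K} \theta_k^{\star}
```
The M-step reduces to averaging the client solutions, matching FedAvg's server-side aggregation (here with equal client weights), while the Gaussian prior in the E-step acts as a proximal term keeping each client close to the current server model.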
- An Approach for Combining Multimodal Fusion and Neural Architecture Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model-based optimization approach that combines multimodal fusion and neural architecture search within one framework.
We evaluate our method on two public real-world datasets, showing that the discovered model achieves superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z)
- Mixture of ELM based experts with trainable gating network [2.320417845168326]
We propose an ensemble learning method based on a mixture of experts (ME).
The structure of the ME consists of multilayer perceptrons (MLPs) as base experts and a gating network.
In the proposed method, a trainable gating network is applied to aggregate the outputs of the experts.
arXiv Detail & Related papers (2021-05-25T07:13:35Z)
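A minimal sketch of such a gating mechanism, using a softmax gate to form a convex combination of expert outputs; the layer sizes, the use of NumPy, and the random linear stand-ins for the ELM-based experts are illustrative assumptions:
```python
# Illustrative sketch of softmax-gated aggregation of expert outputs;
# real experts would be trained ELM/MLP models rather than random linear maps.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, n_experts = 8, 3, 4

experts = [rng.normal(size=(n_in, n_out)) for _ in range(n_experts)]  # stub experts
gate_w = rng.normal(size=(n_in, n_experts))                           # trainable gate weights

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mixture_forward(x):
    g = softmax(x @ gate_w)                    # one weight per expert, summing to 1
    outs = np.stack([x @ w for w in experts])  # every expert's output, shape (n_experts, n_out)
    return np.tensordot(g, outs, axes=1)       # gate-weighted aggregation

y = mixture_forward(rng.normal(size=n_in))
```
In the paper's setting the gate is trainable, so gate_w would be learned jointly with the experts rather than sampled randomly as in this stub.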
- Model-Based Machine Learning for Communications [110.47840878388453]
We review existing strategies for combining model-based algorithms and machine learning from a high-level perspective.
We focus on symbol detection, which is one of the fundamental tasks of communication receivers.
arXiv Detail & Related papers (2021-01-12T19:55:34Z)
- Nested Mixture of Experts: Cooperative and Competitive Learning of Hybrid Dynamical System [2.055949720959582]
We propose a nested mixture of experts (NMOE) for representing and learning hybrid dynamical systems.
An NMOE combines both white-box and black-box models while optimizing bias-variance trade-off.
An NMOE provides a structured method for incorporating various types of prior knowledge by training the associative experts cooperatively or competitively.
arXiv Detail & Related papers (2020-11-20T19:36:45Z)
- Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm [0.10499611180329801]
We leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem.
We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task.
arXiv Detail & Related papers (2020-05-20T13:15:48Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)