Mastering NIM and Impartial Games with Weak Neural Networks: An AlphaZero-inspired Multi-Frame Approach
- URL: http://arxiv.org/abs/2411.06403v1
- Date: Sun, 10 Nov 2024 09:34:26 GMT
- Title: Mastering NIM and Impartial Games with Weak Neural Networks: An AlphaZero-inspired Multi-Frame Approach
- Authors: Søren Riis
- Abstract summary: This paper provides a theoretical framework that validates and explains the experimental results obtained in joint work with Bei Zhou.
We show that AlphaZero-style reinforcement learning algorithms struggle to learn optimal play in NIM.
- Abstract: This paper provides a theoretical framework that validates and explains the experimental results, obtained in joint work with Bei Zhou, showing that AlphaZero-style reinforcement learning algorithms struggle to learn optimal play in NIM, a canonical impartial game proposed as an AI challenge by Harvey Friedman in 2017. Our analysis resolves a controversy around these experimental results, which revealed unexpected difficulties in learning NIM despite its mathematical simplicity compared to games like chess and Go. Our key contributions are as follows: We prove that, by incorporating recent game history, these limited AlphaZero models can, in principle, achieve optimal play in NIM. We introduce a novel search strategy in which roll-outs preserve game-theoretic values during move selection, guided by a specialised policy network. We provide constructive proofs showing that our approach enables optimal play within the \(\text{AC}^0\) complexity class despite the theoretical limitations of these networks. This research demonstrates how constrained neural networks, when properly designed, can achieve sophisticated decision-making even in domains where their basic computational capabilities appear insufficient.
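The optimal play that the AlphaZero-style agents struggle to learn has a simple closed form: by Bouton's classical theorem, a NIM position is losing for the player to move exactly when the XOR of the heap sizes (the nim-sum) is zero. A minimal sketch of this classical rule (not the paper's multi-frame method) in Python:

```python
from functools import reduce
from operator import xor

def nim_sum(heaps):
    """XOR of all heap sizes; zero iff the position is a loss for the player to move."""
    return reduce(xor, heaps, 0)

def optimal_move(heaps):
    """Return (heap_index, new_size) for a winning move, or None if every move loses."""
    s = nim_sum(heaps)
    if s == 0:
        return None  # P-position: no winning move exists
    for i, h in enumerate(heaps):
        target = h ^ s
        if target < h:
            return (i, target)  # reduce heap i to target, making the nim-sum zero

# Example: heaps (3, 4, 5) have nim-sum 3 ^ 4 ^ 5 = 2, so the first player wins.
```

Any winning move restores a zero nim-sum, which is exactly the invariant the paper's value-preserving roll-outs are designed to maintain.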
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z) - In-Context Exploiter for Extensive-Form Games [38.24471816329584]
We introduce a novel method, In-Context Exploiter (ICE), to train a single model that can act as any player in the game and adaptively exploit opponents entirely by in-context learning.
Our ICE algorithm involves generating diverse opponent strategies, collecting interactive history data by a reinforcement learning algorithm, and training a transformer-based agent within a well-designed curriculum learning framework.
arXiv Detail & Related papers (2024-08-10T14:59:09Z) - Reasoning Algorithmically in Graph Neural Networks [1.8130068086063336]
We aim to integrate the structured and rule-based reasoning of algorithms with adaptive learning capabilities of neural networks.
This dissertation provides theoretical and practical contributions to this area of research.
arXiv Detail & Related papers (2024-02-21T12:16:51Z) - The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider the classical distribution-agnostic framework and algorithms that minimise empirical risk.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z) - A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games [35.35717637660101]
This paper proposes novel, end-to-end deep reinforcement learning algorithms for learning two-player zero-sum Markov games.
Our objective is to find the Nash Equilibrium policies, which are free from exploitation by adversarial opponents.
arXiv Detail & Related papers (2022-07-18T19:07:56Z) - Unsupervised Hebbian Learning on Point Sets in StarCraft II [12.095363582092904]
We present a novel Hebbian learning method to extract the global feature of point sets in StarCraft II game units.
Our model includes an encoder, an LSTM, and a decoder, and we train the encoder with the unsupervised learning method.
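For context on the learning rule this entry names, the textbook Hebbian update (here with Oja's stabilising decay, a generic illustration rather than this paper's specific point-set method) can be sketched as:

```python
import numpy as np

def hebbian_step(W, x, eta=0.002):
    """One Hebbian update with Oja's stabilising decay: each unit's weights
    grow with input-output co-activation and shrink in proportion to y**2."""
    y = W @ x
    W += eta * (np.outer(y, x) - (y ** 2)[:, None] * W)
    return W

# Toy run: a single linear unit trained on zero-mean data whose variance is
# dominated by the first coordinate converges to the leading principal direction.
rng = np.random.default_rng(0)
W = np.array([[0.5, 0.5]])
for _ in range(30000):
    x = rng.normal(0.0, [3.0, 1.0])  # std 3 along axis 0, std 1 along axis 1
    W = hebbian_step(W, x)
```

The decay term keeps the weight norm bounded, so the rule performs an online form of principal-component extraction without supervision.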
arXiv Detail & Related papers (2022-07-13T13:09:48Z) - A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible stochastic approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite).
The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
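As one concrete instance of the algorithms such templates cover, multiplicative (exponential) weights in a finite zero-sum game can be sketched as follows; this is a generic illustration, not this paper's analysis, and the rock-paper-scissors payoffs, step size, and starting weights are chosen only for the example:

```python
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (zero-sum).
A = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])

def mwu_selfplay(A, steps=10000, eta=0.02):
    """Both players run multiplicative (exponential) weights; the
    time-averaged strategies approximate a Nash equilibrium."""
    n, m = A.shape
    wr = np.linspace(1.0, 2.0, n)  # asymmetric start so the run is non-trivial
    wc = np.linspace(2.0, 1.0, m)
    avg_r, avg_c = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        pr, pc = wr / wr.sum(), wc / wc.sum()
        avg_r += pr
        avg_c += pc
        wr *= np.exp(eta * (A @ pc))     # row player maximises A
        wc *= np.exp(-eta * (A.T @ pr))  # column player minimises A
        wr /= wr.sum()                   # renormalise for numerical stability
        wc /= wc.sum()
    return avg_r / steps, avg_c / steps
```

The last iterates cycle, but the time averages converge; for rock-paper-scissors they approach the uniform equilibrium strategy.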
arXiv Detail & Related papers (2022-06-08T14:30:38Z) - Neural Combinatorial Optimization: a New Player in the Field [69.23334811890919]
This paper presents a critical analysis on the incorporation of algorithms based on neural networks into the classical optimization framework.
A comprehensive study is carried out to analyse the fundamental aspects of such algorithms, including performance, transferability, computational cost and generalisation to larger-sized instances.
arXiv Detail & Related papers (2022-05-03T07:54:56Z) - Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Reinforcement Learning with External Knowledge by using Logical Neural Networks [67.46162586940905]
A recent neuro-symbolic framework called Logical Neural Networks (LNNs) can simultaneously provide key properties of both neural networks and symbolic logic.
We propose an integrated method that enables model-free reinforcement learning from external knowledge sources.
arXiv Detail & Related papers (2021-03-03T12:34:59Z) - A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets [29.606063890097275]
Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique.
GAN-based generative-play techniques have been applied to games such as Poker and Go.
arXiv Detail & Related papers (2020-02-14T00:17:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.