Equivariant MuZero
- URL: http://arxiv.org/abs/2302.04798v1
- Date: Thu, 9 Feb 2023 17:46:29 GMT
- Title: Equivariant MuZero
- Authors: Andreea Deac, Théophane Weber, George Papamakarios
- Abstract summary: We propose improving the data efficiency and generalisation capabilities of MuZero by explicitly incorporating the symmetries of the environment in its world-model architecture.
We prove that, so long as the neural networks used by MuZero are equivariant to a particular symmetry group acting on the environment, the entirety of MuZero's action-selection algorithm will also be equivariant to that group.
- Score: 14.027651496499882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning repeatedly succeeds in closed, well-defined
domains such as games (Chess, Go, StarCraft). The next frontier is real-world
scenarios, where setups are numerous and varied. For this, agents need to learn
the underlying rules governing the environment, so as to robustly generalise to
conditions that differ from those they were trained on. Model-based
reinforcement learning algorithms, such as the highly successful MuZero, aim to
accomplish this by learning a world model. However, leveraging a world model
has not consistently shown greater generalisation capabilities compared to
model-free alternatives. In this work, we propose improving the data efficiency
and generalisation capabilities of MuZero by explicitly incorporating the
symmetries of the environment in its world-model architecture. We prove that,
so long as the neural networks used by MuZero are equivariant to a particular
symmetry group acting on the environment, the entirety of MuZero's
action-selection algorithm will also be equivariant to that group. We evaluate
Equivariant MuZero on procedurally-generated MiniPacman and on Chaser from the
ProcGen suite: training on a set of mazes, and then testing on unseen rotated
versions, demonstrating the benefits of equivariance. Further, we verify that
our performance improvements hold even when only some of the components of
Equivariant MuZero obey strict equivariance, which highlights the robustness of
our construction.
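The equivariance theorem has a simple numerical illustration: if every network in the pipeline commutes with the group action, so does their composition, and hence so does any plan computed from their outputs. Below is a minimal NumPy sketch (not the authors' architecture) for the group C4 of quarter-turn rotations. The `represent` and `dynamics` stand-ins are made equivariant by explicit group averaging, and the dynamics model's action input is omitted for brevity; under a rotation, actions would be permuted accordingly.

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(x, k):
    """Action of the group C4 (0/90/180/270-degree rotations) on a 2D grid."""
    return np.rot90(x, k)

def symmetrize(f):
    """Make an arbitrary grid-to-grid map C4-equivariant by group averaging:
    f_eq(x) = 1/4 * sum_k rot^{-k}(f(rot^k(x)))."""
    def f_eq(x):
        return np.mean([rot(f(rot(x, k)), -k) for k in range(4)], axis=0)
    return f_eq

# Toy stand-ins for MuZero's representation and dynamics networks; neither
# base map is equivariant on its own, so the symmetrisation does the work.
represent = symmetrize(lambda o: np.tanh(o + 0.3 * np.roll(o, 1, axis=0)))
dynamics = symmetrize(lambda s: np.tanh(0.5 * s + 0.2 * np.roll(s, 1, axis=1)))

obs = rng.normal(size=(8, 8))  # a stand-in "maze" observation
for k in range(4):
    rotated_first = dynamics(represent(rot(obs, k)))    # rotate, then predict
    predicted_first = rot(dynamics(represent(obs)), k)  # predict, then rotate
    assert np.allclose(rotated_first, predicted_first)
print("rolling out the world model commutes with rotating the maze")
```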
Related papers
- Bootstrap Segmentation Foundation Model under Distribution Shift via Object-Centric Learning [36.77777881242487]
We introduce SlotSAM, a method that reconstructs features from the encoder in a self-supervised manner to create object-centric representations.
These representations are then integrated into the foundation model, bolstering its object-level perceptual capabilities.
arXiv Detail & Related papers (2024-08-29T07:16:28Z)
- UniZero: Generalized and Efficient Planning with Scalable Latent World Models [29.648382211926364]
We present UniZero, a novel approach that disentangles latent states from implicit latent history using a transformer-based latent world model.
We demonstrate that UniZero, even with single-frame inputs, matches or surpasses the performance of MuZero-style algorithms on the Atari 100k benchmark.
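As a rough, hedged illustration of pairing explicit per-step latents with an implicit attention-based history, here is a single-head causal self-attention "world model" in NumPy. All names, dimensions, and the tokenisation are invented for this sketch; UniZero's actual architecture and losses are specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                  # token/latent width (arbitrary)
We = rng.normal(size=(8, d)) * 0.1      # observation encoder (explicit latent)
Wa = rng.normal(size=(4, d)) * 0.1      # action embedding
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

def causal_attention(X):
    """Single-head causal self-attention: each position may only attend to
    itself and the past, which is what keeps the history implicit."""
    n = X.shape[0]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    scores[np.triu_indices(n, 1)] = -np.inf        # mask out the future
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    return A @ (X @ Wv)

# Interleave explicit per-step latents z_t = enc(o_t) with action embeddings;
# the transformer output at the last token serves as the next-step prediction.
obs = rng.normal(size=(3, 8))           # o_0, o_1, o_2
acts = np.eye(4)[[0, 2, 1]]             # one-hot actions a_0, a_1, a_2
tokens = np.empty((6, d))
tokens[0::2] = obs @ We                 # z_0, z_1, z_2 (explicit latents)
tokens[1::2] = acts @ Wa                # a_0, a_1, a_2
h = causal_attention(tokens)
z_next_pred = h[-1]                     # a prediction head would read this out
print("predicted next latent (first entries):", z_next_pred[:4])
```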
arXiv Detail & Related papers (2024-06-15T15:24:15Z)
- MaGGIe: Masked Guided Gradual Human Instance Matting [71.22209465934651]
We propose a new framework MaGGIe, Masked Guided Gradual Human Instance Matting.
It predicts alpha mattes progressively for each human instance while maintaining the computational cost, precision, and consistency.
arXiv Detail & Related papers (2024-04-24T17:59:53Z)
- Domain Generalization via Balancing Training Difficulty and Model Capability [61.053202176230904]
Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains.
Despite recent progress, most existing work suffers from a misalignment between the difficulty of the training samples and the capability of the model at its current stage of training.
We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties.
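To make the difficulty/capability "seesaw" concrete, one possible momentum-difficulty curriculum is sketched below in NumPy. The EMA update, the loss-based difficulty proxy, and the selection rule are illustrative assumptions, not MoDify's actual components (the paper builds on difficulty-aware augmentation and network updates).

```python
import numpy as np

class MomentumDifficulty:
    """Toy curriculum: keep momentum (EMA) estimates of per-sample difficulty
    and of overall model capability, then train on samples whose difficulty
    best matches what the model can currently handle."""

    def __init__(self, n_samples, beta=0.9):
        self.beta = beta
        self.difficulty = np.zeros(n_samples)  # EMA of each sample's loss
        self.capability = 0.0                  # EMA of the mean batch loss

    def update(self, idx, losses):
        b = self.beta
        self.difficulty[idx] = b * self.difficulty[idx] + (1 - b) * losses
        self.capability = b * self.capability + (1 - b) * losses.mean()

    def select(self, batch_size):
        # Smallest |difficulty - capability| first: neither too easy nor
        # too hard for the model at its current stage of training.
        gap = np.abs(self.difficulty - self.capability)
        return np.argsort(gap)[:batch_size]
```

A training loop would alternate select, a forward/backward pass producing per-sample losses, and update.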
arXiv Detail & Related papers (2023-09-02T07:09:23Z)
- Efficient Equivariant Transfer Learning from Pretrained Models [45.918447685383356]
We show that λ-equitune averages the features using importance weights, λ.
These weights are learned directly from the data using a small neural network.
We prove that λ-equitune is equivariant and a universal approximator of equivariant functions.
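The construction is simple enough to verify numerically. The hedged NumPy sketch below uses the cyclic rotation group C4 as the symmetry group, with fixed toy functions standing in for the frozen pretrained model and the learned λ network (both assumptions made for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(x, k):
    return np.rot90(x, k)            # the group C4, acting on square images

def pretrained(x):
    """Stand-in for a frozen pretrained model; deliberately NOT equivariant."""
    return np.tanh(x + np.roll(x, 1, axis=0))

def lam(x):
    """Toy importance weight; in the paper, lambda is a small neural network
    learned from data, so this fixed function is purely illustrative."""
    return np.exp(np.mean(x[0]))     # depends on orientation on purpose

def lambda_equitune(M, lam, x):
    """M_lam(x) = sum_g lam(g.x) * g^{-1}.M(g.x) / sum_g lam(g.x)."""
    feats = [rot(M(rot(x, k)), -k) for k in range(4)]
    w = np.array([lam(rot(x, k)) for k in range(4)])
    return np.tensordot(w / w.sum(), np.stack(feats), axes=1)

x = rng.normal(size=(6, 6))
for k in range(4):  # weighted group averaging gives exact C4 equivariance
    assert np.allclose(lambda_equitune(pretrained, lam, rot(x, k)),
                       rot(lambda_equitune(pretrained, lam, x), k))
print("lambda-equitune output is equivariant for any weighting function")
```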
arXiv Detail & Related papers (2023-05-17T02:20:34Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
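Invariant integration averages a feature over the group orbit, so the result is invariant by construction. A minimal NumPy sketch with quarter-turn rotations and a hand-picked monomial follows; the paper's actual contribution, the pruning-based selection of which monomials to use, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def monomial(x, pixels, exponents):
    """m(x) = prod_i x[p_i]**b_i, a toy monomial feature on a 5x5 image."""
    return np.prod([x[p] ** b for p, b in zip(pixels, exponents)])

def invariant_integration(x, pixels, exponents):
    """I[m](x) = 1/|G| sum_{g in G} m(g.x), with G = C4 rotations."""
    return np.mean([monomial(np.rot90(x, k), pixels, exponents)
                    for k in range(4)])

x = rng.normal(size=(5, 5))
pixels, exponents = [(0, 1), (2, 2), (4, 3)], [2, 1, 2]
vals = [invariant_integration(np.rot90(x, k), pixels, exponents)
        for k in range(4)]
assert np.allclose(vals, vals[0])  # same value from every rotated copy
print("invariant integration value:", vals[0])
```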
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
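FA computes Phi(x) = (1/|F(x)|) * sum_{g in F(x)} g . phi(g^-1 . x) over a small, input-dependent frame F(x) instead of averaging over the whole group. The simplest case, sketched below in NumPy, is the translation group, where the frame contains the single translation by the centroid; the backbone is an arbitrary, non-equivariant toy map.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(X):
    """Arbitrary point-cloud network stand-in; NOT translation-equivariant."""
    return np.tanh(X) @ np.array([[1.0, 0.5], [-0.3, 1.0]])

def frame_average(phi, X):
    """FA for the translation group T(2): the frame F(X) holds the single
    translation by the centroid c, so Phi(X) = phi(X - c) + c."""
    c = X.mean(axis=0)
    return phi(X - c) + c

X = rng.normal(size=(10, 2))
t = np.array([3.0, -7.0])
assert np.allclose(frame_average(backbone, X + t),
                   frame_average(backbone, X) + t)  # exact T(2)-equivariance
print("frame averaging makes the arbitrary backbone translation-equivariant")
```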
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Attribute-Modulated Generative Meta Learning for Zero-Shot Classification [52.64680991682722]
We present the Attribute-Modulated generAtive meta-model for Zero-shot learning (AMAZ).
Our model consists of an attribute-aware modulation network and an attribute-augmented generative network.
Our empirical evaluations show that AMAZ improves state-of-the-art methods by 3.8% and 5.1% in ZSL and generalized ZSL settings, respectively.
arXiv Detail & Related papers (2021-04-22T04:16:43Z)
- Meta-Learned Attribute Self-Gating for Continual Generalized Zero-Shot Learning [82.07273754143547]
We propose a meta-continual zero-shot learning (MCZSL) approach to generalizing a model to categories unseen during training.
By pairing self-gating of attributes and scaled class normalization with meta-learning based training, we are able to outperform state-of-the-art results.
arXiv Detail & Related papers (2021-02-23T18:36:14Z)
- Complex Momentum for Learning in Games [42.081050296353574]
We generalize gradient descent with momentum for learning in differentiable games to have complex-valued momentum.
We empirically demonstrate that complex-valued momentum can improve convergence in games such as generative adversarial networks.
We also show a practical generalization to a complex-valued Adam variant, which we use to train BigGAN to better scores on CIFAR-10.
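A runnable toy of the update rule on the classic bilinear game min_x max_y x*y with simultaneous updates is given below: the momentum buffer and coefficient beta are complex, and only the real part of the buffer moves the parameters. The step size and the phase of beta are illustrative guesses rather than the paper's tuned values (the paper also allows a complex step size).

```python
import numpy as np

def run(beta, alpha=0.05, steps=5000):
    """Simultaneous momentum updates on min_x max_y x*y (equilibrium (0,0)).
    The recursion is classic momentum, except the buffer and the coefficient
    beta are complex and only the real part moves the parameters."""
    x, y = 1.0, 1.0
    mx = my = 0j
    for _ in range(steps):
        gx, gy = y, -x             # simultaneous gradient field of f = x*y
        mx = beta * mx - gx
        my = beta * my - gy
        x += alpha * mx.real       # parameters move along the real part;
        y += alpha * my.real       # a complex alpha is possible in general
    return np.hypot(x, y)          # distance from the equilibrium

# Compare a purely real coefficient against a complex one of equal magnitude;
# alpha and arg(beta) here are guesses to experiment with, not tuned values.
for beta in (0.9 + 0j, 0.9 * np.exp(1j * np.pi / 8)):
    print(f"beta={complex(beta):.3f} -> final distance {run(beta):.3g}")
```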
arXiv Detail & Related papers (2021-02-16T19:55:27Z)
- Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision [19.37841173522973]
Using a model of the environment, reinforcement learning agents can plan their future moves and achieve strong performance in board games such as Chess, Shogi, and Go.
We show that the environment model can even be learned dynamically, generalizing the agent to many more tasks while at the same time achieving state-of-the-art performance.
Our modifications also enable self-supervised pretraining for MuZero, so the algorithm can learn about environment dynamics before a goal is made available.
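The summary leaves the auxiliary objectives unspecified, but one common way to let a MuZero-style model learn about dynamics before rewards are available is a latent-consistency loss between the dynamics model's prediction and the encoding of the observed next state. The NumPy sketch below shows only that idea; the networks are toy linear maps, and the paper's concrete loss terms may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear encoder and dynamics model; sizes are arbitrary for the sketch.
E = rng.normal(size=(4, 8)) * 0.1   # encoder h: observation (8,) -> latent (4,)
D = rng.normal(size=(4, 5)) * 0.1   # dynamics g: [latent, action] -> latent

def encode(o):
    return np.tanh(E @ o)

def dynamics(s, a):
    return np.tanh(D @ np.concatenate([s, [a]]))

def consistency_loss(o_t, a_t, o_next):
    """Pull the dynamics-predicted latent toward the encoded next observation.
    In a real framework the target would be wrapped in a stop-gradient so the
    dynamics model learns to track the encoder instead of both collapsing."""
    pred = dynamics(encode(o_t), a_t)
    target = encode(o_next)          # stop-gradient here in practice
    return float(np.mean((pred - target) ** 2))

o_t, o_next = rng.normal(size=8), rng.normal(size=8)
print("latent consistency loss:", consistency_loss(o_t, 1.0, o_next))
```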
arXiv Detail & Related papers (2021-02-10T17:55:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.