Equivariant MuZero
- URL: http://arxiv.org/abs/2302.04798v1
- Date: Thu, 9 Feb 2023 17:46:29 GMT
- Title: Equivariant MuZero
- Authors: Andreea Deac, Théophane Weber, George Papamakarios
- Abstract summary: We propose improving the data efficiency and generalisation capabilities of MuZero by explicitly incorporating the symmetries of the environment in its world-model architecture.
We prove that, so long as the neural networks used by MuZero are equivariant to a particular symmetry group acting on the environment, the entirety of MuZero's action-selection algorithm will also be equivariant to that group.
- Score: 14.027651496499882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning repeatedly succeeds in closed, well-defined
domains such as games (Chess, Go, StarCraft). The next frontier is real-world
scenarios, where setups are numerous and varied. For this, agents need to learn
the underlying rules governing the environment, so as to robustly generalise to
conditions that differ from those they were trained on. Model-based
reinforcement learning algorithms, such as the highly successful MuZero, aim to
accomplish this by learning a world model. However, leveraging a world model
has not consistently shown greater generalisation capabilities compared to
model-free alternatives. In this work, we propose improving the data efficiency
and generalisation capabilities of MuZero by explicitly incorporating the
symmetries of the environment in its world-model architecture. We prove that,
so long as the neural networks used by MuZero are equivariant to a particular
symmetry group acting on the environment, the entirety of MuZero's
action-selection algorithm will also be equivariant to that group. We evaluate
Equivariant MuZero on procedurally-generated MiniPacman and on Chaser from the
ProcGen suite: training on a set of mazes, and then testing on unseen rotated
versions, demonstrating the benefits of equivariance. Further, we verify that
our performance improvements hold even when only some of the components of
Equivariant MuZero obey strict equivariance, which highlights the robustness of
our construction.
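The equivariance theorem has a simple numerical illustration: if every network in the pipeline commutes with the group action, so does their composition, and hence so does any plan computed from their outputs. Below is a minimal NumPy sketch (not the authors' architecture) for the group C4 of quarter-turn rotations. The `represent` and `dynamics` stand-ins are made equivariant by explicit group averaging, and the dynamics model's action input is omitted for brevity; under a rotation, actions would be permuted accordingly.

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(x, k):
    """Action of the group C4 (0/90/180/270-degree rotations) on a 2D grid."""
    return np.rot90(x, k)

def symmetrize(f):
    """Make an arbitrary grid-to-grid map C4-equivariant by group averaging:
    f_eq(x) = 1/4 * sum_k rot^{-k}(f(rot^k(x)))."""
    def f_eq(x):
        return np.mean([rot(f(rot(x, k)), -k) for k in range(4)], axis=0)
    return f_eq

# Toy stand-ins for MuZero's representation and dynamics networks; neither
# base map is equivariant on its own, so the symmetrisation does the work.
represent = symmetrize(lambda o: np.tanh(o + 0.3 * np.roll(o, 1, axis=0)))
dynamics = symmetrize(lambda s: np.tanh(0.5 * s + 0.2 * np.roll(s, 1, axis=1)))

obs = rng.normal(size=(8, 8))  # a stand-in "maze" observation
for k in range(4):
    rotated_first = dynamics(represent(rot(obs, k)))    # rotate, then predict
    predicted_first = rot(dynamics(represent(obs)), k)  # predict, then rotate
    assert np.allclose(rotated_first, predicted_first)
print("rolling out the world model commutes with rotating the maze")
```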
Related papers
- Bootstrap Segmentation Foundation Model under Distribution Shift via Object-Centric Learning [36.77777881242487]
We introduce SlotSAM, a method that reconstructs features from the encoder in a self-supervised manner to create object-centric representations.
These representations are then integrated into the foundation model, bolstering its object-level perceptual capabilities.
arXiv Detail & Related papers (2024-08-29T07:16:28Z)
- UniZero: Generalized and Efficient Planning with Scalable Latent World Models [29.648382211926364]
We present UniZero, a novel approach that disentangles latent states from implicit latent history using a transformer-based latent world model.
We demonstrate that UniZero, even with single-frame inputs, matches or surpasses the performance of MuZero-style algorithms on the Atari 100k benchmark.
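As a rough, hedged illustration of pairing explicit per-step latents with an implicit attention-based history, here is a single-head causal self-attention "world model" in NumPy. All names, dimensions, and the tokenisation are invented for this sketch; UniZero's actual architecture and losses are specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                  # token/latent width (arbitrary)
We = rng.normal(size=(8, d)) * 0.1      # observation encoder (explicit latent)
Wa = rng.normal(size=(4, d)) * 0.1      # action embedding
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

def causal_attention(X):
    """Single-head causal self-attention: each position may only attend to
    itself and the past, which is what keeps the history implicit."""
    n = X.shape[0]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    scores[np.triu_indices(n, 1)] = -np.inf        # mask out the future
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    return A @ (X @ Wv)

# Interleave explicit per-step latents z_t = enc(o_t) with action embeddings;
# the transformer output at the last token serves as the next-step prediction.
obs = rng.normal(size=(3, 8))           # o_0, o_1, o_2
acts = np.eye(4)[[0, 2, 1]]             # one-hot actions a_0, a_1, a_2
tokens = np.empty((6, d))
tokens[0::2] = obs @ We                 # z_0, z_1, z_2 (explicit latents)
tokens[1::2] = acts @ Wa                # a_0, a_1, a_2
h = causal_attention(tokens)
z_next_pred = h[-1]                     # a prediction head would read this out
print("predicted next latent (first entries):", z_next_pred[:4])
```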
arXiv Detail & Related papers (2024-06-15T15:24:15Z)
- MaGGIe: Masked Guided Gradual Human Instance Matting [71.22209465934651]
We propose a new framework MaGGIe, Masked Guided Gradual Human Instance Matting.
It predicts alpha mattes progressively for each human instance while maintaining the computational cost, precision, and consistency.
arXiv Detail & Related papers (2024-04-24T17:59:53Z)
- Domain Generalization via Balancing Training Difficulty and Model Capability [61.053202176230904]
Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains.
Despite recent progress, most existing work suffers from a misalignment between the difficulty of the training samples and the capability of the model at its current stage of training.
We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties.
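To make the difficulty/capability "seesaw" concrete, one possible momentum-difficulty curriculum is sketched below in NumPy. The EMA update, the loss-based difficulty proxy, and the selection rule are illustrative assumptions, not MoDify's actual components (the paper builds on difficulty-aware augmentation and network updates).

```python
import numpy as np

class MomentumDifficulty:
    """Toy curriculum: keep momentum (EMA) estimates of per-sample difficulty
    and of overall model capability, then train on samples whose difficulty
    best matches what the model can currently handle."""

    def __init__(self, n_samples, beta=0.9):
        self.beta = beta
        self.difficulty = np.zeros(n_samples)  # EMA of each sample's loss
        self.capability = 0.0                  # EMA of the mean batch loss

    def update(self, idx, losses):
        b = self.beta
        self.difficulty[idx] = b * self.difficulty[idx] + (1 - b) * losses
        self.capability = b * self.capability + (1 - b) * losses.mean()

    def select(self, batch_size):
        # Smallest |difficulty - capability| first: neither too easy nor
        # too hard for the model at its current stage of training.
        gap = np.abs(self.difficulty - self.capability)
        return np.argsort(gap)[:batch_size]
```

A training loop would alternate select, a forward/backward pass producing per-sample losses, and update.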
arXiv Detail & Related papers (2023-09-02T07:09:23Z)
- Efficient Equivariant Transfer Learning from Pretrained Models [45.918447685383356]
We show that λ-equitune averages the features using importance weights, λ.
These weights are learned directly from the data using a small neural network.
We prove that λ-equitune is equivariant and a universal approximator of equivariant functions.
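The construction is simple enough to verify numerically. The hedged NumPy sketch below uses the cyclic rotation group C4 as the symmetry group, with fixed toy functions standing in for the frozen pretrained model and the learned λ network (both assumptions made for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def rot(x, k):
    return np.rot90(x, k)            # the group C4, acting on square images

def pretrained(x):
    """Stand-in for a frozen pretrained model; deliberately NOT equivariant."""
    return np.tanh(x + np.roll(x, 1, axis=0))

def lam(x):
    """Toy importance weight; in the paper, lambda is a small neural network
    learned from data, so this fixed function is purely illustrative."""
    return np.exp(np.mean(x[0]))     # depends on orientation on purpose

def lambda_equitune(M, lam, x):
    """M_lam(x) = sum_g lam(g.x) * g^{-1}.M(g.x) / sum_g lam(g.x)."""
    feats = [rot(M(rot(x, k)), -k) for k in range(4)]
    w = np.array([lam(rot(x, k)) for k in range(4)])
    return np.tensordot(w / w.sum(), np.stack(feats), axes=1)

x = rng.normal(size=(6, 6))
for k in range(4):  # weighted group averaging gives exact C4 equivariance
    assert np.allclose(lambda_equitune(pretrained, lam, rot(x, k)),
                       rot(lambda_equitune(pretrained, lam, x), k))
print("lambda-equitune output is equivariant for any weighting function")
```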
arXiv Detail & Related papers (2023-05-17T02:20:34Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
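Invariant integration averages a feature over the group orbit, so the result is invariant by construction. A minimal NumPy sketch with quarter-turn rotations and a hand-picked monomial follows; the paper's actual contribution, the pruning-based selection of which monomials to use, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def monomial(x, pixels, exponents):
    """m(x) = prod_i x[p_i]**b_i, a toy monomial feature on a 5x5 image."""
    return np.prod([x[p] ** b for p, b in zip(pixels, exponents)])

def invariant_integration(x, pixels, exponents):
    """I[m](x) = 1/|G| sum_{g in G} m(g.x), with G = C4 rotations."""
    return np.mean([monomial(np.rot90(x, k), pixels, exponents)
                    for k in range(4)])

x = rng.normal(size=(5, 5))
pixels, exponents = [(0, 1), (2, 2), (4, 3)], [2, 1, 2]
vals = [invariant_integration(np.rot90(x, k), pixels, exponents)
        for k in range(4)]
assert np.allclose(vals, vals[0])  # same value from every rotated copy
print("invariant integration value:", vals[0])
```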
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
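FA computes Phi(x) = (1/|F(x)|) * sum_{g in F(x)} g . phi(g^-1 . x) over a small, input-dependent frame F(x) instead of averaging over the whole group. The simplest case, sketched below in NumPy, is the translation group, where the frame contains the single translation by the centroid; the backbone is an arbitrary, non-equivariant toy map.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(X):
    """Arbitrary point-cloud network stand-in; NOT translation-equivariant."""
    return np.tanh(X) @ np.array([[1.0, 0.5], [-0.3, 1.0]])

def frame_average(phi, X):
    """FA for the translation group T(2): the frame F(X) holds the single
    translation by the centroid c, so Phi(X) = phi(X - c) + c."""
    c = X.mean(axis=0)
    return phi(X - c) + c

X = rng.normal(size=(10, 2))
t = np.array([3.0, -7.0])
assert np.allclose(frame_average(backbone, X + t),
                   frame_average(backbone, X) + t)  # exact T(2)-equivariance
print("frame averaging makes the arbitrary backbone translation-equivariant")
```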
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Attribute-Modulated Generative Meta Learning for Zero-Shot Classification [52.64680991682722]
We present the Attribute-Modulated generAtive meta-model for Zero-shot learning (AMAZ).
Our model consists of an attribute-aware modulation network and an attribute-augmented generative network.
Our empirical evaluations show that AMAZ improves state-of-the-art methods by 3.8% and 5.1% in ZSL and generalized ZSL settings, respectively.
arXiv Detail & Related papers (2021-04-22T04:16:43Z)
- Meta-Learned Attribute Self-Gating for Continual Generalized Zero-Shot Learning [82.07273754143547]
We propose a meta-continual zero-shot learning (MCZSL) approach to generalizing a model to categories unseen during training.
By pairing self-gating of attributes and scaled class normalization with meta-learning based training, we are able to outperform state-of-the-art results.
arXiv Detail & Related papers (2021-02-23T18:36:14Z)
- Complex Momentum for Learning in Games [42.081050296353574]
We generalize gradient descent with momentum for learning in differentiable games to have complex-valued momentum.
We empirically demonstrate that complex-valued momentum can improve convergence in games such as generative adversarial networks.
We also show a practical generalization to a complex-valued Adam variant, which we use to train BigGAN to better scores on CIFAR-10.
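A runnable toy of the update rule on the classic bilinear game min_x max_y x*y with simultaneous updates is given below: the momentum buffer and coefficient beta are complex, and only the real part of the buffer moves the parameters. The step size and the phase of beta are illustrative guesses rather than the paper's tuned values (the paper also allows a complex step size).

```python
import numpy as np

def run(beta, alpha=0.05, steps=5000):
    """Simultaneous momentum updates on min_x max_y x*y (equilibrium (0,0)).
    The recursion is classic momentum, except the buffer and the coefficient
    beta are complex and only the real part moves the parameters."""
    x, y = 1.0, 1.0
    mx = my = 0j
    for _ in range(steps):
        gx, gy = y, -x             # simultaneous gradient field of f = x*y
        mx = beta * mx - gx
        my = beta * my - gy
        x += alpha * mx.real       # parameters move along the real part;
        y += alpha * my.real       # a complex alpha is possible in general
    return np.hypot(x, y)          # distance from the equilibrium

# Compare a purely real coefficient against a complex one of equal magnitude;
# alpha and arg(beta) here are guesses to experiment with, not tuned values.
for beta in (0.9 + 0j, 0.9 * np.exp(1j * np.pi / 8)):
    print(f"beta={complex(beta):.3f} -> final distance {run(beta):.3g}")
```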
arXiv Detail & Related papers (2021-02-16T19:55:27Z)
- Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision [19.37841173522973]
Using a model of the environment, reinforcement learning agents can plan their future moves and achieve strong performance in board games such as Chess, Shogi, and Go.
We show that the environment model can even be learned dynamically, generalizing the agent to many more tasks while at the same time achieving state-of-the-art performance.
Our modifications also enable self-supervised pretraining for MuZero, so the algorithm can learn about environment dynamics before a goal is made available.
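The summary leaves the auxiliary objectives unspecified, but one common way to let a MuZero-style model learn about dynamics before rewards are available is a latent-consistency loss between the dynamics model's prediction and the encoding of the observed next state. The NumPy sketch below shows only that idea; the networks are toy linear maps, and the paper's concrete loss terms may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear encoder and dynamics model; sizes are arbitrary for the sketch.
E = rng.normal(size=(4, 8)) * 0.1   # encoder h: observation (8,) -> latent (4,)
D = rng.normal(size=(4, 5)) * 0.1   # dynamics g: [latent, action] -> latent

def encode(o):
    return np.tanh(E @ o)

def dynamics(s, a):
    return np.tanh(D @ np.concatenate([s, [a]]))

def consistency_loss(o_t, a_t, o_next):
    """Pull the dynamics-predicted latent toward the encoded next observation.
    In a real framework the target would be wrapped in a stop-gradient so the
    dynamics model learns to track the encoder instead of both collapsing."""
    pred = dynamics(encode(o_t), a_t)
    target = encode(o_next)          # stop-gradient here in practice
    return float(np.mean((pred - target) ** 2))

o_t, o_next = rng.normal(size=8), rng.normal(size=8)
print("latent consistency loss:", consistency_loss(o_t, 1.0, o_next))
```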
arXiv Detail & Related papers (2021-02-10T17:55:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.