Meta-trained agents implement Bayes-optimal agents
- URL: http://arxiv.org/abs/2010.11223v1
- Date: Wed, 21 Oct 2020 18:05:21 GMT
- Title: Meta-trained agents implement Bayes-optimal agents
- Authors: Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein,
Miljan Martic, Shane Legg, Pedro A. Ortega
- Abstract summary: Our results suggest that memory-based meta-learning might serve as a general technique for numerically approximating Bayes-optimal agents.
- Score: 13.572630988699572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memory-based meta-learning is a powerful technique to build agents that adapt
fast to any task within a target distribution. A previous theoretical study has
argued that this remarkable performance is because the meta-training protocol
incentivises agents to behave Bayes-optimally. We empirically investigate this
claim on a number of prediction and bandit tasks. Inspired by ideas from
theoretical computer science, we show that meta-learned and Bayes-optimal
agents not only behave alike, but they even share a similar computational
structure, in the sense that one agent system can approximately simulate the
other. Furthermore, we show that Bayes-optimal agents are fixed points of the
meta-learning dynamics. Our results suggest that memory-based meta-learning
might serve as a general technique for numerically approximating Bayes-optimal
agents - that is, even for task distributions for which we currently don't
possess tractable models.
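To make the abstract's claim concrete, here is a minimal, illustrative sketch (not code from the paper) of the kind of Bayes-optimal predictor that a memory-based meta-learner is argued to approximate. The task distribution, the uniform Beta(1, 1) prior, and all function names are assumptions chosen for simplicity: tasks are biased coins with unknown bias, and the exact Bayes-optimal next-flip prediction is the posterior predictive mean, i.e. Laplace's rule of succession. A recurrent agent meta-trained to minimise log loss across such tasks would, per the paper's argument, converge towards this same predictor.

```python
import math
import random


def bayes_optimal_prediction(heads: int, flips: int) -> float:
    """Posterior predictive P(next flip = heads) under a uniform Beta(1,1) prior."""
    return (heads + 1) / (flips + 2)  # Laplace's rule of succession


def task_log_loss(n_flips: int, rng: random.Random) -> float:
    """Average log loss of the Bayes-optimal predictor on one sampled task."""
    theta = rng.random()  # sample a task: an unknown coin bias
    heads, total = 0, 0.0
    for t in range(n_flips):
        p = bayes_optimal_prediction(heads, t)
        x = 1 if rng.random() < theta else 0  # observe one flip
        total += -math.log(p if x else 1.0 - p)
        heads += x
    return total / n_flips


print(bayes_optimal_prediction(0, 0))  # 0.5: with no data, predict the prior mean
print(bayes_optimal_prediction(3, 4))  # 2/3 after observing 3 heads in 4 flips
```

The point of the paper's fixed-point result, in this toy setting, is that once the agent's memory state encodes the sufficient statistics `(heads, flips)` and its output matches the posterior predictive, further meta-training has nothing left to improve.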
Related papers
- AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent [57.10083973844841]
AgentArk is a novel framework to distill multi-agent dynamics into the weights of a single model.
We investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios.
By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting the strong reasoning and self-correction performance of multiple agents.
arXiv Detail & Related papers (2026-02-03T19:18:28Z) - Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability [10.548824172738227]
Learning a compact representation of history is critical for planning and generalization in partially observable environments.
We show that meta-reinforcement learning (RL) agents can attain near Bayes-optimal policies, but often fail to learn the compact, interpretable Bayes-optimal belief states.
We investigate whether integrating self-supervised predictive coding modules into meta-RL can facilitate learning of Bayes-optimal representations.
arXiv Detail & Related papers (2025-10-24T21:45:56Z) - ContraBAR: Contrastive Bayes-Adaptive Deep RL [22.649531458557206]
In meta reinforcement learning (meta RL), an agent seeks a Bayes-optimal policy -- the optimal policy when facing an unknown task.
We investigate whether contrastive methods can be used for learning Bayes-optimal behavior.
We propose a simple meta RL algorithm that uses contrastive predictive coding (CPC) in lieu of variational belief inference.
arXiv Detail & Related papers (2023-06-04T17:50:20Z) - MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z) - Bayesian Meta-Learning Through Variational Gaussian Processes [0.0]
We extend Gaussian-process-based meta-learning to allow for high-quality, arbitrary non-Gaussian uncertainty predictions.
Our method performs significantly better than existing Bayesian meta-learning baselines.
arXiv Detail & Related papers (2021-10-21T10:44:23Z) - Energy-Efficient and Federated Meta-Learning via Projected Stochastic
Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z) - What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents are agents that employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
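As an illustrative aside (not code from the paper above), the POMDP framing it invokes says that an optimal agent should track a belief state, updated by the standard Bayes filter b'(s') ∝ O(o | s') · Σ_s T(s' | s, a) · b(s); the recurrent agent's hidden state is argued to approximate this quantity. A minimal sketch over a discrete state space, with a made-up two-state "noisy listening" example:

```python
def belief_update(belief, T, O, action, obs):
    """One discrete Bayes filter step: b'(s') ~ O[s'][obs] * sum_s T[a][s][s'] * b[s].

    belief: list of P(s); T[a][s][s2]: transition probs; O[s][o]: observation probs.
    """
    n = len(belief)
    new_b = [
        O[s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(new_b)  # normalising constant P(obs | belief, action)
    return [b / z for b in new_b]


# Two hidden states (reward behind left or right door), one "listen" action
# that leaves the state unchanged and yields an 85%-accurate observation.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[0.85, 0.15], [0.15, 0.85]]
b = belief_update([0.5, 0.5], T, O, action=0, obs=0)
print(b)  # [0.85, 0.15]: the posterior shifts toward state 0 after hearing "left"
```

A trained RNN never computes this update explicitly; the paper's structural-similarity claim is that its hidden-state dynamics can be approximately mapped onto updates of this form.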
arXiv Detail & Related papers (2021-04-29T20:34:39Z) - Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z) - Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion-based MAML, or Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.