What can I do here? A Theory of Affordances in Reinforcement Learning
- URL: http://arxiv.org/abs/2006.15085v1
- Date: Fri, 26 Jun 2020 16:34:53 GMT
- Title: What can I do here? A Theory of Affordances in Reinforcement Learning
- Authors: Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, David Abel, Doina
Precup
- Abstract summary: We develop a theory of affordances for agents who learn and plan in Markov Decision Processes.
Affordances play a dual role in this case: they speed up planning by reducing the number of actions available in any given situation, and they make learning of transition models more efficient and precise.
We propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
- Score: 65.70524105802156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning algorithms usually assume that all actions are always
available to an agent. However, both people and animals understand the general
link between the features of their environment and the actions that are
feasible. Gibson (1977) coined the term "affordances" to describe the fact that
certain states enable an agent to do certain actions, in the context of
embodied agents. In this paper, we develop a theory of affordances for agents
who learn and plan in Markov Decision Processes. Affordances play a dual role
in this case. On one hand, they allow faster planning, by reducing the number
of actions available in any given situation. On the other hand, they facilitate
more efficient and precise learning of transition models from data, especially
when such models require function approximation. We establish these properties
through theoretical results as well as illustrative examples. We also propose
an approach to learn affordances and use it to estimate transition models that
are simpler and generalize better.
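To make the planning benefit concrete, below is a minimal sketch (not the paper's algorithm) of value iteration on a small tabular MDP in which a hand-specified affordance map restricts which actions are backed up in each state. The names `afforded_actions`, `P`, and `R` are illustrative assumptions, not quantities defined in the paper.

```python
# Minimal sketch: value iteration restricted to a per-state affordance set.
# The affordance map here is hand-specified for illustration only; the paper
# is concerned with learning such maps and the induced transition models.
import numpy as np

n_states, n_actions, gamma = 5, 3, 0.9
rng = np.random.default_rng(0)

# Tabular dynamics and rewards: P[s, a] is a distribution over next states.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Assumed affordance map: the subset of actions treated as feasible in each
# state (chosen arbitrarily here so every state affords at least one action).
afforded_actions = {s: [a for a in range(n_actions) if (s + a) % 2 == 0]
                    for s in range(n_states)}

def value_iteration(action_sets, tol=1e-8):
    """Run value iteration with backups limited to the given per-state action sets."""
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * P @ V  # Q[s, a] for all state-action pairs
        V_new = np.array([Q[s, action_sets[s]].max() for s in range(n_states)])
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new

# Planning over the full action set vs. only the afforded actions: the
# restricted backup considers fewer state-action pairs per sweep.
V_full = value_iteration({s: list(range(n_actions)) for s in range(n_states)})
V_aff = value_iteration(afforded_actions)
print(V_full, V_aff)
```

Restricting backups to the afforded subset is the planning speed-up the abstract refers to; the paper's contribution includes learning the affordance map from data rather than specifying it by hand, which this sketch does not attempt.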
Related papers
- Prioritized Generative Replay [121.83947140497655]
We propose a prioritized, parametric version of an agent's memory, using generative models to capture online experience.
This paradigm enables densification of past experience, with new generations that benefit from the generative model's generalization capacity.
We show this recipe can be instantiated using conditional diffusion models and simple relevance functions.
arXiv Detail & Related papers (2024-10-23T17:59:52Z)
- Strategic Classification With Externalities [11.36782598786846]
We propose a new variant of the strategic classification problem.
Motivated by real-world applications, our model crucially allows the manipulation of one agent to affect another.
We show that under certain assumptions, the pure Nash Equilibrium of this agent manipulation game is unique and can be efficiently computed.
arXiv Detail & Related papers (2024-10-10T15:28:04Z)
- Accelerating Hybrid Agent-Based Models and Fuzzy Cognitive Maps: How to Combine Agents who Think Alike? [0.0]
We present an approximation that combines agents who 'think alike', thus reducing the population size and the compute time.
Our innovation relies on representing agent behaviors as networks of rules and empirically evaluating different measures of distance between these networks.
arXiv Detail & Related papers (2024-09-01T19:45:15Z)
- On Stateful Value Factorization in Multi-Agent Reinforcement Learning [19.342676562701794]
We introduce Duelmix, a factorization algorithm that learns distinct per-agent utility estimators to improve performance.
Experiments on StarCraft II micromanagement and Box Pushing tasks demonstrate the benefits of our intuitions.
arXiv Detail & Related papers (2024-08-27T19:45:26Z)
- EMOTE: An Explainable architecture for Modelling the Other Through Empathy [26.85666453984719]
We design a simple architecture to model another agent's action-value function.
We learn an "Imagination Network" to transform the other agent's observed state.
This produces a human-interpretable "empathetic state" which, when presented to the learning agent, produces behaviours that mimic the other agent.
arXiv Detail & Related papers (2023-06-01T02:27:08Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks [62.48782506095565]
We show that due to the greedy nature of learning in deep neural networks, models tend to rely on just one modality while under-fitting the other modalities.
We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.
arXiv Detail & Related papers (2022-02-10T20:11:21Z)
- Learning What To Do by Simulating the Past [76.86449554580291]
We show that by combining a learned feature encoder with learned inverse models, we can enable agents to simulate human actions backwards in time to infer what they must have done.
The resulting algorithm is able to reproduce a specific skill in MuJoCo environments given a single state sampled from the optimal policy for that skill.
arXiv Detail & Related papers (2021-04-08T17:43:29Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)