Affordance as general value function: A computational model
- URL: http://arxiv.org/abs/2010.14289v3
- Date: Sat, 8 May 2021 00:15:11 GMT
- Title: Affordance as general value function: A computational model
- Authors: Daniel Graves, Johannes Günther, Jun Luo
- Abstract summary: General value functions (GVFs) are long-term predictive summaries of the outcomes of agents following specific policies in the environment.
We show that GVFs realize affordance prediction as a form of direct perception.
We demonstrate that GVFs provide the right framework for learning affordances in real-world applications.
- Score: 8.34897697233928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: General value functions (GVFs) in the reinforcement learning (RL) literature
are long-term predictive summaries of the outcomes of agents following specific
policies in the environment. Affordances as perceived action possibilities with
specific valence may be cast into predicted policy-relative goodness and
modelled as GVFs. A systematic explication of this connection shows that GVFs
and especially their deep learning embodiments (1) realize affordance
prediction as a form of direct perception, (2) illuminate the fundamental
connection between action and perception in affordance, and (3) offer a
scalable way to learn affordances using RL methods. Through an extensive review
of existing literature on GVF applications and representative affordance
research in robotics, we demonstrate that GVFs provide the right framework for
learning affordances in real-world applications. In addition, we highlight a
few new avenues of research opened up by the perspective of "affordance as
GVF", including using GVFs for orchestrating complex behaviors.
Related papers
- Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation [15.058687283978077]
Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios.
Existing VLN methods struggle with the issue of spurious associations, resulting in poor generalization with a significant performance gap between seen and unseen environments.
We propose a unified framework CausalVLN based on the causal learning paradigm to train a robust navigator capable of learning unbiased feature representations.
arXiv Detail & Related papers (2024-03-06T02:01:38Z) - A General Theoretical Paradigm to Understand Learning from Human
Preferences [33.65903139056413]
We derive a new general objective called ΨPO for learning from human preferences that is expressed in terms of pairwise preferences.
This new general objective allows us to perform an in-depth analysis of the behavior of RLHF and DPO.
arXiv Detail & Related papers (2023-10-18T15:21:28Z) - On the Importance of Exploration for Generalization in Reinforcement
Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art performance on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - Generalizing Goal-Conditioned Reinforcement Learning with Variational
Causal Reasoning [24.09547181095033]
A causal graph is a structure built upon the relations between objects and events.
We propose a framework with theoretical performance guarantees that alternates between two steps.
Our performance improvement is attributed to the virtuous cycle of causal discovery, transition modeling, and policy training.
arXiv Detail & Related papers (2022-07-19T05:31:16Z) - A Unified Off-Policy Evaluation Approach for General Value Function [131.45028999325797]
General Value Function (GVF) is a powerful tool to represent both predictive and retrospective knowledge in reinforcement learning (RL).
In this paper, we propose a new algorithm called GenTD for off-policy GVFs evaluation.
We show that GenTD learns multiple interrelated multi-dimensional GVFs as efficiently as a single canonical scalar value function; a generic off-policy GVF evaluation sketch follows this list.
arXiv Detail & Related papers (2021-07-06T16:20:34Z) - Which Mutual-Information Representation Learning Objectives are
Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z) - Variational Empowerment as Representation Learning for Goal-Based
Reinforcement Learning [114.07623388322048]
We discuss how standard goal-conditioned RL (GCRL) is encapsulated by the variational empowerment objective.
Our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
arXiv Detail & Related papers (2021-06-02T18:12:26Z) - Off-Policy Imitation Learning from Observations [78.30794935265425]
Learning from Observations (LfO) is a practical reinforcement learning scenario from which many applications can benefit.
We propose a sample-efficient LfO approach that enables off-policy optimization in a principled manner.
Our approach is comparable with state-of-the-art methods on locomotion tasks in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2021-02-25T21:33:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.