Discovering Generalizable Spatial Goal Representations via Graph-based
Active Reward Learning
- URL: http://arxiv.org/abs/2211.15339v1
- Date: Thu, 24 Nov 2022 18:59:06 GMT
- Title: Discovering Generalizable Spatial Goal Representations via Graph-based
Active Reward Learning
- Authors: Aviv Netanyahu, Tianmin Shu, Joshua Tenenbaum, Pulkit Agrawal
- Abstract summary: We propose a reward learning approach, Graph-based Equivalence Mappings (GEM).
GEM represents a spatial goal specification by a reward function conditioned on i) a graph indicating important spatial relationships between objects and ii) state equivalence mappings for each edge in the graph.
We show that GEM can drastically improve the generalizability of the learned goal representations over strong baselines.
- Score: 17.58129740811116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider one-shot imitation learning for object
rearrangement tasks, where an AI agent needs to watch a single expert
demonstration and learn to perform the same task in different environments. To
achieve strong generalization, the AI agent must infer the spatial goal
specification for the task. However, there can be multiple goal specifications
that fit the given demonstration. To address this, we propose a reward learning
approach, Graph-based Equivalence Mappings (GEM), that can discover spatial
goal representations that are aligned with the intended goal specification,
enabling successful generalization in unseen environments. Specifically, GEM
represents a spatial goal specification by a reward function conditioned on i)
a graph indicating important spatial relationships between objects and ii)
state equivalence mappings for each edge in the graph indicating invariant
properties of the corresponding relationship. GEM combines inverse
reinforcement learning and active reward learning to efficiently improve the
reward function by utilizing the graph structure and domain randomization
enabled by the equivalence mappings. We conducted experiments with simulated
oracles and with human subjects. The results show that GEM can drastically
improve the generalizability of the learned goal representations over strong
baselines.
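To make the representation concrete, the following is a minimal, hypothetical sketch of the goal structure the abstract describes: a graph whose edges mark important spatial relations between objects, each edge paired with an equivalence mapping that expresses invariant properties of that relation, and a reward scored on the canonicalized relation. All names, the 2D state, and the specific mappings are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, Tuple[float, float]]  # object name -> 2D position
Offset = Tuple[float, float]            # relative offset between two objects

@dataclass
class SpatialGoal:
    # Graph edges: object pairs whose spatial relationship matters.
    edges: List[Tuple[str, str]]
    # Per-edge equivalence mapping: folds an offset into a canonical form,
    # encoding invariances of that relation (e.g. mirror symmetry).
    equivalences: Dict[Tuple[str, str], Callable[[Offset], Offset]]
    # Reward on the canonicalized offset; summed over edges.
    edge_reward: Callable[[Offset], float]

    def reward(self, state: State) -> float:
        total = 0.0
        for a, b in self.edges:
            dx = state[b][0] - state[a][0]
            dy = state[b][1] - state[a][1]
            canon = self.equivalences[(a, b)]((dx, dy))
            total += self.edge_reward(canon)
        return total

# Example goal: "cup near plate", invariant to the direction of the offset
# (the equivalence mapping folds every offset into the positive quadrant).
goal = SpatialGoal(
    edges=[("cup", "plate")],
    equivalences={("cup", "plate"): lambda d: (abs(d[0]), abs(d[1]))},
    edge_reward=lambda d: 1.0 if d[0] < 0.5 and d[1] < 0.5 else 0.0,
)
print(goal.reward({"cup": (0.0, 0.0), "plate": (0.3, -0.2)}))  # 1.0
```

Because the equivalence mapping declares which variations of a relation are goal-preserving, states related by it can be randomized during training without changing the reward, which is how the abstract's domain randomization would operate on this structure.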
Related papers
- Improving Node Representation by Boosting Target-Aware Contrastive Loss [10.73390567832967]
We introduce Target-Aware Contrastive Learning (Target-aware CL) to enhance target task performance.
By minimizing XTCL, Target-aware CL increases the mutual information between the target task and node representations.
We show experimentally that XTCL significantly improves the performance on two target tasks.
arXiv Detail & Related papers (2024-10-04T20:08:24Z) - GOMAA-Geo: GOal Modality Agnostic Active Geo-localization [49.599465495973654]
We consider the task of active geo-localization (AGL) in which an agent uses a sequence of visual cues observed during aerial navigation to find a target specified through multiple possible modalities.
GOMAA-Geo is a goal modality agnostic active geo-localization agent for zero-shot generalization between different goal modalities.
arXiv Detail & Related papers (2024-06-04T02:59:36Z) - Goal Space Abstraction in Hierarchical Reinforcement Learning via
Set-Based Reachability Analysis [0.5409704301731713]
We introduce a Feudal HRL algorithm that concurrently learns both the goal representation and a hierarchical policy.
We evaluate our approach on complex navigation tasks, showing the learned representation is interpretable, transferable, and results in data-efficient learning.
arXiv Detail & Related papers (2023-09-14T12:39:26Z) - Top-Down Visual Attention from Analysis by Synthesis [87.47527557366593]
We consider top-down attention from a classic Analysis-by-Synthesis (AbS) perspective of vision.
We propose Analysis-by-Synthesis Vision Transformer (AbSViT), a top-down modulated ViT model that variationally approximates AbS and achieves controllable top-down attention.
arXiv Detail & Related papers (2023-03-23T05:17:05Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Compositional Generalization in Grounded Language Learning via Induced
Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z) - Weakly Supervised Disentangled Representation for Goal-conditioned
Reinforcement Learning [15.698612710580447]
We propose a skill learning framework DR-GRL that aims to improve the sample efficiency and policy generalization.
In a weakly supervised manner, we propose a Spatial Transform AutoEncoder (STAE) to learn an interpretable and controllable representation.
We empirically demonstrate that DR-GRL significantly outperforms the previous methods in sample efficiency and policy generalization.
arXiv Detail & Related papers (2022-02-28T09:05:14Z) - Soft Hierarchical Graph Recurrent Networks for Many-Agent Partially
Observable Environments [9.067091068256747]
We propose a novel network structure called hierarchical graph recurrent network (HGRN) for multi-agent cooperation under partial observability.
Based on the above techniques, we propose a value-based MADRL algorithm called Soft-HGRN and its actor-critic variant named SAC-HRGN.
arXiv Detail & Related papers (2021-09-05T09:51:25Z) - Mutual Graph Learning for Camouflaged Object Detection [31.422775969808434]
A major challenge is that intrinsic similarities between foreground objects and background surroundings make the features extracted by deep models indistinguishable.
We design a novel Mutual Graph Learning model, which generalizes the idea of conventional mutual learning from regular grids to the graph domain.
In contrast to most mutual learning approaches that use a shared function to model all between-task interactions, MGL is equipped with typed functions for handling different complementary relations.
arXiv Detail & Related papers (2021-04-03T10:14:39Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM, i.e., RGM, that seeks the node correspondence between pairwise graphs.
Our method differs from previous deep graph matching models in that they focus on front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - ConsNet: Learning Consistency Graph for Zero-Shot Human-Object
Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of &lt;human, action, object&gt; triplets in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities.
arXiv Detail & Related papers (2020-08-14T09:11:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.