Self-supervised Visual Reinforcement Learning with Object-centric Representations
- URL: http://arxiv.org/abs/2011.14381v1
- Date: Sun, 29 Nov 2020 14:55:09 GMT
- Title: Self-supervised Visual Reinforcement Learning with Object-centric Representations
- Authors: Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius
- Abstract summary: We propose to use object-centric representations as a modular and structured observation space.
We show that the structure in the representations in combination with goal-conditioned attention policies helps the autonomous agent to discover and learn useful skills.
- Score: 11.786249372283562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous agents need large repertoires of skills to act reasonably on new
tasks that they have not seen before. However, acquiring these skills using
only a stream of high-dimensional, unstructured, and unlabeled observations is
a tricky challenge for any autonomous agent. Previous methods have used
variational autoencoders to encode a scene into a low-dimensional vector that
can be used as a goal for an agent to discover new skills. Nevertheless, in
compositional/multi-object environments it is difficult to disentangle all the
factors of variation into such a fixed-length representation of the whole
scene. We propose to use object-centric representations as a modular and
structured observation space, which is learned with a compositional generative
world model. We show that the structure in the representations in combination
with goal-conditioned attention policies helps the autonomous agent to discover
and learn useful skills. These skills can be further combined to address
compositional tasks like the manipulation of several different objects.
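As a concrete illustration of the goal-conditioned attention policy idea, here is a minimal PyTorch sketch: the goal (itself an object-centric vector) attends over the K object representations extracted from the current scene, and the attended summary is fed to a small policy head. All class names, dimensions, and layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GoalConditionedAttentionPolicy(nn.Module):
    """Minimal sketch (hypothetical): the goal vector attends over the K
    object slots of the current observation; the attended summary drives
    an MLP policy head. Sizes are illustrative, not from the paper."""

    def __init__(self, obj_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.query = nn.Linear(obj_dim, hidden)   # embeds the goal vector
        self.key = nn.Linear(obj_dim, hidden)     # embeds each object slot
        self.value = nn.Linear(obj_dim, hidden)
        self.policy = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, act_dim)
        )

    def forward(self, objects: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        # objects: (batch, K, obj_dim); goal: (batch, obj_dim)
        q = self.query(goal).unsqueeze(1)                 # (batch, 1, hidden)
        k, v = self.key(objects), self.value(objects)     # (batch, K, hidden)
        attn = torch.softmax((q * k).sum(-1) / k.shape[-1] ** 0.5, dim=-1)
        summary = (attn.unsqueeze(-1) * v).sum(dim=1)     # (batch, hidden)
        return self.policy(summary)                       # action logits

policy = GoalConditionedAttentionPolicy(obj_dim=32, act_dim=4)
logits = policy(torch.randn(2, 5, 32), torch.randn(2, 32))  # 2 scenes, 5 objects
```

Because the policy consumes a fixed-size attended summary rather than a concatenation of all object vectors, the same network can be reused as the number of objects in the scene varies.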
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Learning Reusable Manipulation Strategies [86.07442931141634]
Humans demonstrate an impressive ability to acquire and generalize manipulation "tricks".
We present a framework that enables machines to acquire such manipulation skills through a single demonstration and self-play.
These learned mechanisms and samplers can be seamlessly integrated into standard task and motion planners.
arXiv Detail & Related papers (2023-11-06T17:35:42Z)
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
We present a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that our method can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
- Homomorphism Autoencoder -- Learning Group Structured Representations from Observed Transitions [51.71245032890532]
We propose methods enabling an agent acting upon the world to learn internal representations of sensory information consistent with actions that modify it.
In contrast to existing work, our approach does not require prior knowledge of the group and does not restrict the set of actions the agent can perform.
arXiv Detail & Related papers (2022-07-25T11:22:48Z)
- Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams [64.82800502603138]
This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream.
The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations.
Our experiments leverage 3D virtual environments and they show that the proposed agents can learn to distinguish objects just by observing the video stream.
arXiv Detail & Related papers (2022-04-26T09:52:31Z)
- Compositional Multi-Object Reinforcement Learning with Linear Relation Networks [38.59852895970774]
We focus on models that can learn manipulation tasks in fixed multi-object settings and extrapolate this skill zero-shot without any drop in performance when the number of objects changes.
Our approach, which scales linearly in the number of objects $K$, allows agents to extrapolate and generalize zero-shot to any new number of objects; a sketch of this linear scaling follows this entry.
arXiv Detail & Related papers (2022-01-31T17:53:30Z)
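The linear scaling claimed above can be illustrated with a hypothetical sketch: each of the K object features is related to a single learned context vector (one relation per object, O(K)) instead of to every other object (O(K^2) in a classical pairwise relation network), followed by a permutation-invariant sum. This is a toy construction under stated assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class LinearRelationEncoder(nn.Module):
    """Hypothetical O(K) object encoder: relate each of the K object
    features to one learned context vector (instead of to every other
    object, which would cost O(K^2)), then pool with a permutation-
    invariant sum. Purely illustrative."""

    def __init__(self, obj_dim: int, hidden: int = 64):
        super().__init__()
        self.context = nn.Parameter(torch.randn(obj_dim))
        self.relate = nn.Sequential(
            nn.Linear(2 * obj_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )

    def forward(self, objects: torch.Tensor) -> torch.Tensor:
        # objects: (batch, K, obj_dim); works for any K, so a policy built
        # on top can be evaluated zero-shot when the object count changes.
        ctx = self.context.expand(objects.shape[0], objects.shape[1], -1)
        pairs = torch.cat([objects, ctx], dim=-1)   # one relation per object
        return self.relate(pairs).sum(dim=1)        # permutation-invariant pool

enc = LinearRelationEncoder(obj_dim=16)
print(enc(torch.randn(2, 3, 16)).shape)  # works with K=3 ...
print(enc(torch.randn(2, 7, 16)).shape)  # ... and zero-shot with K=7
```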
- Object Pursuit: Building a Space of Objects via Discriminative Weight Generation [23.85039747700698]
We propose a framework to continuously learn object-centric representations for visual learning and understanding.
We leverage interactions to sample diverse variations of an object and the corresponding training signals while learning the object-centric representations.
We perform an extensive study of the key features of the proposed framework and analyze the characteristics of the learned representations.
arXiv Detail & Related papers (2021-12-15T08:25:30Z)
- Self-supervised Reinforcement Learning with Independently Controllable Subgoals [20.29444813790076]
Self-supervised agents set their own goals by exploiting the structure in the environment.
Some of them were applied to learn basic manipulation skills in compositional multi-object environments.
We propose a novel self-supervised agent that estimates relations between environment components and uses them to independently control different parts of the environment state.
arXiv Detail & Related papers (2021-09-09T10:21:02Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Domain-Robust Visual Imitation Learning with Mutual Information Constraints [0.0]
We introduce a new algorithm called Disentangling Generative Adversarial Imitation Learning (DisentanGAIL).
Our algorithm enables autonomous agents to learn directly from high dimensional observations of an expert performing a task.
arXiv Detail & Related papers (2021-03-08T21:18:58Z)
- Object-Centric Learning with Slot Attention [43.684193749891506]
We present the Slot Attention module, an architectural component that interfaces with perceptual representations.
Slot Attention produces task-dependent abstract representations which we call slots.
We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions; a compact sketch of the module follows this entry.
arXiv Detail & Related papers (2020-06-26T15:31:57Z)
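For reference, below is a compact PyTorch sketch of the Slot Attention module following the published pseudocode of Locatello et al. (2020), with the residual MLP omitted for brevity; hyperparameters and initialization details are simplified.

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    """Compact sketch of Slot Attention (Locatello et al., 2020),
    simplified from the paper's pseudocode; residual MLP omitted."""

    def __init__(self, num_slots: int, dim: int, iters: int = 3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: (batch, num_inputs, dim), e.g. a flattened CNN feature map
        b, n, d = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        # sample initial slots from a learned Gaussian
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, d, device=inputs.device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # softmax over slots: slots compete to explain each input
            attn = torch.softmax(
                torch.einsum("bnd,bsd->bns", k, q) * self.scale, dim=-1)
            attn = attn / attn.sum(dim=1, keepdim=True)  # weighted mean over inputs
            updates = torch.einsum("bns,bnd->bsd", attn, v)
            slots = self.gru(
                updates.reshape(-1, d), slots.reshape(-1, d)).view(b, -1, d)
        return slots  # (batch, num_slots, dim): one vector per object slot
```

Usage: `SlotAttention(num_slots=5, dim=64)(features)` with `features` of shape `(batch, num_inputs, 64)` returns one 64-dimensional vector per slot; the softmax over the slot axis is what makes slots compete for input features.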
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.