Object-Centric Learning with Slot Attention
- URL: http://arxiv.org/abs/2006.15055v2
- Date: Wed, 14 Oct 2020 08:51:40 GMT
- Title: Object-Centric Learning with Slot Attention
- Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh
Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
- Abstract summary: We present the Slot Attention module, an architectural component that interfaces with perceptual representations.
Slot Attention produces task-dependent abstract representations which we call slots.
We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions.
- Score: 43.684193749891506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning object-centric representations of complex scenes is a promising step
towards enabling efficient abstract reasoning from low-level perceptual
features. Yet, most deep learning approaches learn distributed representations
that do not capture the compositional properties of natural scenes. In this
paper, we present the Slot Attention module, an architectural component that
interfaces with perceptual representations such as the output of a
convolutional neural network and produces a set of task-dependent abstract
representations which we call slots. These slots are exchangeable and can bind
to any object in the input by specializing through a competitive procedure over
multiple rounds of attention. We empirically demonstrate that Slot Attention
can extract object-centric representations that enable generalization to unseen
compositions when trained on unsupervised object discovery and supervised
property prediction tasks.
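The iterative, competitive attention procedure described in the abstract admits a compact implementation. The sketch below is a minimal PyTorch rendering of that idea: slots are sampled from a single learned Gaussian (so they are exchangeable), the attention softmax is taken over the slot axis so slots compete for input features, and each slot is then updated with a GRU followed by a residual MLP. Layer sizes, the three-iteration default, and naming here are illustrative assumptions, not the authors' reference code.

```python
import torch
import torch.nn as nn


class SlotAttention(nn.Module):
    """Minimal sketch of an iterative slot-attention update (illustrative, not the reference code)."""

    def __init__(self, num_slots: int, dim: int, iters: int = 3, hidden_dim: int = 128, eps: float = 1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5

        # Slots are exchangeable: all are drawn from one learned Gaussian.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))

        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)

        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, dim))

        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_pre_mlp = nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: (batch, num_inputs, dim), e.g. a flattened CNN feature map.
        b, n, d = inputs.shape
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)

        # Sample initial slots; the randomness breaks symmetry between the exchangeable slots.
        slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(
            b, self.num_slots, d, device=inputs.device
        )

        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))

            # Softmax over the *slot* axis: slots compete for each input feature.
            attn = torch.einsum("bnd,bkd->bnk", k, q).mul(self.scale).softmax(dim=-1) + self.eps
            attn = attn / attn.sum(dim=1, keepdim=True)  # weighted mean over inputs, per slot

            updates = torch.einsum("bnk,bnd->bkd", attn, v)

            # Recurrent update of each slot, followed by a residual MLP.
            slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, -1, d)
            slots = slots + self.mlp(self.norm_pre_mlp(slots))

        return slots  # (batch, num_slots, dim)


if __name__ == "__main__":
    features = torch.randn(2, 64, 64)                 # e.g. an 8x8 feature map flattened to 64 positions
    slots = SlotAttention(num_slots=5, dim=64)(features)
    print(slots.shape)                                # torch.Size([2, 5, 64])
```

The key design choice is normalizing attention over slots rather than over inputs: each input location distributes its explanatory weight across slots, which is what drives the slots to specialize to different objects over the attention iterations.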
Related papers
- Rotating Features for Object Discovery [74.1465486264609] (2023-06-01T12:16:26Z)
We present Rotating Features, a generalization of complex-valued features to higher dimensions, and a new evaluation procedure for extracting objects from distributed representations.
Together, these advancements enable us to scale distributed object-centric representations from simple toy data to real-world data.
- Object-centric Learning with Cyclic Walks between Parts and Whole [23.561434374097864] (2023-02-16T01:54:06Z)
Learning object-centric representations from complex natural environments equips both humans and machines with reasoning abilities grounded in low-level perceptual features.
We propose cyclic walks between perceptual features extracted from vision transformers and object entities.
In contrast to object-centric models that attach a decoder for pixel-level or feature-level reconstruction, our cyclic walks provide strong learning signals.
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059] (2022-10-11T15:11:15Z)
Ours is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that our approach can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837] (2022-05-30T17:50:59Z)
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819] (2022-04-05T09:25:28Z)
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves unsupervised object discovery performance competitive with a Slot Attention model on two datasets, and manages to disentangle objects in a third dataset where Slot Attention fails - all while being 7-70 times faster to train.
- Object Pursuit: Building a Space of Objects via Discriminative Weight Generation [23.85039747700698] (2021-12-15T08:25:30Z)
We propose a framework to continuously learn object-centric representations for visual learning and understanding.
We leverage interactions to sample diverse variations of an object and the corresponding training signals while learning the object-centric representations.
We perform an extensive study of the key features of the proposed framework and analyze the characteristics of the learned representations.
- Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition [4.143091738981101] (2021-10-11T01:41:21Z)
We show that an object-centric hidden representation in an encoder-decoder model with iterative glimpse attention yields effective integration of attention and recognition.
Our work takes a step toward a general architecture for integrating recurrent object-centric representations into the planning of attentional glimpses.
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917] (2021-07-23T11:59:40Z)
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step toward explicitly representing visual relationships and using them for complex cognitive procedures.