Generalization and Robustness Implications in Object-Centric Learning
- URL: http://arxiv.org/abs/2107.00637v1
- Date: Thu, 1 Jul 2021 17:51:11 GMT
- Title: Generalization and Robustness Implications in Object-Centric Learning
- Authors: Andrea Dittadi, Samuele Papa, Michele De Vita, Bernhard Schölkopf, Ole Winther, Francesco Locatello
- Abstract summary: In this paper, we train state-of-the-art unsupervised models on five common multi-object datasets.
From our experimental study, we find object-centric representations to be generally useful for downstream tasks.
- Score: 23.021791024676986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The idea behind object-centric representation learning is that natural scenes
can better be modeled as compositions of objects and their relations as opposed
to distributed representations. This inductive bias can be injected into neural
networks to potentially improve systematic generalization and learning
efficiency of downstream tasks in scenes with multiple objects. In this paper,
we train state-of-the-art unsupervised models on five common multi-object
datasets and evaluate segmentation accuracy and downstream object property
prediction. In addition, we study systematic generalization and robustness by
investigating the settings where either single objects are out-of-distribution
-- e.g., having unseen colors, textures, and shapes -- or global properties of
the scene are altered -- e.g., by occlusions, cropping, or increasing the
number of objects. From our experimental study, we find object-centric
representations to be generally useful for downstream tasks and robust to
shifts in the data distribution, especially if shifts affect single objects.
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Object-centric architectures enable efficient causal representation learning [51.6196391784561]
We show that when the observations are of multiple objects, the generative function is no longer injective and disentanglement fails in practice.
We develop an object-centric architecture that leverages weak supervision from sparse perturbations to disentangle each object's properties.
This approach is more data-efficient in the sense that it requires significantly fewer perturbations than a comparable approach that encodes to a Euclidean space.
arXiv Detail & Related papers (2023-10-29T16:01:03Z)
- Compositional Scene Modeling with Global Object-Centric Representations [44.43366905943199]
Humans can easily identify the same object even under occlusion, completing the occluded parts from a canonical image of the object held in memory.
This paper proposes a compositional scene modeling method to infer global representations of canonical images of objects without any supervision.
arXiv Detail & Related papers (2022-11-21T14:36:36Z)
- Robust and Controllable Object-Centric Learning through Energy-based Models [95.68748828339059]
The proposed method is a conceptually simple and general approach to learning object-centric representations through an energy-based model.
We show that it can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations.
arXiv Detail & Related papers (2022-10-11T15:11:15Z)
- Inductive Biases for Object-Centric Representations of Complex Textures [13.045904773946367]
We use neural style transfer to generate datasets where objects have complex textures while still retaining ground-truth annotations.
We find that, when a model effectively balances the importance of shape and appearance in the training objective, it can achieve better separation of the objects and learn more useful object representations.
arXiv Detail & Related papers (2022-04-18T17:34:37Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves unsupervised object discovery performance competitive with a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Towards Self-Supervised Learning of Global and Object-Centric Representations [4.36572039512405]
We discuss key aspects of learning structured object-centric representations with self-supervision.
We validate our insights through several experiments on the CLEVR dataset.
arXiv Detail & Related papers (2022-03-11T15:18:47Z)
- Object Pursuit: Building a Space of Objects via Discriminative Weight Generation [23.85039747700698]
We propose a framework to continuously learn object-centric representations for visual learning and understanding.
We leverage interactions to sample diverse variations of an object and the corresponding training signals while learning the object-centric representations.
We perform an extensive study of the key features of the proposed framework and analyze the characteristics of the learned representations.
arXiv Detail & Related papers (2021-12-15T08:25:30Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds [109.0016923028653]
We learn point cloud representation by bidirectional reasoning between the local structures and the global shape without human supervision.
We show that our unsupervised model surpasses the state-of-the-art supervised methods on both synthetic and real-world 3D object classification datasets.
arXiv Detail & Related papers (2020-03-29T08:26:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.