Cycle Consistency Driven Object Discovery
- URL: http://arxiv.org/abs/2306.02204v2
- Date: Thu, 7 Dec 2023 22:39:21 GMT
- Title: Cycle Consistency Driven Object Discovery
- Authors: Aniket Didolkar, Anirudh Goyal, Yoshua Bengio
- Abstract summary: We introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot.
By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance.
Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.
- Score: 75.60399804639403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing deep learning models that effectively learn object-centric
representations, akin to human cognition, remains a challenging task. Existing
approaches facilitate object discovery by representing objects as fixed-size
vectors, called "slots" or "object files". While these approaches have
shown promise in certain scenarios, they still exhibit certain limitations.
First, they rely on architectural priors which can be unreliable and usually
require meticulous engineering to identify the correct objects. Second, there
has been a notable gap in investigating the practical utility of these
representations in downstream tasks. To address the first limitation, we
introduce a method that explicitly optimizes the constraint that each object in
a scene should be associated with a distinct slot. We formalize this constraint
by introducing consistency objectives which are cyclic in nature. By
integrating these consistency objectives into various existing slot-based
object-centric methods, we showcase substantial improvements in
object-discovery performance. These enhancements consistently hold true across
both synthetic and real-world scenes, underscoring the effectiveness and
adaptability of the proposed approach. To tackle the second limitation, we
apply the learned object-centric representations from the proposed method to
two downstream reinforcement learning tasks, demonstrating considerable
performance enhancements compared to conventional slot-based and monolithic
representation learning methods. Our results suggest that the proposed approach
not only improves object discovery, but also provides richer features for
downstream tasks.
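The abstract states the constraint (each object should map to a distinct slot) but not its formulation. The following is a minimal sketch of one way a cyclic objective of this kind could look, not the authors' implementation. Every detail is an assumption made for illustration: the dot-product similarity, the two softmax normalizations, the `temperature` parameter, and the hypothetical function name `slot_feature_slot_loss`. The idea sketched here is that a walk from a slot to the image features and back should return to the same slot, so the round-trip matrix is pushed toward the identity.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slot_feature_slot_loss(slots, features, temperature=1.0):
    """Hypothetical cyclic loss: slot -> feature -> slot should be the identity.

    slots:    list of K slot vectors
    features: list of N feature vectors (same dimensionality as the slots)
    Returns (loss, round_trip), where round_trip is a K x K row-stochastic matrix.
    """
    num_slots, num_features = len(slots), len(features)
    # sim[i][j] = similarity between slot i and feature j (assumed: dot product)
    sim = [[dot(s, f) for f in features] for s in slots]
    # Probability of stepping from slot i to feature j (normalized over features)
    slot_to_feat = [softmax(row, temperature) for row in sim]
    # Probability of stepping from feature j to slot k (normalized over slots)
    feat_to_slot = [
        softmax([sim[k][j] for k in range(num_slots)], temperature)
        for j in range(num_features)
    ]
    # Round-trip probability of starting at slot i and ending at slot k
    round_trip = [
        [
            sum(slot_to_feat[i][j] * feat_to_slot[j][k] for j in range(num_features))
            for k in range(num_slots)
        ]
        for i in range(num_slots)
    ]
    # Cross-entropy against the identity: only the diagonal entries contribute
    loss = -sum(math.log(round_trip[i][i] + 1e-12) for i in range(num_slots)) / num_slots
    return loss, round_trip


# Toy scene: two "objects", each covering two feature locations.
features = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]]
distinct_slots = [[5.0, 0.0], [0.0, 5.0]]   # one slot per object
collapsed_slots = [[5.0, 0.0], [5.0, 0.0]]  # both slots bind to the same object

loss_distinct, _ = slot_feature_slot_loss(distinct_slots, features)
loss_collapsed, _ = slot_feature_slot_loss(collapsed_slots, features)
print(loss_distinct < loss_collapsed)
```

In this toy setting the loss is lower when the two slots bind to different objects than when they collapse onto the same one, which is the behavior the stated constraint asks for. In practice such a term would be added to the training loss of an existing slot-based model, with a weighting that this sketch does not attempt to specify.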
Related papers
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Object-Centric Conformance Alignments with Synchronization (Extended Version) [57.76661079749309]
We present a new formalism that combines the ability of object-centric Petri nets to capture one-to-many relations and the one of Petri nets with identifiers to compare and synchronize objects based on their identity.
We propose a conformance checking approach for such nets based on an encoding in satisfiability modulo theories (SMT).
arXiv Detail & Related papers (2023-12-13T21:53:32Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Boosting Object Representation Learning via Motion and Object Continuity [22.512380611375846]
We propose to exploit object motion and continuity, i.e., objects do not pop in and out of existence.
The resulting Motion and Object Continuity scheme can be instantiated using any baseline object detection model.
Our results show large improvements in the performances of a SOTA model in terms of object discovery, convergence speed and overall latent object representations.
arXiv Detail & Related papers (2022-11-16T09:36:41Z)
- Tackling Background Distraction in Video Object Segmentation [7.187425003801958]
Video object segmentation (VOS) aims to densely track certain objects in videos.
One of the main challenges in this task is the existence of background distractors that appear similar to the target objects.
We propose three novel strategies to suppress such distractors.
Our model achieves a comparable performance to contemporary state-of-the-art approaches, even with real-time performance.
arXiv Detail & Related papers (2022-07-14T14:25:19Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Object Pursuit: Building a Space of Objects via Discriminative Weight Generation [23.85039747700698]
We propose a framework to continuously learn object-centric representations for visual learning and understanding.
We leverage interactions to sample diverse variations of an object and the corresponding training signals while learning the object-centric representations.
We perform an extensive study of the key features of the proposed framework and analyze the characteristics of the learned representations.
arXiv Detail & Related papers (2021-12-15T08:25:30Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Relevance-Guided Modeling of Object Dynamics for Reinforcement Learning [0.0951828574518325]
Current deep reinforcement learning (RL) approaches incorporate minimal prior knowledge about the environment.
We propose a framework for reasoning about object dynamics and behavior to rapidly determine minimal and task-specific object representations.
We also highlight the potential of this framework on several Atari games, using our object representation and standard RL and planning algorithms to learn dramatically faster than existing deep RL algorithms.
arXiv Detail & Related papers (2020-03-03T08:18:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.