Object Permanence Emerges in a Random Walk along Memory
- URL: http://arxiv.org/abs/2204.01784v1
- Date: Mon, 4 Apr 2022 18:28:24 GMT
- Title: Object Permanence Emerges in a Random Walk along Memory
- Authors: Pavel Tokmakov, Allan Jabri, Jie Li, Adrien Gaidon
- Abstract summary: We show that object permanence can emerge by optimizing for temporal coherence of memory.
This leads to a memory representation that stores occluded objects and predicts their motion, to better localize them.
The resulting model outperforms existing approaches on several datasets of increasing complexity and realism.
- Score: 37.78331373391444
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper proposes a self-supervised objective for learning representations
that localize objects under occlusion - a property known as object permanence.
A central question is the choice of learning signal in cases of total
occlusion. Rather than directly supervising the locations of invisible objects,
we propose a self-supervised objective that requires neither human annotation,
nor assumptions about object dynamics. We show that object permanence can
emerge by optimizing for temporal coherence of memory: we fit a Markov walk
along a space-time graph of memories, where the states in each time step are
non-Markovian features from a sequence encoder. This leads to a memory
representation that stores occluded objects and predicts their motion, to
better localize them. The resulting model outperforms existing approaches on
several datasets of increasing complexity and realism, despite requiring
minimal supervision and assumptions, and hence being broadly applicable.
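As a rough illustration of what "a Markov walk along a space-time graph" can look like, the sketch below builds row-stochastic transition matrices between node features at consecutive time steps, chains them into a single walk, and scores the walk against target end nodes. This is not the authors' implementation: the function names, the temperature value, the dot-product affinity, and the toy features are all assumptions made for illustration. In the paper the node states come from a sequence encoder, whereas here they are hand-written vectors.

```python
import math

def softmax(row, temperature=0.1):
    """Numerically stable softmax over one row of affinities."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def transition(src, dst, temperature=0.1):
    """Row-stochastic matrix: probability of stepping from each src node to each dst node."""
    return [
        softmax([sum(a * b for a, b in zip(u, v)) for v in dst], temperature)
        for u in src
    ]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def walk(frames, temperature=0.1):
    """Chain the per-step transitions into one matrix from the first to the last time step."""
    result = transition(frames[0], frames[1], temperature)
    for t in range(1, len(frames) - 1):
        result = matmul(result, transition(frames[t], frames[t + 1], temperature))
    return result

def walk_loss(frames, targets, temperature=0.1):
    """Mean negative log-probability that a walker starting at node i ends at targets[i]."""
    probs = walk(frames, temperature)
    return -sum(math.log(probs[i][j]) for i, j in enumerate(targets)) / len(targets)

# Toy data (assumed): three time steps, two nodes per step, 2-D features.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[1.0, 0.0], [0.0, 1.0]],
]
probs = walk(frames)
loss = walk_loss(frames, targets=[0, 1])
```

Minimizing a loss of this kind with respect to the features pushes nodes that belong to the same object at different times to stay mutually reachable, which is one way to read "temporal coherence" in the abstract.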
Related papers
- Tracking-Assisted Object Detection with Event Cameras [16.408606403997005]
Event-based object detection has recently garnered attention in the computer vision community.
However, feature asynchronism and sparsity make objects invisible when they have no motion relative to the camera.
In this paper, we consider those invisible objects as pseudo-occluded objects.
We exploit tracking strategies for pseudo-occluded objects to maintain their permanence and retain their bounding boxes.
(2024-03-27T08:11:25Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
(2023-09-01T03:34:12Z)
- Spotlight Attention: Robust Object-Centric Learning With a Spatial Locality Prior [88.9319150230121]
Object-centric vision aims to construct an explicit representation of the objects in a scene.
We incorporate a spatial-locality prior into state-of-the-art object-centric vision models.
We obtain significant improvements in segmenting objects in both synthetic and real-world datasets.
(2023-05-31T04:35:50Z)
- Tackling Background Distraction in Video Object Segmentation [7.187425003801958]
Video object segmentation (VOS) aims to densely track certain objects in videos.
One of the main challenges in this task is the existence of background distractors that appear similar to the target objects.
We propose three novel strategies to suppress such distractors.
Our model achieves performance comparable to contemporary state-of-the-art approaches while running in real time.
(2022-07-14T14:25:19Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that its unsupervised object discovery performance is competitive with a SlotAttention model on two datasets, and that it manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7-70 times faster to train.
(2022-04-05T09:25:28Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
(2022-03-18T21:13:56Z)
- Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition [4.143091738981101]
We show that an object-centric hidden representation in an encoder-decoder model with iterative glimpse attention yields effective integration of attention and recognition.
Our work takes a step toward a general architecture for how to integrate recurrent object-centric representation into the planning of attentional glimpses.
(2021-10-11T01:41:21Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
(2021-03-26T04:43:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.