Learning What and Where -- Unsupervised Disentangling Location and
Identity Tracking
- URL: http://arxiv.org/abs/2205.13349v1
- Date: Thu, 26 May 2022 13:30:14 GMT
- Title: Learning What and Where -- Unsupervised Disentangling Location and
Identity Tracking
- Authors: Manuel Traub, Sebastian Otte, Tobias Menge, Matthias Karlbauer, Jannik
Thümmel, Martin V. Butz
- Abstract summary: We introduce an unsupervised LOCation and Identity tracking system (Loci).
Inspired by the dorsal-ventral pathways in the brain, Loci tackles the what-and-where binding problem by means of a self-supervised segregation mechanism.
Loci may set the stage for deeper, explanation-oriented video processing.
- Score: 0.44040106718326594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our brain can almost effortlessly decompose visual data streams into
background and salient objects. Moreover, it can track the objects and
anticipate their motion and interactions. In contrast, recent object reasoning
datasets, such as CATER, have revealed fundamental shortcomings of current
vision-based AI systems, particularly when targeting explicit object encodings,
object permanence, and object reasoning. We introduce an unsupervised
disentangled LOCation and Identity tracking system (Loci), which excels on the
CATER tracking challenge. Inspired by the dorsal-ventral pathways in the brain,
Loci tackles the what-and-where binding problem by means of a self-supervised
segregation mechanism. Our autoregressive neural network partitions and
distributes the visual input stream across separate, identically-parameterized
and autonomously recruited neural network modules. Each module binds what with
where, that is, compressed Gestalt encodings with locations. On the deep latent
encoding levels interaction dynamics are processed. Besides exhibiting superior
performance in current benchmarks, we propose that Loci may set the stage for
deeper, explanation-oriented video processing -- akin to some deeper networked
processes in the brain that appear to integrate individual entity and
spatiotemporal interaction dynamics into event structures.
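
The idea of identically-parameterized modules that each bind "what" with "where" can be illustrated with a toy sketch. This is not the authors' implementation: the names `SlotCode`, `SharedSlotEncoder`, `toy_scene`, and `encode_scene` are hypothetical, the object masks are supplied by hand (Loci learns the segregation without supervision), the "Gestalt" code is just a random linear projection, and no autoregressive prediction or interaction dynamics are modeled. It only shows one set of weights being reused across slots, with each slot producing an appearance code paired with a location.

```python
from dataclasses import dataclass
from typing import List, Tuple
import random

Grid = List[List[float]]


@dataclass
class SlotCode:
    gestalt: List[float]            # "what": compressed appearance code
    position: Tuple[float, float]   # "where": (row, col) centre of mass of the mask


class SharedSlotEncoder:
    """Toy module: a single weight matrix reused by every slot (assumption)."""

    def __init__(self, height: int, width: int, gestalt_dim: int, seed: int = 0):
        rng = random.Random(seed)
        self.height, self.width = height, width
        # One random projection from flattened pixels to the gestalt code.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(height * width)]
                        for _ in range(gestalt_dim)]

    def encode(self, frame: Grid, mask: Grid) -> SlotCode:
        cells = [(r, c) for r in range(self.height) for c in range(self.width)]
        total = sum(mask[r][c] for r, c in cells)
        if total == 0:
            raise ValueError("mask selects no pixels")
        # "What": project only the pixels this slot is responsible for.
        masked = [frame[r][c] * mask[r][c] for r, c in cells]
        gestalt = [sum(w * v for w, v in zip(row, masked)) for row in self.weights]
        # "Where": mask-weighted centre of mass.
        position = (sum(r * mask[r][c] for r, c in cells) / total,
                    sum(c * mask[r][c] for r, c in cells) / total)
        return SlotCode(gestalt, position)


def toy_scene() -> Tuple[Grid, List[Grid]]:
    """A 4x4 frame with two hand-placed objects and their masks."""
    frame = [[0.0] * 4 for _ in range(4)]
    mask_a = [[0.0] * 4 for _ in range(4)]
    mask_b = [[0.0] * 4 for _ in range(4)]
    for r, c in [(0, 0), (0, 1)]:
        frame[r][c], mask_a[r][c] = 1.0, 1.0
    for r, c in [(3, 2), (3, 3)]:
        frame[r][c], mask_b[r][c] = 0.5, 1.0
    return frame, [mask_a, mask_b]


def encode_scene(gestalt_dim: int = 3) -> List[SlotCode]:
    frame, masks = toy_scene()
    encoder = SharedSlotEncoder(4, 4, gestalt_dim)  # same weights for all slots
    return [encoder.encode(frame, mask) for mask in masks]
```

Calling `encode_scene()` returns two `SlotCode` objects whose positions are `(0.0, 0.5)` and `(3.0, 2.5)`, each paired with its own three-dimensional gestalt vector computed from the shared weights.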