The Challenge of Appearance-Free Object Tracking with Feedforward Neural
Networks
- URL: http://arxiv.org/abs/2110.02772v1
- Date: Thu, 30 Sep 2021 17:58:53 GMT
- Title: The Challenge of Appearance-Free Object Tracking with Feedforward Neural
Networks
- Authors: Girik Malik, Drew Linsley, Thomas Serre, Ennio Mingolla
- Abstract summary: $\textit{PathTracker}$ tests the ability of observers to learn to track objects solely by their motion.
We find that standard 3D-convolutional deep network models struggle to solve this task.
Strategies for appearance-free object tracking from biological vision can inspire solutions.
- Score: 12.081808043723937
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Nearly all models for object tracking with artificial neural networks depend
on appearance features extracted from a "backbone" architecture, designed for
object recognition. Indeed, significant progress on object tracking has been
spurred by introducing backbones that are better able to discriminate objects
by their appearance. However, extensive neurophysiology and psychophysics
evidence suggests that biological visual systems track objects using both
appearance and motion features. Here, we introduce $\textit{PathTracker}$, a
visual challenge inspired by cognitive psychology, which tests the ability of
observers to learn to track objects solely by their motion. We find that
standard 3D-convolutional deep network models struggle to solve this task when
clutter is introduced into the generated scenes, or when objects travel long
distances. This challenge reveals that tracing the path of object motion is a
blind spot of feedforward neural networks. We expect that strategies for
appearance-free object tracking from biological vision can inspire solutions
to these failures of deep neural networks.
Related papers
- Tracking objects that change in appearance with phase synchrony [14.784044408031098]
We show that a novel deep learning circuit can learn to control attention to features separately from their location in the world through neural synchrony.
We compare object tracking in humans, the CV-RNN, and other deep neural networks (DNNs) using FeatureTracker: a large-scale challenge.
Our CV-RNN behaved similarly to humans on the challenge, providing a computational proof-of-concept for the role of phase synchronization.
arXiv Detail & Related papers (2024-10-02T23:30:05Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- BI AVAN: Brain inspired Adversarial Visual Attention Network [67.05560966998559]
We propose a brain-inspired adversarial visual attention network (BI-AVAN) to characterize human visual attention directly from functional brain activity.
Our model imitates the biased competition process between attention-related/neglected objects to identify and locate the visual objects in a movie frame the human brain focuses on in an unsupervised manner.
arXiv Detail & Related papers (2022-10-27T22:20:36Z)
- Adapting Brain-Like Neural Networks for Modeling Cortical Visual Prostheses [68.96380145211093]
Cortical prostheses are devices implanted in the visual cortex that attempt to restore lost vision by electrically stimulating neurons.
Currently, the vision provided by these devices is limited, and accurately predicting the visual percepts resulting from stimulation is an open challenge.
We propose to address this challenge by utilizing 'brain-like' convolutional neural networks (CNNs), which have emerged as promising models of the visual system.
arXiv Detail & Related papers (2022-09-27T17:33:19Z)
- Learning What and Where -- Unsupervised Disentangling Location and Identity Tracking [0.44040106718326594]
We introduce an unsupervised LOCation and Identity tracking system (Loci).
Inspired by the dorsal-ventral pathways in the brain, Loci tackles the what-and-where binding problem by means of a self-supervised segregation mechanism.
Loci may set the stage for deeper, explanation-oriented video processing.
arXiv Detail & Related papers (2022-05-26T13:30:14Z)
- The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields [61.664963331203666]
How humans perceive moving objects is a longstanding research question in computer vision.
One approach to the problem is to teach a deep network to model all of these effects.
We present a novel probabilistic model to estimate the camera's rotation given the motion field.
arXiv Detail & Related papers (2022-02-28T22:05:09Z)
- Capturing the objects of vision with neural networks [0.0]
Human visual perception carves a scene at its physical joints, decomposing the world into objects.
Deep neural network (DNN) models of visual object recognition, by contrast, remain largely tethered to the sensory input.
We review related work in both fields and examine how these fields can help each other.
arXiv Detail & Related papers (2021-09-07T21:49:53Z)
- Tracking Without Re-recognition in Humans and Machines [12.591847867999636]
We investigate if state-of-the-art deep neural networks for visual tracking are capable of the same.
We introduce PathTracker, a synthetic visual challenge that asks human observers and machines to track a target object.
We model circuit mechanisms in biological brains that are implicated in tracking objects based on motion cues.
arXiv Detail & Related papers (2021-05-27T17:56:37Z)
- Deep Spiking Convolutional Neural Network for Single Object Localization Based On Deep Continuous Local Learning [0.0]
We propose a deep convolutional spiking neural network for the localization of a single object in a grayscale image.
Results reported on Oxford-IIIT-Pet validate the exploitation of spiking neural networks with a supervised learning approach.
arXiv Detail & Related papers (2021-05-12T12:02:05Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- WW-Nets: Dual Neural Networks for Object Detection [48.67090730174743]
We propose a new deep convolutional neural network framework that uses object location knowledge implicit in network connection weights to guide selective attention in object detection tasks.
Our approach is called What-Where Nets (WW-Nets), and it is inspired by the structure of human visual pathways.
arXiv Detail & Related papers (2020-05-15T21:16:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.