Ventral-Dorsal Neural Networks: Object Detection via Selective Attention
- URL: http://arxiv.org/abs/2005.09727v1
- Date: Fri, 15 May 2020 23:57:36 GMT
- Title: Ventral-Dorsal Neural Networks: Object Detection via Selective Attention
- Authors: Mohammad K. Ebrahimpour, Jiayun Li, Yen-Yun Yu, Jackson L. Reese,
Azadeh Moghtaderi, Ming-Hsuan Yang, David C. Noelle
- Abstract summary: We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
- Score: 51.79577908317031
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Convolutional Neural Networks (CNNs) have been repeatedly proven to
perform well on image classification tasks. Object detection methods, however,
are still in need of significant improvements. In this paper, we propose a new
framework called Ventral-Dorsal Networks (VDNets) which is inspired by the
structure of the human visual system. Roughly, the visual input signal is
analyzed along two separate neural streams, one in the temporal lobe and the
other in the parietal lobe. The coarse functional distinction between these
streams is between object recognition -- the "what" of the signal -- and
extracting location-related information -- the "where" of the signal. The
ventral pathway from primary visual cortex, entering the temporal lobe, is
dominated by "what" information, while the dorsal pathway, into the parietal
lobe, is dominated by "where" information. Inspired by this structure, we
propose the integration of a "Ventral Network" and a "Dorsal Network", which
are complementary. Information about object identity can guide localization,
and location information can guide attention to relevant image regions,
improving object recognition. This new dual network framework sharpens the
focus of object detection. Our experimental results reveal that the proposed
method outperforms state-of-the-art object detection approaches on PASCAL VOC
2007 by 8% (mAP) and PASCAL VOC 2012 by 3% (mAP). Moreover, a comparison of
techniques on Yearbook images displays substantial qualitative and quantitative
benefits of VDNet.
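The abstract describes attention guiding detection toward relevant image regions. The sketch below illustrates that general idea only, under stated assumptions: it takes a saliency map as given (here a toy array standing in for whatever a "what" network might produce), thresholds it into a mask, and suppresses everything outside the mask before a detector would see the image. The function names, the `keep_fraction` threshold rule, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def attention_mask(saliency, keep_fraction=0.5):
    """Boolean mask of pixels whose saliency is at least keep_fraction of the peak.

    The thresholding rule is an illustrative assumption, not the paper's method.
    """
    saliency = np.asarray(saliency, dtype=float)
    peak = saliency.max()
    if peak <= 0:
        return np.zeros(saliency.shape, dtype=bool)
    return saliency >= keep_fraction * peak

def apply_attention(image, mask):
    """Zero out pixels outside the mask; image is H x W x C, mask is H x W."""
    return image * mask[..., None]

def attended_box(mask):
    """Tight box (row_min, col_min, row_max, col_max) around the mask, or None if empty."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

# Toy input: a uniform 6x6 RGB image and a saliency map with one salient rectangle.
image = np.ones((6, 6, 3))
saliency = np.zeros((6, 6))
saliency[2:4, 1:5] = 1.0

mask = attention_mask(saliency)
focused = apply_attention(image, mask)  # what a downstream detector would receive
box = attended_box(mask)
```

In this toy setup, a downstream "where" stage would only need to search the region inside `box` rather than the full image.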
Related papers
- Unleashing the Power of Depth and Pose Estimation Neural Networks by
Designing Compatible Endoscopic Images [12.412060445862842]
We conduct a detailed analysis of the properties of endoscopic images and improve the compatibility of images and neural networks.
First, we introduce the Mask Image Modelling (MIM) module, which inputs partial image information instead of complete image information.
Second, we propose a lightweight neural network to enhance the endoscopic images, to explicitly improve the compatibility between images and neural networks.
arXiv Detail & Related papers (2023-09-14T02:19:38Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization [24.286426387100423]
We propose a method that captures subtle changes by aggregating context-aware features from the most relevant image regions.
Our approach is inspired by recent advances in self-attention and graph neural networks (GNNs).
It outperforms the state-of-the-art approaches by a significant margin in recognition accuracy.
arXiv Detail & Related papers (2022-09-05T19:43:15Z)
- Saccade Mechanisms for Image Classification, Object Detection and Tracking [12.751552698602744]
We examine how the saccade mechanism from biological vision can be used to make deep neural networks more efficient for classification and object detection problems.
Our proposed approach is based on the ideas of attention-driven visual processing and saccades, miniature eye movements influenced by attention.
arXiv Detail & Related papers (2022-06-10T13:50:34Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for their functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- Network Comparison Study of Deep Activation Feature Discriminability with Novel Objects [0.5076419064097732]
State-of-the-art computer vision algorithms have incorporated Deep Neural Networks (DNN) in feature extracting roles, creating Deep Convolutional Activation Features (DeCAF).
This study analyzes the general discriminability of novel object visual appearances encoded into the DeCAF space of six of the leading visual recognition DNN architectures.
arXiv Detail & Related papers (2022-02-08T07:40:53Z)
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task along with weight finetuning benefits in both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- WW-Nets: Dual Neural Networks for Object Detection [48.67090730174743]
We propose a new deep convolutional neural network framework that uses object location knowledge implicit in network connection weights to guide selective attention in object detection tasks.
Our approach is called What-Where Nets (WW-Nets), and it is inspired by the structure of human visual pathways.
arXiv Detail & Related papers (2020-05-15T21:16:22Z)
- BiDet: An Efficient Binarized Object Detector [96.19708396510894]
We propose a binarized neural network learning method called BiDet for efficient object detection.
Our BiDet fully utilizes the representational capacity of the binary neural networks for object detection by redundancy removal.
Our method outperforms the state-of-the-art binary neural networks by a sizable margin.
arXiv Detail & Related papers (2020-03-09T08:16:16Z)