Saccade Mechanisms for Image Classification, Object Detection and
Tracking
- URL: http://arxiv.org/abs/2206.05102v1
- Date: Fri, 10 Jun 2022 13:50:34 GMT
- Title: Saccade Mechanisms for Image Classification, Object Detection and
Tracking
- Authors: Saurabh Farkya, Zachary Daniels, Aswin Nadamuni Raghavan, David Zhang,
and Michael Piacentino
- Abstract summary: We examine how the saccade mechanism from biological vision can be used to make deep neural networks more efficient for classification and object detection problems.
Our proposed approach is based on the ideas of attention-driven visual processing and saccades, miniature eye movements influenced by attention.
- Score: 12.751552698602744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We examine how the saccade mechanism from biological vision can be used to
make deep neural networks more efficient for classification and object
detection problems. Our proposed approach is based on the ideas of
attention-driven visual processing and saccades, miniature eye movements
influenced by attention. We conduct experiments by analyzing: i) the robustness
of different deep neural network (DNN) feature extractors to partially-sensed
images for image classification and object detection, and ii) the utility of
saccades in masking image patches for image classification and object tracking.
Experiments with convolutional nets (ResNet-18) and transformer-based models
(ViT, DETR, TransTrack) are conducted on several datasets (CIFAR-10, DAVSOD,
MSCOCO, and MOT17). Our experiments show that learning to mimic human saccades
enables intelligent data reduction when used in conjunction with state-of-the-art
DNNs for classification, detection, and tracking tasks. We observed minimal
drop in performance for the classification and detection tasks while only using
about 30% of the original sensor data. We discuss how the saccade mechanism
can inform hardware design via "in-pixel" processing.
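To make the idea of partially-sensed images concrete, here is a minimal sketch of patch masking that keeps roughly 30% of an image. It is not the authors' implementation: the function name `mask_patches`, the patch size, and the random saliency map are assumptions for illustration only. In the paper the choice of patches is learned from saccade behavior, which this sketch does not model.

```python
import numpy as np


def mask_patches(image, saliency, patch=4, keep_frac=0.3):
    """Zero out all but the most salient patches of an H x W x C image.

    `saliency` is a hypothetical H x W attention map; how it is produced
    is outside the scope of this sketch.
    """
    h, w = image.shape[:2]
    assert h % patch == 0 and w % patch == 0, "image must tile evenly into patches"
    gh, gw = h // patch, w // patch

    # Mean saliency of each patch, as a (gh, gw) grid.
    scores = saliency.reshape(gh, patch, gw, patch).mean(axis=(1, 3))

    # Keep the k highest-scoring patches (at least one).
    k = max(1, int(round(keep_frac * gh * gw)))
    top = np.argsort(scores.ravel())[::-1][:k]
    keep = np.zeros(gh * gw, dtype=bool)
    keep[top] = True
    keep = keep.reshape(gh, gw)

    # Expand the patch-level mask to pixel resolution and apply it.
    pixel_mask = np.repeat(np.repeat(keep, patch, axis=0), patch, axis=1)
    masked = image * pixel_mask[..., None]
    return masked, keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))   # CIFAR-10-sized dummy image
    saliency = rng.random((32, 32))   # stand-in for a learned attention map
    masked, keep = mask_patches(image, saliency)
    print(f"kept {keep.sum()} of {keep.size} patches ({keep.mean():.0%})")
```

The masked image could then be passed to a classifier such as ResNet-18 or ViT to measure how much accuracy drops relative to the full image, which is the kind of comparison the abstract describes.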
Related papers
- Visual Context-Aware Person Fall Detection [52.49277799455569]
We present a segmentation pipeline to semi-automatically separate individuals and objects in images.
Background objects such as beds, chairs, or wheelchairs can challenge fall detection systems, leading to false positive alarms.
We demonstrate that object-specific contextual transformations during training effectively mitigate this challenge.
arXiv Detail & Related papers (2024-04-11T19:06:36Z)
- Image complexity based fMRI-BOLD visual network categorization across
visual datasets using topological descriptors and deep-hybrid learning [3.522950356329991]
The aim of this study is to examine how network topology differs in response to distinct visual stimuli from visual datasets.
To achieve this, 0- and 1-dimensional persistence diagrams are computed for each visual network representing COCO, ImageNet, and SUN.
The extracted K-means cluster features are fed to a novel deep-hybrid model that yields accuracy in the range of 90%-95% in classifying these visual networks.
arXiv Detail & Related papers (2023-11-03T14:05:57Z)
- Forged Image Detection using SOTA Image Classification Deep Learning
Methods for Image Forensics with Error Level Analysis [2.719418335747252]
Image forensics is one of the major areas of computer vision application.
Forgery of images is a sub-category of image forensics and can be detected using Error Level Analysis.
We perform transfer learning with state-of-the-art image classification models on the CASIA ITDE v.2 dataset processed with error level analysis.
arXiv Detail & Related papers (2022-11-28T10:10:42Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- FuNNscope: Visual microscope for interactively exploring the loss
landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks.
We generalize observations on small neural networks to more complex systems.
An interactive dashboard opens up a number of possible application networks.
arXiv Detail & Related papers (2022-04-09T16:41:53Z)
- Vision Transformer with Convolutions Architecture Search [72.70461709267497]
We propose an architecture search method, Vision Transformer with Convolutions Architecture Search (VTCAS).
The high-performance backbone network searched by VTCAS introduces the desirable features of convolutional neural networks into the Transformer architecture.
It enhances the robustness of the neural network for object recognition, especially in the low illumination indoor scene.
arXiv Detail & Related papers (2022-03-20T02:59:51Z)
- Hybrid Optimized Deep Convolution Neural Network based Learning Model
for Object Detection [0.0]
Object identification is one of the most fundamental and difficult issues in computer vision.
In recent years, deep learning-based object detection techniques have grabbed the public's interest.
In this study, a unique deep learning classification technique is used to create an autonomous object detecting system.
The suggested framework has a detection accuracy of 0.9864, which is greater than current techniques.
arXiv Detail & Related papers (2022-03-02T04:39:37Z)
- Robust Region Feature Synthesizer for Zero-Shot Object Detection [87.79902339984142]
We build a novel zero-shot object detection framework that contains an Intra-class Semantic Diverging component and an Inter-class Structure Preserving component.
It is the first study to carry out zero-shot object detection in remote sensing imagery.
arXiv Detail & Related papers (2022-01-01T03:09:15Z)
- Distractor-Aware Neuron Intrinsic Learning for Generic 2D Medical Image
Classifications [30.62607811479386]
We observe that the convolutional neural networks (CNNs) are vulnerable to distractor interference.
In this paper, we explore distractors from the CNN feature space via proposing a neuron intrinsic learning method.
The proposed method performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-20T09:59:04Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.