Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks
- URL: http://arxiv.org/abs/2403.02909v1
- Date: Tue, 5 Mar 2024 12:18:12 GMT
- Title: Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks
- Authors: Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh
- Abstract summary: In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems.
Our innovative approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme, and a dedicated neural network architecture.
Our research underscores the potency of our neural network to work with temporally consecutive encoded images for precise gaze vector predictions in challenging low-light videos, contributing to the advancement of gaze prediction technologies.
- Score: 2.762909189433944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we address the intricate challenge of gaze vector prediction,
a pivotal task with applications ranging from human-computer interaction to
driver monitoring systems. Our innovative approach is designed for the
demanding setting of extremely low-light conditions, leveraging a novel
temporal event encoding scheme, and a dedicated neural network architecture.
The temporal encoding method seamlessly integrates Dynamic Vision Sensor (DVS)
events with grayscale guide frames, generating consecutively encoded images for
input into our neural network. This unique solution not only captures diverse
gaze responses from participants within the active age group but also
introduces a curated dataset tailored for low-light conditions. The encoded
temporal frames paired with our network showcase impressive spatial
localization and reliable gaze direction in their predictions. Achieving a
remarkable 100-pixel accuracy of 100%, our research underscores the potency of
our neural network to work with temporally consecutive encoded images for
precise gaze vector predictions in challenging low-light videos, contributing
to the advancement of gaze prediction technologies.
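The digest does not spell out the encoding itself, but a minimal sketch of the general idea, accumulating polarity-signed DVS events into temporal bins and stacking them with a grayscale guide frame, might look as follows (the array layout, bin count, and normalization are assumptions, not the authors' exact scheme):

```python
import numpy as np

def encode_events(events, guide, t0, t1, bins=3):
    """Accumulate DVS events into temporal bins and fuse them with a
    grayscale guide frame. `events` is an (N, 4) array of (t, x, y, polarity);
    `guide` is an (H, W) uint8 grayscale frame."""
    h, w = guide.shape
    encoded = np.zeros((bins, h, w), dtype=np.float32)
    # Keep only events inside the window [t0, t1) and bin them by time.
    ev = events[(events[:, 0] >= t0) & (events[:, 0] < t1)]
    bin_idx = ((ev[:, 0] - t0) / (t1 - t0) * bins).astype(int).clip(0, bins - 1)
    x, y = ev[:, 1].astype(int), ev[:, 2].astype(int)
    pol = np.where(ev[:, 3] > 0, 1.0, -1.0)
    np.add.at(encoded, (bin_idx, y, x), pol)       # signed event counts
    peak = np.abs(encoded).max()
    if peak > 0:
        encoded /= peak                            # normalize to [-1, 1]
    # The guide frame becomes an extra leading channel of the encoded image.
    return np.concatenate([guide[None].astype(np.float32) / 255.0, encoded])
```

Consecutive outputs of such an encoder, taken over sliding time windows, would form the "temporally consecutive encoded images" the network consumes.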
Related papers
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper introduces a new method for dynamic novel-view synthesis from monocular video, such as smartphone captures.
Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
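As a rough illustration of the hash-encoded feature grids mentioned above, here is a single-level spatial hash lookup in the style of Instant-NGP; the table size, feature width, and hashing primes are illustrative assumptions, not D-NPC's actual configuration:

```python
import numpy as np

# Illustrative primes for coordinate hashing (Instant-NGP style).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_lookup(table, xyz, resolution):
    """Map 3D point positions in [0, 1)^3 to rows of a feature table
    via a spatial hash over quantized voxel coordinates."""
    q = np.floor(xyz * resolution).astype(np.uint64)        # voxel ids (N, 3)
    idx = np.bitwise_xor.reduce(q * PRIMES, axis=1) % np.uint64(len(table))
    return table[idx]                                       # features (N, F)

table = np.zeros((2**14, 8), dtype=np.float32)   # learnable feature grid
points = np.random.rand(100, 3)                  # point-cloud positions
features = hash_lookup(table, points, resolution=64)
```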
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- EvGNN: An Event-driven Graph Neural Network Accelerator for Edge Vision [0.06752396542927405]
Event-driven graph neural networks (GNNs) have emerged as a promising solution for sparse event-based vision.
We propose EvGNN, the first event-driven GNN accelerator for low-footprint, ultra-low-latency, and high-accuracy edge vision.
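EvGNN itself is a hardware accelerator, but the input it processes is a spatio-temporal event graph; below is a sketch of the usual construction, connecting each event to recent events within a spatial neighborhood (the radius and time window are assumed parameters, not EvGNN's):

```python
import numpy as np

def event_edges(events, radius=3, t_window=10_000):
    """Build directed edges from past to present events. `events` is a
    time-sorted list of (t, x, y, polarity) tuples; each event connects to
    earlier events within `radius` pixels and `t_window` microseconds."""
    edges = []
    for i, (t, x, y, _) in enumerate(events):
        for j in range(i - 1, -1, -1):
            tj, xj, yj, _ = events[j]
            if t - tj > t_window:
                break                     # sorted by time: rest are older
            if abs(x - xj) <= radius and abs(y - yj) <= radius:
                edges.append((j, i))
    return np.array(edges)
```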
arXiv Detail & Related papers (2024-04-30T12:18:47Z)
- Finding Visual Saliency in Continuous Spike Stream [23.591309376586835]
In this paper, we investigate visual saliency in the continuous spike stream for the first time.
We propose a Recurrent Spiking Transformer framework, which is based on a full spiking neural network.
Our framework exhibits a substantial margin of improvement in highlighting and capturing visual saliency in the spike stream.
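The summary says the framework is a full spiking neural network; the elementary unit such networks are built from is the leaky integrate-and-fire neuron, sketched below (decay and threshold values are illustrative, and this is not the paper's Recurrent Spiking Transformer itself):

```python
import numpy as np

def lif_layer(inputs, decay=0.9, threshold=1.0):
    """Run a leaky integrate-and-fire layer over a spike stream.
    `inputs` has shape (T, N): T timesteps of input current per neuron."""
    v = np.zeros(inputs.shape[1])
    spikes = np.zeros_like(inputs)
    for t, x in enumerate(inputs):
        v = decay * v + x              # leaky integration of input current
        fired = v >= threshold
        spikes[t] = fired
        v[fired] = 0.0                 # hard reset after a spike
    return spikes
```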
arXiv Detail & Related papers (2024-03-10T15:15:35Z)
- Optical flow estimation from event-based cameras and spiking neural networks [0.4899818550820575]
Event-based sensors are an excellent fit for Spiking Neural Networks (SNNs).
We propose a U-Net-like SNN which, after supervised training, is able to make dense optical flow estimations.
Thanks to separable convolutions, we have been able to develop a light model that can nonetheless yield reasonably accurate optical flow estimates.
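The parameter savings from separable convolutions are easy to make concrete: a depthwise k x k convolution plus a 1 x 1 pointwise convolution replaces one standard convolution (bias terms omitted; the layer sizes below are arbitrary examples, not the paper's):

```python
def separable_conv_params(c_in, c_out, k):
    """Parameter counts for a standard vs. depthwise-separable conv."""
    standard = c_in * c_out * k * k
    separable = c_in * k * k + c_in * c_out   # depthwise + 1x1 pointwise
    return standard, separable

std, sep = separable_conv_params(64, 128, 3)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
# standard: 73728, separable: 8768, ratio: 8.4x
```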
arXiv Detail & Related papers (2023-02-13T16:17:54Z)
- Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images [79.07468367923619]
We develop convolutional neural generative coding (Conv-NGC).
We implement a flexible neurobiologically-motivated algorithm that progressively refines latent state maps.
We study the effectiveness of our brain-inspired neural system on the tasks of reconstruction and image denoising.
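"Progressively refines latent state maps" is the core predictive-coding loop: latents are nudged down the gradient of the prediction error until the generative prediction matches the input. A toy linear version follows; the actual Conv-NGC uses convolutional state maps, so this matrix form is an assumption for brevity:

```python
import numpy as np

def refine_latents(x, W, steps=20, lr=0.1):
    """Iteratively refine latent state z so the generative prediction W @ z
    matches input x -- the basic predictive-coding settling loop."""
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        error = x - W @ z          # prediction error signal
        z += lr * (W.T @ error)    # descend the squared-error gradient
    return z

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8)) * 0.1   # toy generative weights
x = rng.standard_normal(16)              # observation to explain
z = refine_latents(x, W)
```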
arXiv Detail & Related papers (2022-11-22T06:42:41Z) - Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for
Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
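A minimal sketch of the hybrid pattern: a spiking frontend integrates event frames, and a conventional (ANN) head reads out its spike rates. The rate-based interface and the single-layer sizes are assumptions, not the paper's architecture:

```python
import numpy as np

def hybrid_forward(event_frames, w_snn, w_ann, decay=0.8, thresh=1.0):
    """Spiking frontend integrates (T, D) event frames over time; the
    average spike rate is the dense feature passed to the ANN head."""
    v = np.zeros(w_snn.shape[1])
    rate = np.zeros_like(v)
    for x in event_frames:
        v = decay * v + x @ w_snn       # integrate weighted event input
        s = (v >= thresh).astype(float)
        v[s > 0] = 0.0                  # reset fired neurons
        rate += s
    rate /= len(event_frames)           # spike rates bridge SNN -> ANN
    return rate @ w_ann                 # ANN head produces class logits
```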
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
- Wide and Narrow: Video Prediction from Context and Motion [54.21624227408727]
We propose a new framework that integrates these complementary attributes, global context and local motion, to predict complex pixel dynamics through deep networks.
We present global context propagation networks that aggregate the non-local neighboring representations to preserve the contextual information over the past frames.
We also devise local filter memory networks that generate adaptive filter kernels by storing the motion of moving objects in the memory.
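The "adaptive filter kernels" idea amounts to dynamic local filtering: every pixel is filtered with its own predicted kernel rather than a shared one. A sketch of the application step follows; in the paper the kernels come from the filter memory network, whereas here their source is left abstract:

```python
import numpy as np

def apply_adaptive_kernels(frame, kernels, k=3):
    """Filter each pixel of an (H, W) frame with its own k*k kernel,
    given per-pixel kernels of shape (H, W, k*k)."""
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    h, w = frame.shape
    out = np.zeros_like(frame, dtype=np.float32)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + k, j:j + k].ravel()
            out[i, j] = patch @ kernels[i, j]   # pixel-specific filtering
    return out
```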
arXiv Detail & Related papers (2021-10-22T04:35:58Z)
- Adversarial Attacks on Spiking Convolutional Networks for Event-based Vision [0.6999740786886537]
We show how white-box adversarial attack algorithms can be adapted to the discrete and sparse nature of event-based visual data.
We also verify, for the first time, the effectiveness of these perturbations directly on neuromorphic hardware.
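One way to adapt a white-box attack to discrete, sparse event tensors, offered as a hedged sketch rather than the paper's exact algorithm, is to replace continuous perturbations with a budgeted set of event flips ranked by the loss gradient:

```python
import numpy as np

def event_flip_attack(events, grad, budget=50):
    """Flip the `budget` entries of a binary event tensor whose loss
    gradient most favors a flip: toward 1 where grad > 0, toward 0
    where grad < 0. Keeps the perturbation discrete and sparse."""
    # Expected loss gain from flipping each entry away from its value.
    gain = np.where(events > 0, -grad, grad)
    flat = np.argsort(gain.ravel())[::-1][:budget]   # best flips first
    adv = events.copy()
    idx = np.unravel_index(flat, events.shape)
    adv[idx] = 1 - adv[idx]                          # stays binary
    return adv
```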
arXiv Detail & Related papers (2021-10-06T17:20:05Z)
- Multivariate Time Series Classification Using Spiking Neural Networks [7.273181759304122]
Spiking neural networks have drawn attention as they enable low power consumption.
We present an encoding scheme to convert time series into sparse spatio-temporal spike patterns.
A training algorithm to classify spatio-temporal patterns is also proposed.
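A common way to realize such an encoding is delta modulation: each channel emits an "up" or "down" spike whenever it drifts past a threshold since its last spike. The sketch below is this generic scheme, not necessarily the paper's exact one:

```python
import numpy as np

def delta_spike_encode(series, threshold=0.5):
    """Convert a (T, C) multivariate series into sparse up/down spike
    trains via per-channel delta modulation."""
    t_len, channels = series.shape
    up = np.zeros((t_len, channels), dtype=np.uint8)
    down = np.zeros((t_len, channels), dtype=np.uint8)
    ref = series[0].copy()                 # per-channel reference level
    for t in range(1, t_len):
        delta = series[t] - ref
        up[t] = delta > threshold
        down[t] = delta < -threshold
        moved = up[t].astype(bool) | down[t].astype(bool)
        ref[moved] = series[t][moved]      # reset reference on spike
    return up, down
```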
arXiv Detail & Related papers (2020-07-07T15:24:01Z)
- Event-based Asynchronous Sparse Convolutional Networks [54.094244806123235]
Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events".
We present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output.
We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks.
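The asynchronous principle can be sketched in a few lines: when a single input pixel changes, only the conv outputs whose receptive fields contain it need recomputation, and the rest stay cached. This toy version assumes a single channel, "same" zero padding, and cross-correlation:

```python
import numpy as np

def async_conv_update(inp, cache, kernel, x, y, new_val):
    """Apply one pixel change at (y, x) and refresh only the affected
    outputs of a 'same'-padded conv, leaving the cached rest intact."""
    k = kernel.shape[0]
    pad = k // 2
    inp[y, x] = new_val
    padded = np.pad(inp, pad)
    h, w = inp.shape
    # Only outputs within `pad` of (y, x) see the changed pixel.
    for i in range(max(0, y - pad), min(h, y + pad + 1)):
        for j in range(max(0, x - pad), min(w, x + pad + 1)):
            cache[i, j] = (padded[i:i + k, j:j + k] * kernel).sum()
    return cache
```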
arXiv Detail & Related papers (2020-03-20T08:39:49Z)
- TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition [93.0013343535411]
This study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition.
We show that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.
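The time-windowed design boils down to giving a 2D CNN a short stack of consecutive frames as input channels instead of a single frame; a sketch with assumed window and stride values:

```python
import numpy as np

def time_windows(frames, window=5, stride=1):
    """Stack each sliding window of grayscale frames (T, H, W) into a
    multi-channel input (N, window, H, W) so a 2D CNN sees short-term
    dynamics rather than isolated frames."""
    t = frames.shape[0]
    starts = range(0, t - window + 1, stride)
    return np.stack([frames[s:s + window] for s in starts])
```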
arXiv Detail & Related papers (2020-03-03T20:58:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.