Event Based, Near Eye Gaze Tracking Beyond 10,000Hz
- URL: http://arxiv.org/abs/2004.03577v3
- Date: Mon, 8 Aug 2022 16:52:15 GMT
- Title: Event Based, Near Eye Gaze Tracking Beyond 10,000Hz
- Authors: Anastasios N. Angelopoulos, Julien N.P. Martel, Amit P.S. Kohli, Jörg Conradt, Gordon Wetzstein
- Abstract summary: We propose a hybrid frame-event-based near-eye gaze tracking system with update rates beyond 10,000 Hz.
Our system builds on emerging event cameras that simultaneously acquire regularly sampled frames and adaptively sampled events.
We hope to enable a new generation of ultra-low-latency gaze-contingent rendering and display techniques for virtual and augmented reality.
- Score: 41.23347304960948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cameras in modern gaze-tracking systems suffer from fundamental bandwidth
and power limitations, constraining data acquisition speed to 300 Hz
realistically. This obstructs the use of mobile eye trackers to perform, e.g.,
low latency predictive rendering, or to study quick and subtle eye motions like
microsaccades using head-mounted devices in the wild. Here, we propose a hybrid
frame-event-based near-eye gaze tracking system offering update rates beyond
10,000 Hz with an accuracy that matches that of high-end desktop-mounted
commercial trackers when evaluated in the same conditions. Our system builds on
emerging event cameras that simultaneously acquire regularly sampled frames and
adaptively sampled events. We develop an online 2D pupil fitting method that
updates a parametric model every one or few events. Moreover, we propose a
polynomial regressor for estimating the point of gaze from the parametric pupil
model in real time. Using the first event-based gaze dataset, available at
https://github.com/aangelopoulos/event_based_gaze_tracking, we demonstrate
that our system achieves accuracies of 0.45–1.75 degrees for fields of
view from 45 degrees to 98 degrees. With this technology, we hope to enable a
new generation of ultra-low-latency gaze-contingent rendering and display
techniques for virtual and augmented reality.
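
The abstract names two components: an online 2D pupil fit that is refreshed with every event (or small batch of events), and a polynomial regressor mapping the fitted pupil parameters to a point of gaze. The sketch below is a rough illustration of that pipeline only; the parametrization, update rule, polynomial degree, and all names here (`OnlinePupilModel`, `fit_gaze_regressor`, `predict_gaze`) are assumptions made for illustration, not the paper's actual implementation (the released code and dataset live at the GitHub link above).

```python
import numpy as np

class OnlinePupilModel:
    """Toy running estimate of the 2D pupil centre, nudged by each event."""

    def __init__(self, init_xy=(0.0, 0.0), gain=0.05):
        self.centre = np.asarray(init_xy, dtype=float)
        self.gain = gain  # how strongly a single event pulls the estimate

    def update(self, event_xy):
        # A real system would maintain a full parametric pupil fit (e.g., an
        # ellipse) from both frames and events; here each event's pixel
        # location simply nudges a running centre estimate.
        self.centre = self.centre + self.gain * (np.asarray(event_xy, dtype=float) - self.centre)
        return self.centre


def fit_gaze_regressor(pupil_xy, gaze_deg):
    """Least-squares second-order polynomial map: pupil (x, y) -> gaze angles."""
    x, y = pupil_xy[:, 0], pupil_xy[:, 1]
    A = np.stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, gaze_deg, rcond=None)
    return coeffs  # shape (6, 2): one column per gaze angle (e.g., yaw, pitch)


def predict_gaze(coeffs, pupil_xy):
    """Map a single pupil-centre estimate to a gaze direction in degrees."""
    x, y = pupil_xy
    feats = np.array([1.0, x, y, x * y, x ** 2, y ** 2])
    return feats @ coeffs
```

In practice one would expect the regressor to be calibrated per user from a short fixation sequence and the event-driven fit to be periodically re-anchored using the regularly sampled frames; those details are not covered in this summary.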
Related papers
- EyeTrAES: Fine-grained, Low-Latency Eye Tracking via Adaptive Event Slicing [2.9795443606634917]
EyeTrAES is a novel approach that uses neuromorphic event cameras and adaptive event slicing for high-fidelity tracking of natural pupillary movement; a rough sketch of the slicing idea appears after this list.
We show that EyeTrAES boosts pupil tracking fidelity by more than 6%, achieving IoU=92%, while incurring at least 3x lower latency than competing purely event-based eye-tracking alternatives.
For robust user authentication, we train a lightweight per-user Random Forest classifier using a novel feature vector of short-term pupillary kinematics.
arXiv Detail & Related papers (2024-09-27T15:06:05Z)
- On the Generation of a Synthetic Event-Based Vision Dataset for Navigation and Landing [69.34740063574921]
This paper presents a methodology for generating event-based vision datasets from optimal landing trajectories.
We construct sequences of photorealistic images of the lunar surface with the Planet and Asteroid Natural Scene Generation Utility.
We demonstrate that the pipeline can generate realistic event-based representations of surface features by constructing a dataset of 500 trajectories.
arXiv Detail & Related papers (2023-08-01T09:14:20Z)
- EV-Catcher: High-Speed Object Catching Using Low-latency Event-based Neural Networks [107.62975594230687]
We demonstrate an application where event cameras excel: accurately estimating the impact location of fast-moving objects.
We introduce a lightweight event representation called Binary Event History Image (BEHI) to encode event data at low latency.
We show that the system achieves a success rate of 81% in catching balls aimed at different locations at velocities of up to 13 m/s, even on compute-constrained embedded platforms.
arXiv Detail & Related papers (2023-04-14T15:23:28Z)
- A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices [3.4836209951879957]
We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices.
Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side.
Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
arXiv Detail & Related papers (2022-10-22T15:26:50Z)
- Event-Based high-speed low-latency fiducial marker tracking [15.052022635853799]
We propose an end-to-end pipeline for real-time, low-latency, 6-degrees-of-freedom pose estimation of fiducial markers.
We employ the high-speed capabilities of event-based sensors to directly refine the spatial transformation.
This approach achieves pose estimation at rates of up to 156 kHz while relying only on CPU resources.
arXiv Detail & Related papers (2021-10-12T08:34:31Z)
- TUM-VIE: The TUM Stereo Visual-Inertial Event Dataset [50.8779574716494]
Event cameras are bio-inspired vision sensors which measure per pixel brightness changes.
They offer numerous benefits over traditional, frame-based cameras, including low latency, high dynamic range, high temporal resolution and low power consumption.
To foster the development of 3D perception and navigation algorithms with event cameras, we present the TUM-VIE dataset.
arXiv Detail & Related papers (2021-08-16T19:53:56Z)
- Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras output brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z)
- Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye-gaze from images alone is a challenging task due to unobservable person-specific factors.
We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships.
We demonstrate that fusing information from visual stimuli and eye images can achieve performance similar to figures reported in the literature.
arXiv Detail & Related papers (2020-07-26T12:39:15Z)
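
The EyeTrAES entry above refers to adaptive event slicing. The snippet below is a minimal sketch of that idea under one assumption: slices adapt by holding the event count (rather than the time window) fixed, so slice duration shrinks during rapid pupil motion. The actual slicing rule and the names used here (`adaptive_slices`, `events_per_slice`) are illustrative and not taken from that paper.

```python
import numpy as np

def adaptive_slices(events, events_per_slice=500):
    """Split an (N, 4) event array [x, y, t, polarity] into variable-duration slices.

    Holding the event count per slice fixed makes the slice *duration* adapt
    to the event rate: fast pupil motion yields short, dense slices, while
    fixations yield long, sparse ones.
    """
    for start in range(0, len(events), events_per_slice):
        yield events[start:start + events_per_slice]

# Example: slice durations for a synthetic event stream.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.sort(rng.uniform(0.0, 1.0, size=5_000))       # timestamps in seconds
    xy = rng.integers(0, 240, size=(5_000, 2))            # pixel coordinates
    pol = rng.integers(0, 2, size=(5_000, 1))             # event polarity
    events = np.hstack([xy, t[:, None], pol])
    for s in adaptive_slices(events, events_per_slice=1000):
        print(f"slice of {len(s)} events spanning {s[-1, 2] - s[0, 2]:.3f} s")
```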