Exploring Temporal Dynamics in Event-based Eye Tracker
- URL: http://arxiv.org/abs/2503.23725v1
- Date: Mon, 31 Mar 2025 04:57:13 GMT
- Title: Exploring Temporal Dynamics in Event-based Eye Tracker
- Authors: Hongwei Ren, Xiaopeng Lin, Hongxiang Huang, Yue Zhou, Bojun Cheng,
- Abstract summary: Eye-tracking is a vital technology for human-computer interaction, especially in wearable devices such as AR, VR, and XR. The realization of high-speed and high-precision eye-tracking using frame-based image sensors is constrained by their limited temporal resolution. We propose TDTracker, an effective eye-tracking framework that captures rapid eye movements by thoroughly modeling temporal dynamics.
- Score: 3.3325719644030016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Eye-tracking is a vital technology for human-computer interaction, especially in wearable devices such as AR, VR, and XR. The realization of high-speed and high-precision eye-tracking using frame-based image sensors is constrained by their limited temporal resolution, which impairs the accurate capture of rapid ocular dynamics, such as saccades and blinks. Event cameras, inspired by biological vision systems, are capable of perceiving eye movements with extremely low power consumption and ultra-high temporal resolution. This makes them a promising solution for achieving high-speed, high-precision tracking with rich temporal dynamics. In this paper, we propose TDTracker, an effective eye-tracking framework that captures rapid eye movements by thoroughly modeling temporal dynamics from both implicit and explicit perspectives. TDTracker utilizes 3D convolutional neural networks to capture implicit short-term temporal dynamics and employs a cascaded structure consisting of a Frequency-aware Module, GRU, and Mamba to extract explicit long-term temporal dynamics. Ultimately, a prediction heatmap is used for eye coordinate regression. Experimental results demonstrate that TDTracker achieves state-of-the-art (SOTA) performance on the synthetic SEET dataset and secured Third place in the CVPR event-based eye-tracking challenge 2025. Our code is available at https://github.com/rhwxmx/TDTracker.
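The abstract states that a prediction heatmap is used for eye coordinate regression but does not detail the decoding step. A common choice for heatmap-based coordinate regression is a differentiable soft-argmax; the sketch below is an illustrative assumption, not TDTracker's published decoder (function name and temperature parameter are hypothetical):

```python
import numpy as np

def soft_argmax_2d(heatmap, temperature=1.0):
    """Decode an (x, y) coordinate from a heatmap via a soft-argmax:
    a softmax over all pixels, then a probability-weighted mean of
    the pixel coordinates. Lower temperature sharpens the peak."""
    h, w = heatmap.shape
    logits = heatmap.flatten() / temperature
    probs = np.exp(logits - logits.max())  # shift for numerical stability
    probs /= probs.sum()
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = float((probs * xs.flatten()).sum())
    y = float((probs * ys.flatten()).sum())
    return x, y

# A heatmap sharply peaked at column 12, row 5 decodes near that point.
hm = np.zeros((32, 32))
hm[5, 12] = 10.0
x, y = soft_argmax_2d(hm, temperature=0.1)  # approximately (12.0, 5.0)
```

Unlike a hard argmax, this decoding is differentiable, so the coordinate loss can be backpropagated through the heatmap during training.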
Related papers
- LiteTracker: Leveraging Temporal Causality for Accurate Low-latency Tissue Tracking [84.52765560227917]
LiteTracker is a low-latency method for tissue tracking in endoscopic video streams.
LiteTracker builds on a state-of-the-art long-term point tracking method, and introduces a set of training-free runtime optimizations.
arXiv Detail & Related papers (2025-04-14T05:53:57Z)
- TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking [10.458676835674847]
Event-based cameras offer a biologically-inspired solution by capturing only changes in intensity levels at exceptionally high temporal resolution and low power consumption.
We propose TOFFE, a lightweight hybrid framework for performing event-based object motion estimation.
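Temporal binning, as in TOFFE's name, groups an asynchronous event stream into a fixed number of time slices. A minimal sketch, assuming events arrive as (x, y, t, polarity) tuples (TOFFE's exact binning scheme may differ):

```python
import numpy as np

def bin_events(events, n_bins, height, width):
    """Accumulate events (x, y, t, polarity) into n_bins polarity-signed
    frames: +1 per ON event, -1 per OFF event at the event's pixel."""
    ev = np.asarray(events, dtype=float)
    t = ev[:, 2]
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    # Map each timestamp to a bin index in [0, n_bins - 1].
    bins = np.minimum(((t - t.min()) / span * n_bins).astype(int), n_bins - 1)
    frames = np.zeros((n_bins, height, width))
    for (x, y, _, p), b in zip(ev, bins):
        frames[b, int(y), int(x)] += 1.0 if p > 0 else -1.0
    return frames

# Three events over a 1-second window, split into two bins.
frames = bin_events([(0, 0, 0.0, 1), (1, 1, 0.5, 1), (2, 2, 1.0, 0)],
                    n_bins=2, height=4, width=4)
```

The resulting (n_bins, H, W) tensor is a dense representation that standard convolutional backbones can consume while retaining coarse temporal order.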
arXiv Detail & Related papers (2025-01-21T20:20:34Z)
- EyeTrAES: Fine-grained, Low-Latency Eye Tracking via Adaptive Event Slicing [2.9795443606634917]
EyeTrAES is a novel approach using neuromorphic event cameras for high-fidelity tracking of natural pupillary movement.
We show that EyeTrAES boosts pupil tracking fidelity by 6+%, achieving IoU=92%, while incurring at least 3x lower latency than competing pure event-based eye tracking alternatives.
For robust user authentication, we train a lightweight per-user Random Forest classifier using a novel feature vector of short-term pupillary kinematics.
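A short-term pupillary kinematics feature vector could, for instance, summarize velocity and acceleration statistics of the tracked pupil centre over a sliding window. The feature set below is a hypothetical sketch for illustration, not EyeTrAES's published feature vector:

```python
import numpy as np

def kinematics_features(centers, dt):
    """Summarize short-term pupil-centre kinematics over a window.
    centers: sequence of (x, y) pupil centres sampled every dt seconds.
    Returns [mean speed, max speed, speed std, mean |accel|, max |accel|]."""
    centers = np.asarray(centers, dtype=float)   # shape (T, 2)
    vel = np.diff(centers, axis=0) / dt          # shape (T-1, 2)
    speed = np.linalg.norm(vel, axis=1)
    acc = np.diff(vel, axis=0) / dt              # shape (T-2, 2)
    acc_mag = np.linalg.norm(acc, axis=1)
    return np.array([speed.mean(), speed.max(), speed.std(),
                     acc_mag.mean(), acc_mag.max()])

# Constant-velocity motion: steady speed, zero acceleration.
feats = kinematics_features([(0, 0), (1, 0), (2, 0), (3, 0)], dt=0.1)
```

Such a fixed-length vector is exactly the kind of input a lightweight per-user Random Forest classifier can be trained on.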
arXiv Detail & Related papers (2024-09-27T15:06:05Z)
- FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker [0.6554326244334868]
Event cameras are an alternative to traditional cameras in the realm of eye tracking.
Existing event-based eye tracking networks neglect the pivotal sparse and fine-grained temporal information in events.
In this paper, we utilize Point Cloud as the event representation to harness the high temporal resolution and sparse characteristics of events in eye tracking tasks.
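Treating events as a point cloud typically means keeping each event as an (x, y, t) point, normalizing time within the window, and sampling a fixed number of points. A minimal sketch under those assumptions (FAPNet's exact preprocessing may differ):

```python
import numpy as np

def events_to_point_cloud(events, n_points=512, rng=None):
    """Sample a fixed-size point cloud from raw events (x, y, t),
    normalizing timestamps so each window spans [0, 1]."""
    rng = np.random.default_rng(rng)
    ev = np.asarray(events, dtype=float)
    t = ev[:, 2]
    ev[:, 2] = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    # Subsample (or oversample with replacement when the window is short)
    # so the network always sees the same input shape.
    idx = rng.choice(len(ev), size=n_points, replace=len(ev) < n_points)
    return ev[idx]

# Eight raw events reduced to a 4-point cloud with times in [0, 1].
pc = events_to_point_cloud([(i, i, float(i)) for i in range(8)],
                           n_points=4, rng=0)
```

This preserves the per-event timestamps that frame-based representations discard, which is the sparse, fine-grained temporal information the entry above refers to.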
arXiv Detail & Related papers (2024-06-05T12:08:01Z)
- Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
- PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving.
As tracking accuracy improves, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency.
In this paper, we explore the use of the neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Recurrent Vision Transformers for Object Detection with Event Cameras [62.27246562304705]
We present Recurrent Vision Transformers (RVTs), a novel backbone for object detection with event cameras.
RVTs can be trained from scratch to reach state-of-the-art performance on event-based object detection.
Our study brings new insights into effective design choices that can be fruitful for research beyond event-based vision.
arXiv Detail & Related papers (2022-12-11T20:28:59Z)
- PUCK: Parallel Surface and Convolution-kernel Tracking for Event-Based Cameras [4.110120522045467]
Event-cameras can guarantee fast visual sensing in dynamic environments, but require a tracking algorithm that can keep up with the high data rate induced by the robot ego-motion.
We introduce a novel tracking method that leverages the Exponential Reduced Ordinal Surface (EROS) data representation to decouple event-by-event processing and tracking.
We propose the task of tracking the air hockey puck sliding on a surface, with the future aim of controlling the iCub robot to reach the target precisely and on time.
arXiv Detail & Related papers (2022-05-16T13:23:52Z)
- Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy [63.91005999481061]
A practical long-term tracker typically contains three key properties, i.e. an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism.
We propose a two-task tracking framework (named DMTrack) to achieve distractor-aware fast tracking via dynamic convolutions (d-convs) and multiple object tracking (MOT) philosophy.
Our tracker achieves state-of-the-art performance on the LaSOT, OxUvA, TLP, VOT2018LT and VOT2019LT benchmarks and runs in real-time (3x faster).
arXiv Detail & Related papers (2021-04-25T00:59:53Z)
- Real-Time Face & Eye Tracking and Blink Detection using Event Cameras [3.842206880015537]
Event cameras contain emerging, neuromorphic vision sensors that capture local light intensity changes at each pixel, generating a stream of asynchronous events.
Driver monitoring systems (DMS) are in-cabin safety systems designed to sense and understand a driver's physical and cognitive state.
This paper proposes a novel method to simultaneously detect and track faces and eyes for driver monitoring.
arXiv Detail & Related papers (2020-10-16T10:02:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.