JaneEye: A 12-nm 2K-FPS 18.9-$\mu$J/Frame Event-based Eye Tracking Accelerator
- URL: http://arxiv.org/abs/2510.01213v2
- Date: Thu, 06 Nov 2025 16:00:01 GMT
- Title: JaneEye: A 12-nm 2K-FPS 18.9-$\mu$J/Frame Event-based Eye Tracking Accelerator
- Authors: Tao Han, Ang Li, Qinyu Chen, Chang Gao
- Abstract summary: We present JaneEye, an energy-efficient event-based eye-tracking hardware accelerator for wearable devices. Our proposed model achieves high accuracy with a pixel error of 2.45 on the 3ET+ dataset, using only 17.6K parameters, at event frame rates of up to 1250 Hz. Our 12-nm ASIC implementation operates at 400 MHz, delivering an end-to-end latency of 0.5 ms at an energy efficiency of 18.9 $\mu$J/frame.
- Score: 12.18562859126189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Eye tracking has become a key technology for gaze-based interactions in Extended Reality (XR). However, conventional frame-based eye-tracking systems often fall short of XR's stringent requirements for high accuracy, low latency, and energy efficiency. Event cameras present a compelling alternative, offering ultra-high temporal resolution and low power consumption. In this paper, we present JaneEye, an energy-efficient event-based eye-tracking hardware accelerator designed specifically for wearable devices, leveraging sparse, high-temporal-resolution event data. We introduce an ultra-lightweight neural network architecture featuring a novel ConvJANET layer, which simplifies the traditional ConvLSTM by retaining only the forget gate, thereby halving computational complexity without sacrificing temporal modeling capability. Our proposed model achieves high accuracy with a pixel error of 2.45 on the 3ET+ dataset, using only 17.6K parameters, with up to 1250 Hz event frame rate. To further enhance hardware efficiency, we employ custom linear approximations of activation functions (hardsigmoid and hardtanh) and fixed-point quantization. Through software-hardware co-design, our 12-nm ASIC implementation operates at 400 MHz, delivering an end-to-end latency of 0.5 ms (equivalent to 2000 Frames Per Second (FPS)) at an energy efficiency of 18.9 $\mu$J/frame. JaneEye sets a new benchmark in low-power, high-performance eye-tracking solutions suitable for integration into next-generation XR wearables.
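The ConvJANET update is compact enough to sketch directly. Below is a minimal PyTorch sketch of a ConvJANET cell, assuming the standard JANET formulation (h_t = f * h_{t-1} + (1 - f) * c, with only a forget gate) realized with convolutional gates, together with the piecewise-linear hardsigmoid/hardtanh activations the abstract mentions; the channel counts, kernel size, and activation coefficients are illustrative assumptions, not the paper's exact design.

```python
# Minimal ConvJANET cell sketch (PyTorch). Assumption: the standard JANET
# recurrence h_t = f * h_{t-1} + (1 - f) * c with convolutional gates.
# Kernel size, channel counts, and hard-activation coefficients are
# illustrative, not taken from the paper.
import torch
import torch.nn as nn


def hard_sigmoid(x: torch.Tensor) -> torch.Tensor:
    # Piecewise-linear sigmoid approximation: clip(x/4 + 1/2, 0, 1).
    return torch.clamp(0.25 * x + 0.5, 0.0, 1.0)


def hard_tanh(x: torch.Tensor) -> torch.Tensor:
    # Piecewise-linear tanh approximation: clip(x, -1, 1).
    return torch.clamp(x, -1.0, 1.0)


class ConvJANETCell(nn.Module):
    """ConvLSTM-style cell keeping only the forget gate (JANET), so each
    step computes 2 gate convolutions instead of ConvLSTM's 4."""

    def __init__(self, in_ch: int, hidden_ch: int, kernel_size: int = 3):
        super().__init__()
        # One conv yields both the forget-gate and candidate pre-activations
        # from the concatenated input and previous hidden state.
        self.gates = nn.Conv2d(in_ch + hidden_ch, 2 * hidden_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        f_pre, c_pre = self.gates(torch.cat([x, h_prev], dim=1)).chunk(2, dim=1)
        f = hard_sigmoid(f_pre)            # forget gate in [0, 1]
        c = hard_tanh(c_pre)               # candidate state in [-1, 1]
        return f * h_prev + (1.0 - f) * c  # the only recurrence


# Usage: accumulate hidden state over a sequence of event frames.
cell = ConvJANETCell(in_ch=2, hidden_ch=8)    # e.g. 2 event polarities
h = torch.zeros(1, 8, 64, 64)
for frame in torch.randn(16, 1, 2, 64, 64):   # 16 synthetic event frames
    h = cell(frame, h)
```

As a sanity check on the headline numbers: a 0.5 ms end-to-end latency corresponds to 1/0.0005 s = 2000 FPS, and 18.9 $\mu$J/frame at that rate works out to roughly 37.8 mW of processing power.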
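The abstract also credits fixed-point quantization for hardware efficiency. The snippet below is a generic symmetric fake-quantization helper for simulating a signed fixed-point format during evaluation; the Q2.6 bit-width is a placeholder assumption, not the paper's actual format.

```python
import torch

def quantize_fixed_point(x: torch.Tensor, int_bits: int = 2,
                         frac_bits: int = 6) -> torch.Tensor:
    # Simulate a signed fixed-point format with `int_bits` integer bits,
    # `frac_bits` fractional bits, and one sign bit (Q2.6 here is only a
    # placeholder; the paper's bit-widths may differ).
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** int_bits)             # most negative representable value
    hi = 2.0 ** int_bits - 1.0 / scale  # most positive representable value
    return torch.clamp(torch.round(x * scale) / scale, lo, hi)

# Example: values rounded to 1/64 steps and clipped to [-4, 4 - 1/64].
w = torch.randn(8) * 3.0
print(quantize_fixed_point(w))
```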
Related papers
- Neuromorphic Eye Tracking for Low-Latency Pupil Detection [37.091037454305024]
Eye tracking for wearable systems demands low latency and milliwatt-level power.
Neuromorphic sensors and spiking neural networks (SNNs) offer a promising alternative.
This paper presents a neuromorphic version of top-performing event-based eye-tracking models.
arXiv Detail & Related papers (2025-12-10T11:30:21Z)
- Realizing Fully-Integrated, Low-Power, Event-Based Pupil Tracking with Neuromorphic Hardware [2.2940141855172036]
We present the first battery-powered, wearable pupil-center-tracking system with complete on-device integration.
Our solution features a novel uncertainty-quantifying spiking neural network with gated temporal decoding, optimized for strict memory and bandwidth constraints.
Our work demonstrates that end-to-end neuromorphic computing enables practical, always-on eye tracking for next-generation energy-efficient wearable systems.
arXiv Detail & Related papers (2025-11-25T10:58:23Z)
- HOMI: Ultra-Fast EdgeAI platform for Event Cameras [1.9923531555025618]
Event cameras offer significant advantages for edge robotics applications due to their asynchronous operation and sparse, event-driven output.
We present an ultra-low-latency, end-to-end edge AI platform comprising a Prophesee IMX636 event sensor chip with a Xilinx Zynq UltraScale+ MPSoC FPGA chip.
arXiv Detail & Related papers (2025-08-18T05:47:48Z)
- EA: An Event Autoencoder for High-Speed Vision Sensing [0.9401004127785267]
Event cameras offer a promising alternative but pose challenges in object detection due to sparse and noisy event streams.
We propose an event autoencoder architecture that efficiently compresses and reconstructs event data.
We show that our approach achieves comparable accuracy to the YOLO-v4 model while utilizing up to $35.5\times$ fewer parameters.
arXiv Detail & Related papers (2025-07-09T00:21:15Z)
- Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers [56.37495946212932]
Vision transformers (ViTs) have demonstrated their superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs).
This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs.
arXiv Detail & Related papers (2024-07-25T16:35:46Z)
- FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker [0.6554326244334868]
Event cameras are an alternative to traditional cameras in the realm of eye tracking.
Existing event-based eye tracking networks neglect the pivotal sparse and fine-grained temporal information in events.
In this paper, we utilize Point Cloud as the event representation to harness the high temporal resolution and sparse characteristics of events in eye tracking tasks.
arXiv Detail & Related papers (2024-06-05T12:08:01Z)
- Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices [90.30316433184414]
We propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on HD video streams.
Compared to the state-of-the-art MOT baseline, our tri-design approach can achieve 12.5x latency reduction, 20.9x effective frame rate improvement, 5.83x lower power, and 9.78x better energy efficiency, without much accuracy drop.
arXiv Detail & Related papers (2022-10-16T16:21:40Z)
- LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Latency Graph Neural Network (LL-GNN) designs for particle detectors.
The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z)
- EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction [67.11722682878722]
This work presents EfficientViT, a new family of high-resolution vision models with novel multi-scale linear attention.
Our multi-scale linear attention achieves a global receptive field and multi-scale learning.
EfficientViT delivers remarkable performance gains over previous state-of-the-art models.
arXiv Detail & Related papers (2022-05-29T20:07:23Z)
- Projected GANs Converge Faster [50.23237734403834]
Generative Adversarial Networks (GANs) produce high-quality images but are challenging to train.
We make significant headway on these issues by projecting generated and real samples into a fixed, pretrained feature space.
Our Projected GAN improves image quality, sample efficiency, and convergence speed.
arXiv Detail & Related papers (2021-11-01T15:11:01Z)
- EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization of multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)