Digitally Prototype Your Eye Tracker: Simulating Hardware Performance using 3D Synthetic Data
- URL: http://arxiv.org/abs/2503.16742v1
- Date: Thu, 20 Mar 2025 23:09:15 GMT
- Title: Digitally Prototype Your Eye Tracker: Simulating Hardware Performance using 3D Synthetic Data
- Authors: Esther Y. H. Lin, Yimin Ding, Jogendra Kundu, Yatong An, Mohamed T. El-Haddad, Alexander Fix,
- Abstract summary: We propose a method for end-to-end evaluation of how hardware changes impact machine learning-based ET performance. We use a dataset of real 3D eyes, reconstructed from light dome data using neural radiance fields (NeRF), to synthesize captured eyes from novel viewpoints. We also present a first-of-its-kind analysis in which we vary ET camera positions, evaluating ET performance ranging from on-axis direct views of the eye to peripheral views on the frame.
- Score: 38.620399293339034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Eye tracking (ET) is a key enabler for Augmented and Virtual Reality (AR/VR). Prototyping new ET hardware requires assessing the impact of hardware choices on eye tracking performance. This task is compounded by the high cost of obtaining data from sufficiently many variations of real hardware, especially for machine learning, which requires large training datasets. We propose a method for end-to-end evaluation of how hardware changes impact machine learning-based ET performance using only synthetic data. We utilize a dataset of real 3D eyes, reconstructed from light dome data using neural radiance fields (NeRF), to synthesize captured eyes from novel viewpoints and camera parameters. Using this framework, we demonstrate that we can predict the relative performance across various hardware configurations, accounting for variations in sensor noise, illumination brightness, and optical blur. We also compare our simulator with the publicly available eye tracking dataset from the Project Aria glasses, demonstrating a strong correlation with real-world performance. Finally, we present a first-of-its-kind analysis in which we vary ET camera positions, evaluating ET performance ranging from on-axis direct views of the eye to peripheral views on the frame. Such an analysis would have previously required manufacturing physical devices to capture evaluation data. In short, our method enables faster prototyping of ET hardware.
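To make the sensor-model side of the abstract concrete, the following is a minimal, illustrative sketch rather than the authors' code: it applies an assumed camera model of optical blur, illumination gain, and additive read noise to a clean rendered eye image, so that one set of rendered views can stand in for several candidate hardware configurations. The `render`-placeholder image, the `simulate_capture` helper, and the configuration values are all hypothetical; in the paper, views are synthesized with NeRF and an ML gaze estimator is evaluated per configuration, with only the relative performance across configurations compared.

```python
# Hedged sketch (not the authors' implementation) of a hardware-configuration sweep:
# take a clean rendered eye image and apply a simple sensor model so that the same
# synthetic data can approximate different candidate camera designs.
import numpy as np
from scipy.ndimage import gaussian_filter


def simulate_capture(clean_image: np.ndarray,
                     blur_sigma_px: float,
                     brightness_gain: float,
                     read_noise_std: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Apply an assumed camera model to a clean rendered eye image in [0, 1]."""
    img = gaussian_filter(clean_image, sigma=blur_sigma_px)    # optical blur
    img = np.clip(img * brightness_gain, 0.0, 1.0)             # illumination brightness
    img = img + rng.normal(0.0, read_noise_std, img.shape)     # additive sensor read noise
    return np.clip(img, 0.0, 1.0)


# Hypothetical hardware configurations to compare; the paper additionally varies
# camera position (on-axis vs. peripheral views), which here would correspond to
# re-rendering the NeRF from a different viewpoint before simulate_capture.
configs = [
    {"blur_sigma_px": 0.5, "brightness_gain": 1.0, "read_noise_std": 0.01},
    {"blur_sigma_px": 1.5, "brightness_gain": 0.7, "read_noise_std": 0.03},
]

rng = np.random.default_rng(0)
clean = np.random.default_rng(1).random((240, 320))  # placeholder for a NeRF-rendered eye view

for cfg in configs:
    captured = simulate_capture(clean, rng=rng, **cfg)
    # Downstream (not shown): feed `captured` frames to the eye-tracking model for this
    # configuration and record gaze error, then rank configurations by relative error.
```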
Related papers
- FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video [52.33896173943054]
Egocentric motion capture with a head-mounted body-facing stereo camera is crucial for VR and AR applications.
Existing methods rely on synthetic pretraining and struggle to generate smooth and accurate predictions in real-world settings.
We propose FRAME, a simple yet effective architecture that combines device pose and camera feeds for state-of-the-art body pose prediction.
arXiv Detail & Related papers (2025-03-29T14:26:06Z) - V-HOP: Visuo-Haptic 6D Object Pose Tracking [18.984396185797667]
Humans naturally integrate vision and haptics for robust object perception during manipulation. Prior object pose estimation research has attempted to combine visual and haptic/tactile feedback. We introduce a new visuo-haptic transformer-based object pose tracker that seamlessly integrates visual and haptic input.
arXiv Detail & Related papers (2025-02-24T18:59:50Z) - Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios [3.30184292168618]
We propose a dataset generation pipeline based on the CARLA simulator for 3D object detection on LiDAR point clouds. We are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset.
arXiv Detail & Related papers (2025-02-20T22:27:42Z) - Synthetica: Large Scale Synthetic Data for Robot Perception [21.415878105900187]
We present Synthetica, a method for large-scale synthetic data generation for training robust state estimators.
This paper focuses on the task of object detection, an important problem which can serve as the front-end for most state estimation problems.
We leverage data from a ray-tracing renderer, generating 2.7 million images, to train highly accurate real-time detection transformers.
We demonstrate state-of-the-art performance on the task of object detection while having detectors that run at 50-100 Hz, which is 9 times faster than the prior SOTA.
arXiv Detail & Related papers (2024-10-28T15:50:56Z) - Real-Time Radiance Fields for Single-Image Portrait View Synthesis [85.32826349697972]
We present a one-shot method to infer and render a 3D representation from a single unposed image in real-time.
Given a single RGB input, our image encoder directly predicts a canonical triplane representation of a neural radiance field for 3D-aware novel view synthesis via volume rendering.
Our method is fast (24 fps) on consumer hardware, and produces higher quality results than strong GAN-inversion baselines that require test-time optimization.
arXiv Detail & Related papers (2023-05-03T17:56:01Z) - Hands-Up: Leveraging Synthetic Data for Hands-On-Wheel Detection [0.38233569758620045]
This work demonstrates the use of synthetic photo-realistic in-cabin data to train a Driver Monitoring System.
We show how performing error analysis and generating the missing edge-cases in our platform boosts performance.
This showcases the ability of human-centric synthetic data to generalize well to the real world.
arXiv Detail & Related papers (2022-05-31T23:34:12Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - Synthetic Data Are as Good as the Real for Association Knowledge Learning in Multi-object Tracking [19.772968520292345]
In this paper, we study whether 3D synthetic data can replace real-world videos for association training.
Specifically, we introduce a large-scale synthetic data engine named MOTX, where the motion characteristics of cameras and objects are manually configured to be similar to those in real-world datasets.
We show that compared with real data, association knowledge obtained from synthetic data can achieve very similar performance on real-world test sets without domain adaptation techniques.
arXiv Detail & Related papers (2021-06-30T14:46:36Z) - Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye-gaze from images alone is a challenging task due to unobservable person-specific factors.
We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships.
We demonstrate that fusing information from visual stimuli as well as eye images can yield performance similar to literature-reported figures.
arXiv Detail & Related papers (2020-07-26T12:39:15Z) - SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learn robust representation through augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.