RIT-Eyes: Rendering of near-eye images for eye-tracking applications
- URL: http://arxiv.org/abs/2006.03642v1
- Date: Fri, 5 Jun 2020 19:18:50 GMT
- Title: RIT-Eyes: Rendering of near-eye images for eye-tracking applications
- Authors: Nitinraj Nair, Rakshit Kothari, Aayush K. Chaudhary, Zhizhuo Yang,
Gabriel J. Diaz, Jeff B. Pelz, Reynold J. Bailey
- Abstract summary: Deep neural networks for video-based eye tracking have demonstrated resilience to noisy environments, stray reflections, and low resolution.
To train these networks, a large number of manually annotated images are required.
We introduce a synthetic eye image generation platform that improves upon previous work by adding features such as an active deformable iris, an aspherical cornea, retinal retro-reflection, gaze-coordinated eye-lid deformations, and blinks.
- Score: 3.4481343795011226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks for video-based eye tracking have demonstrated
resilience to noisy environments, stray reflections, and low resolution.
However, to train these networks, a large number of manually annotated images
are required. To alleviate the cumbersome process of manual labeling, computer
graphics rendering is employed to automatically generate a large corpus of
annotated eye images under various conditions. In this work, we introduce a
synthetic eye image generation platform that improves upon previous work by
adding features such as an active deformable iris, an aspherical cornea,
retinal retro-reflection, gaze-coordinated eye-lid deformations, and blinks. To
demonstrate the utility of our platform, we render images reflecting the
represented gaze distributions inherent in two publicly available datasets,
NVGaze and OpenEDS. We also report on the performance of two semantic
segmentation architectures (SegNet and RITnet) trained on rendered images and
tested on the original datasets.
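To make the training-and-transfer setup concrete, here is a minimal sketch of training a segmentation network on rendered synthetic eye images and evaluating it on real frames. This is not the authors' code: the tiny encoder-decoder stands in for SegNet/RITnet, the four-class label set is an OpenEDS-style assumption, and random tensors replace the rendered dataset.

```python
# Minimal sketch (not the authors' code): train a semantic segmentation
# network on synthetic eye images, then evaluate on real ones. Rendered
# data is stood in for by random tensors; RITnet/SegNet are approximated
# by a small encoder-decoder.
import torch
import torch.nn as nn

NUM_CLASSES = 4  # background, sclera, iris, pupil (assumption)

class TinySegNet(nn.Module):
    """Stand-in encoder-decoder; the paper uses SegNet and RITnet."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, NUM_CLASSES, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinySegNet()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Synthetic batch: rendered grayscale eye images with per-pixel labels
# exported by the renderer (random placeholders here).
images = torch.rand(8, 1, 64, 64)
labels = torch.randint(0, NUM_CLASSES, (8, 64, 64))

for step in range(10):  # train on synthetic data only
    optim.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optim.step()

# Evaluation runs the same forward pass on real NVGaze/OpenEDS frames.
model.eval()
with torch.no_grad():
    pred = model(images).argmax(dim=1)  # per-pixel class map
```

The key property this setup probes is sim-to-real transfer: the network never sees a real image during training, so test performance on NVGaze/OpenEDS directly measures how faithful the rendering is.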
Related papers
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
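As a rough illustration of the Image-GS entry above, the sketch below reconstructs an image as a sum of anisotropic 2D Gaussians. It is an assumption about the representation, not the paper's code; in Image-GS the Gaussian parameters would be optimized against a target image rather than drawn at random.

```python
# Minimal sketch (assumption, not the Image-GS code): evaluate a set of
# anisotropic 2D Gaussians to form an image. Each primitive carries a
# position, a 2x2 covariance, and a color weight; the image is their sum.
import numpy as np

H = W = 64
ys, xs = np.mgrid[0:H, 0:W]
coords = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2)

rng = np.random.default_rng(0)
num_gaussians = 50
means = rng.uniform(0, W, size=(num_gaussians, 2))
colors = rng.uniform(0, 1, size=num_gaussians)

image = np.zeros((H, W))
for mu, c in zip(means, colors):
    # Anisotropic covariance built from a scale and a rotation.
    theta = rng.uniform(0, np.pi)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag(rng.uniform(1.0, 6.0, size=2))
    cov = R @ S @ S @ R.T
    diff = coords - mu                      # (H, W, 2)
    inv = np.linalg.inv(cov)
    # Mahalanobis distance per pixel; einsum keeps it vectorized.
    m = np.einsum('hwi,ij,hwj->hw', diff, inv, diff)
    image += c * np.exp(-0.5 * m)
```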
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
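The entry above mentions dimensionality-reduction techniques for measuring synthetic/real overlap; the sketch below shows one plausible version using PCA from scikit-learn. The feature vectors, the centroid-distance overlap proxy, and the normalization are all illustrative assumptions, not the paper's method.

```python
# Minimal sketch (an assumption about the approach): use PCA to compare
# how synthetic training images and real target eye images distribute in
# a shared low-dimensional space. Random vectors stand in for features.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
synthetic = rng.normal(0.0, 1.0, size=(500, 256))  # placeholder features
real = rng.normal(0.5, 1.2, size=(500, 256))

pca = PCA(n_components=2).fit(np.vstack([synthetic, real]))
syn2d, real2d = pca.transform(synthetic), pca.transform(real)

# One simple overlap proxy: distance between the projected centroids,
# scaled by the pooled spread of both sets.
gap = np.linalg.norm(syn2d.mean(axis=0) - real2d.mean(axis=0))
spread = 0.5 * (syn2d.std() + real2d.std())
print(f"normalized domain gap: {gap / spread:.3f}")  # smaller = more overlap
```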
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in a real-time manner.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
- Cross-view Self-localization from Synthesized Scene-graphs [1.9580473532948401]
Cross-view self-localization is a challenging scenario of visual place recognition in which database images are provided from sparse viewpoints.
We propose a new hybrid scene model that combines the advantages of view-invariant appearance features computed from raw images and view-dependent spatial-semantic features computed from synthesized images.
arXiv Detail & Related papers (2023-10-24T04:16:27Z)
- Gaze Estimation with Eye Region Segmentation and Self-Supervised Multistream Learning [8.422257363944295]
We present a novel multistream network that learns robust eye representations for gaze estimation.
We first create a synthetic dataset containing eye region masks detailing the visible eyeball and iris using a simulator.
We then perform eye region segmentation with a U-Net type model which we later use to generate eye region masks for real-world images.
arXiv Detail & Related papers (2021-12-15T04:44:45Z)
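The summary above describes eye region segmentation with a U-Net type model; the following toy network is a hedged sketch of that idea, not the paper's architecture. The class set (background, visible eyeball, iris) and all layer sizes are assumptions.

```python
# Minimal sketch (placeholder, not the paper's model): a tiny U-Net-style
# network with one skip connection, predicting visible-eyeball/iris masks
# from eye images as the summary describes.
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    def __init__(self, num_classes=3):  # background, eyeball, iris (assumption)
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # The skip connection concatenates encoder features with upsampled ones.
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        skip = self.down(x)              # (N, 16, H, W)
        mid = self.mid(self.pool(skip))  # (N, 32, H/2, W/2)
        up = self.up(mid)                # (N, 16, H, W)
        return self.head(torch.cat([skip, up], dim=1))

masks = MiniUNet()(torch.rand(1, 1, 64, 64)).argmax(dim=1)  # (1, 64, 64)
```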
- Semantic-Guided Zero-Shot Learning for Low-Light Image/Video Enhancement [3.4722706398428493]
Low-light images challenge both human perception and computer vision algorithms.
It is crucial to make algorithms robust to enlighten low-light images for computational photography and computer vision applications.
This paper proposes a semantic-guided zero-shot low-light enhancement network which is trained in the absence of paired images.
arXiv Detail & Related papers (2021-10-03T10:07:36Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
- Towards Coding for Human and Machine Vision: A Scalable Image Coding Approach [104.02201472370801]
We come up with a novel image coding framework by leveraging both the compressive and the generative models.
By introducing advanced generative models, we train a flexible network to reconstruct images from compact feature representations and the reference pixels.
Experimental results demonstrate the superiority of our framework in both human visual quality and facial landmark detection.
arXiv Detail & Related papers (2020-01-09T10:37:17Z)
- Scene Text Synthesis for Efficient and Effective Deep Network Training [62.631176120557136]
We develop an innovative image synthesis technique that composes annotated training images by embedding foreground objects of interest into background images.
The proposed technique consists of two key components that in principle boost the usefulness of the synthesized images in deep network training.
Experiments over a number of public datasets demonstrate the effectiveness of our proposed image synthesis technique.
arXiv Detail & Related papers (2019-01-26T10:15:24Z)
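To illustrate the compositing idea in the scene-text entry above, here is a minimal sketch of embedding a foreground patch into a background image by alpha blending, with the placement recorded as a free annotation. The patch, alpha matte, and placement are placeholders; the actual technique selects semantically plausible locations and realistic foregrounds.

```python
# Minimal sketch (assumption about the compositing step): alpha-blend a
# foreground patch into a background image and record its bounding box as
# a free annotation, in the spirit of the synthesis technique above.
import numpy as np

rng = np.random.default_rng(0)
background = rng.uniform(0, 1, size=(128, 128))
foreground = np.ones((20, 40))         # placeholder glyph/object patch
alpha = np.full_like(foreground, 0.9)  # placeholder alpha matte

y, x = 50, 30                          # placement chosen by the generator
h, w = foreground.shape
region = background[y:y + h, x:x + w]
background[y:y + h, x:x + w] = alpha * foreground + (1 - alpha) * region

annotation = {"bbox": (x, y, w, h), "label": "text"}  # comes free with synthesis
```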
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.