Towards Unsupervised Eye-Region Segmentation for Eye Tracking
- URL: http://arxiv.org/abs/2410.06131v1
- Date: Tue, 8 Oct 2024 15:33:23 GMT
- Title: Towards Unsupervised Eye-Region Segmentation for Eye Tracking
- Authors: Jiangfan Deng, Zhuang Jia, Zhaoxue Wang, Xiang Long, Daniel K. Du
- Abstract summary: We use priors of the human eye and extract signals from the image to establish rough clues indicating the eye-region structure.
A segmentation network is trained to gradually identify the precise area for each part.
Experiments show that our unsupervised approach can easily achieve 90% (pupil and iris) and 85% (whole eye-region) of the performance of supervised learning.
- Score: 9.051786094550293
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Finding the eye and parsing out its parts (e.g., pupil and iris) is a key prerequisite for image-based eye tracking, which has become an indispensable module in today's head-mounted VR/AR devices. However, a typical route for training a segmenter requires tedious hand-labeling. In this work, we explore an unsupervised way. First, we utilize priors of the human eye and extract signals from the image to establish rough clues indicating the eye-region structure. From these sparse and noisy clues, a segmentation network is trained to gradually identify the precise area for each part. To achieve accurate parsing of the eye-region, we first leverage the pretrained foundation model Segment Anything (SAM) in an automatic way to refine the eye indications. Then, the learning process is designed in an end-to-end manner following a progressive and prior-aware principle. Experiments show that our unsupervised approach can easily achieve 90% (pupil and iris) and 85% (whole eye-region) of the performance of supervised learning.
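As a minimal illustration of the "rough clues from priors" idea, the sketch below uses a darkness prior to mark the darkest pixels of a toy eye image as a likely pupil region. This is a hypothetical simplification: the paper's actual pipeline additionally refines such clues with SAM and trains a segmentation network on them.

```python
import numpy as np

def rough_pupil_clue(gray, dark_fraction=0.01):
    """Rough pupil clue from a darkness prior: mark the darkest
    fraction of pixels as likely pupil. A hypothetical sketch of a
    prior-based clue, not the paper's exact procedure."""
    thresh = np.quantile(gray, dark_fraction)
    return gray <= thresh

# Toy eye image: bright background with a small dark disc ("pupil").
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
img = np.full((h, w), 200.0)
img[(yy - 32) ** 2 + (xx - 32) ** 2 <= 5 ** 2] = 20.0  # dark pupil disc

clue = rough_pupil_clue(img)  # boolean mask, sparse and noisy by design
```

Such a mask is deliberately crude; the point of the paper is that sparse, noisy clues of this kind are enough to bootstrap a segmentation network.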
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net)
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
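One way to realize the dimensionality-reduction idea above is to project real and synthetic samples onto shared principal components and compare them in that low-dimensional space. The sketch below is a hypothetical NumPy illustration; the function name `pca_overlap` and its mean-gap metric are assumptions, not the paper's actual measure.

```python
import numpy as np

def pca_overlap(real, synth, k=2):
    """Project both sample sets onto the top-k principal components of
    the pooled data, then compare the projected means relative to the
    pooled spread. Smaller values mean more overlap. A sketch of the
    dimensionality-reduction idea, not the paper's exact metric."""
    pooled = np.vstack([real, synth])
    centered = pooled - pooled.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k].T                      # top-k principal directions
    r, s = real @ basis, synth @ basis    # low-dimensional embeddings
    gap = np.linalg.norm(r.mean(axis=0) - s.mean(axis=0))
    return gap / (centered.std() + 1e-12)

real = np.arange(24, dtype=float).reshape(8, 3)   # stand-in features
same = pca_overlap(real, real)          # identical sets: zero gap
far = pca_overlap(real, real + 100.0)   # shifted set: large gap
```

A score like this can flag when synthetic training data drifts far from the target-domain eye images before any adaptation is attempted.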
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Collaborative Feature Learning for Fine-grained Facial Forgery Detection and Segmentation [56.73855202368894]
Previous work related to forgery detection mostly focuses on the entire faces.
Recent forgery methods have developed to edit important facial components while maintaining others unchanged.
We propose a collaborative feature learning approach to simultaneously detect manipulation and segment the falsified components.
arXiv Detail & Related papers (2023-04-17T08:49:11Z)
- A Simple Framework for Open-Vocabulary Segmentation and Detection [85.21641508535679]
We present OpenSeeD, a simple Open-vocabulary and Detection framework that jointly learns from different segmentation and detection datasets.
We first introduce a pre-trained text encoder to encode all the visual concepts in two tasks and learn a common semantic space for them.
After pre-training, our model exhibits competitive or stronger zero-shot transferability for both segmentation and detection.
arXiv Detail & Related papers (2023-03-14T17:58:34Z)
- Multistream Gaze Estimation with Anatomical Eye Region Isolation by Synthetic to Real Transfer Learning [24.872143206600185]
We propose a novel neural pipeline, MSGazeNet, that learns gaze representations by taking advantage of the eye anatomy information.
Our framework surpasses the state-of-the-art by 7.57% and 1.85% on three gaze estimation datasets.
arXiv Detail & Related papers (2022-06-18T17:57:32Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
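Contrastive learning over sampled point pairs can be sketched with a standard InfoNCE-style objective: the feature of a query point is pulled toward its positive point and pushed from negatives. The snippet below is a generic NumPy illustration of that objective, not the paper's exact loss or sampling scheme.

```python
import numpy as np

def point_contrast_loss(q, pos, negs, tau=0.1):
    """InfoNCE-style loss for one sampled point pair: low when the
    query point feature q is most similar to its positive point
    feature. A generic sketch of point-level contrast, not the
    paper's exact objective."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(q, pos)] + [cos(q, n) for n in negs]) / tau
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[0])                   # cross-entropy on the positive

q = np.array([1.0, 0.0])
good = point_contrast_loss(q, np.array([1.0, 0.0]), [np.array([0.0, 1.0])])
bad = point_contrast_loss(q, np.array([0.0, 1.0]), [np.array([1.0, 0.0])])
```

Because the loss acts on individual points rather than a pooled per-region vector, a few badly placed region proposals only corrupt a few pairs, which matches the robustness claim above.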
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- Gaze Estimation with Eye Region Segmentation and Self-Supervised Multistream Learning [8.422257363944295]
We present a novel multistream network that learns robust eye representations for gaze estimation.
We first create a synthetic dataset containing eye region masks detailing the visible eyeball and iris using a simulator.
We then perform eye region segmentation with a U-Net type model which we later use to generate eye region masks for real-world images.
arXiv Detail & Related papers (2021-12-15T04:44:45Z)
- Learning To Segment Dominant Object Motion From Watching Videos [72.57852930273256]
We envision a simple framework for dominant moving object segmentation that neither requires annotated data to train nor relies on saliency priors or pre-trained optical flow maps.
Inspired by a layered image representation, we introduce a technique to group pixel regions according to their affine parametric motion.
This enables our network to learn segmentation of the dominant foreground object using only RGB image pairs as input for both training and inference.
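The core of grouping pixels by affine parametric motion is fitting a 2D affine model to observed flow vectors and inspecting residuals. The sketch below shows a least-squares fit for one region; it is a hypothetical illustration of the parametric-motion idea, not the paper's network or grouping procedure.

```python
import numpy as np

def fit_affine_motion(pts, flow):
    """Fit a 2D affine motion model flow = [x, y, 1] @ A (A is 3x2)
    to sampled points by least squares. Pixels with large residuals
    would belong to a different motion layer; that grouping step is
    a hypothetical extra stage, not shown here."""
    X = np.hstack([pts, np.ones((len(pts), 1))])   # (N, 3) homogeneous
    A, *_ = np.linalg.lstsq(X, flow, rcond=None)   # (3, 2) parameters
    residual = np.linalg.norm(X @ A - flow, axis=1)
    return A, residual

pts = np.array([[0, 0], [1, 0], [0, 1], [2, 3], [4, 1]], dtype=float)
A_true = np.array([[1.1, 0.0], [0.2, 0.9], [0.5, -0.3]])
flow = np.hstack([pts, np.ones((len(pts), 1))]) @ A_true  # synthetic flow
A_est, res = fit_affine_motion(pts, flow)
```

In a layered-motion view, pixels whose flow fits one affine model form a layer, and the dominant layer is taken as the foreground object.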
arXiv Detail & Related papers (2021-11-28T14:51:00Z)
- EllSeg: An Ellipse Segmentation Framework for Robust Gaze Tracking [3.0448872422956432]
Ellipse fitting is an essential component in pupil or iris tracking based video oculography.
We propose training a convolutional neural network to directly segment entire elliptical structures.
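The ellipse-fitting component mentioned above is classically an algebraic conic fit to boundary points. The sketch below shows that generic least-squares step; it illustrates ellipse fitting only, not the EllSeg network, which instead segments the entire elliptical structure.

```python
import numpy as np

def fit_ellipse_conic(x, y):
    """Algebraic conic fit a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
    via the smallest right singular vector. A generic least-squares
    sketch of classical ellipse fitting, not EllSeg itself."""
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(D)
    return vt[-1]  # conic coefficients, up to scale and sign

# Points on an ellipse centered at (1, -1) with semi-axes 3 and 2.
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
x, y = 3 * np.cos(theta) + 1, 2 * np.sin(theta) - 1
coef = fit_ellipse_conic(x, y)

def conic_val(px, py):
    """Evaluate the fitted conic; near zero on the ellipse boundary."""
    return coef @ np.array([px * px, px * py, py * py, px, py, 1.0])
```

Fits like this fail when occlusions (eyelids, reflections) hide part of the boundary, which motivates segmenting the full elliptical structure instead.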
arXiv Detail & Related papers (2020-07-19T06:13:01Z)
- Manifold-driven Attention Maps for Weakly Supervised Segmentation [9.289524646688244]
We propose a manifold driven attention-based network to enhance visual salient regions.
Our method generates superior attention maps directly during inference without the need of extra computations.
arXiv Detail & Related papers (2020-04-07T00:03:28Z)
- Regression and Learning with Pixel-wise Attention for Retinal Fundus Glaucoma Segmentation and Detection [3.7687214264740994]
We present two deep learning-based automated algorithms for glaucoma detection and optic disc and cup segmentation.
We utilize the attention mechanism to learn pixel-wise features for accurate prediction.
arXiv Detail & Related papers (2020-01-06T23:54:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.