Learning Gaze-aware Compositional GAN
- URL: http://arxiv.org/abs/2405.20643v1
- Date: Fri, 31 May 2024 07:07:54 GMT
- Title: Learning Gaze-aware Compositional GAN
- Authors: Nerea Aranjuelo, Siyu Huang, Ignacio Arganda-Carreras, Luis Unzueta, Oihana Otaegui, Hanspeter Pfister, Donglai Wei
- Abstract summary: We present a generative framework to create annotated gaze data by leveraging the benefits of labeled and unlabeled data sources.
We show our approach's effectiveness in generating within-domain image augmentations in the ETH-XGaze dataset and cross-domain augmentations in the CelebAMask-HQ dataset domain for gaze estimation training.
- Score: 30.714854907472333
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Gaze-annotated facial data is crucial for training deep neural networks (DNNs) for gaze estimation. However, obtaining these data is labor-intensive and requires specialized equipment due to the challenge of accurately annotating the gaze direction of a subject. In this work, we present a generative framework to create annotated gaze data by leveraging the benefits of labeled and unlabeled data sources. We propose a Gaze-aware Compositional GAN that learns to generate annotated facial images from a limited labeled dataset. Then we transfer this model to an unlabeled data domain to take advantage of the diversity it provides. Experiments demonstrate our approach's effectiveness in generating within-domain image augmentations in the ETH-XGaze dataset and cross-domain augmentations in the CelebAMask-HQ dataset domain for gaze estimation DNN training. We also show additional applications of our work, which include facial image editing and gaze redirection.
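The abstract describes the pipeline only at a high level. As a rough, hypothetical illustration of the core idea (a generator conditioned on a gaze label, sampled to produce annotated augmentations for a gaze-estimation DNN), consider the PyTorch sketch below; the architecture, latent size, and pitch/yaw parameterization are assumptions, not the authors' model.
```python
# Hypothetical sketch, not the authors' architecture: a generator conditioned on a
# (pitch, yaw) gaze label is sampled to produce annotated face images that can be
# mixed into the training set of a gaze-estimation DNN.
import torch
import torch.nn as nn

class GazeConditionedGenerator(nn.Module):
    """Maps a latent code plus a 2D gaze label to an RGB face image."""
    def __init__(self, z_dim=128, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(z_dim + 2, 512), nn.ReLU(),
            nn.Linear(512, 3 * img_size * img_size), nn.Tanh(),
        )

    def forward(self, z, gaze):
        x = self.net(torch.cat([z, gaze], dim=1))
        return x.view(-1, 3, self.img_size, self.img_size)

def synthesize_annotated_batch(generator, batch_size=16, z_dim=128):
    """Sample random gaze labels and return (images, labels) for data augmentation."""
    z = torch.randn(batch_size, z_dim)
    gaze = torch.rand(batch_size, 2) - 0.5      # assumed pitch/yaw range in radians
    with torch.no_grad():
        images = generator(z, gaze)
    return images, gaze

generator = GazeConditionedGenerator()          # would be trained adversarially on labeled data
images, labels = synthesize_annotated_batch(generator)
print(images.shape, labels.shape)               # torch.Size([16, 3, 64, 64]) torch.Size([16, 2])
```
In the paper's setting, the generator trained on the labeled ETH-XGaze data would additionally be transferred to the unlabeled CelebAMask-HQ domain before sampling; that transfer step is not shown here.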
Related papers
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- Gaze-guided Hand-Object Interaction Synthesis: Dataset and Method [63.49140028965778]
We present GazeHOI, the first dataset to capture simultaneous 3D modeling of gaze, hand, and object interactions.
To tackle these issues, we propose a stacked gaze-guided hand-object interaction diffusion model, named GHO-Diffusion.
We also introduce HOI-Manifold Guidance during the sampling stage of GHO-Diffusion, enabling fine-grained control over generated motions.
arXiv Detail & Related papers (2024-03-24T14:24:13Z)
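HOI-Manifold Guidance above is specific to GHO-Diffusion; as a generic, hypothetical illustration of the underlying pattern (steering a diffusion model at sampling time by adding the gradient of a guidance objective to each reverse step), a toy sketch follows. The denoiser, update rule, and energy function are placeholders, not the paper's method.
```python
# Hypothetical sketch of guidance during diffusion sampling (not GHO-Diffusion itself):
# each reverse step is nudged by the gradient of a guidance energy so that generated
# motions drift toward a desired constraint.
import torch

def guidance_energy(x, target):
    # Placeholder constraint: squared distance to a target pose vector.
    return ((x - target) ** 2).sum()

def guided_reverse_step(x_t, t, denoiser, target, guidance_scale=0.1):
    """One simplified reverse step: denoise, then apply a gradient-based correction."""
    x_t = x_t.detach().requires_grad_(True)
    eps = denoiser(x_t, t)                                      # predicted noise
    grad = torch.autograd.grad(guidance_energy(x_t, target), x_t)[0]
    return (x_t - 0.1 * eps - guidance_scale * grad).detach()   # toy update; real schedules differ

denoiser = lambda x, t: torch.zeros_like(x)                     # stand-in denoising network
x = torch.randn(1, 63)                                          # e.g. a flattened hand-pose vector
target = torch.zeros(1, 63)
for t in reversed(range(10)):
    x = guided_reverse_step(x, t, denoiser, target)
print(x.norm())
```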
- Position: Graph Foundation Models are Already Here [53.737868336014735]
Graph Foundation Models (GFMs) are emerging as a significant research topic in the graph domain.
We propose a novel perspective for GFM development by advocating for a "graph vocabulary".
This perspective can potentially advance future GFM design in line with neural scaling laws.
arXiv Detail & Related papers (2024-02-03T17:24:36Z)
- Semi-Synthetic Dataset Augmentation for Application-Specific Gaze Estimation [0.3683202928838613]
We show how to generate a three-dimensional mesh of the face and render the training images from a virtual camera at a specific position and orientation related to the application.
This leads to an average 47% decrease in gaze estimation angular error.
arXiv Detail & Related papers (2023-10-27T20:27:22Z)
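The 47% improvement above is reported in terms of gaze angular error. For reference, this metric is commonly computed as the angle between predicted and ground-truth 3D gaze vectors, as in the sketch below; the paper's exact evaluation protocol may differ.
```python
# Standard gaze angular-error metric for reference; the paper's exact evaluation
# protocol may differ from this formulation.
import numpy as np

def gaze_angular_error_deg(pred, gt):
    """Angle in degrees between batches of predicted and ground-truth 3D gaze vectors (N, 3)."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip((pred * gt).sum(axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

print(gaze_angular_error_deg(np.array([[0.0, 0.0, -1.0]]),
                             np.array([[0.1, 0.0, -1.0]])))  # ~5.7 degrees
```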
- NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation [37.977032771941715]
We propose a novel Head-Eye redirection parametric model based on Neural Radiance Fields.
Our model can decouple the face and eyes for separate neural rendering.
This allows the attributes of the face, identity, illumination, and eye gaze direction to be controlled separately.
arXiv Detail & Related papers (2022-12-30T13:52:28Z)
- LatentGaze: Cross-Domain Gaze Estimation through Gaze-Aware Analytic Latent Code Manipulation [0.0]
We propose a gaze-aware analytic manipulation method based on a data-driven approach that exploits the disentanglement properties of generative adversarial network (GAN) inversion.
By using a GAN-based encoder-generator process, we shift the input image from the target domain to the source domain, of which the gaze estimator is sufficiently aware.
arXiv Detail & Related papers (2022-09-21T08:05:53Z)
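The encoder-generator shift described above follows the general GAN-inversion pattern: encode a target-domain image into the latent space of a source-domain generator and decode it, so the gaze estimator only sees source-domain-like inputs. The sketch below is a hypothetical illustration of that pattern with stand-in networks, not LatentGaze itself.
```python
# Hypothetical GAN-inversion domain-shift sketch (not LatentGaze itself): a target-domain
# image is encoded into the latent space of a source-domain generator and re-synthesized,
# so the downstream gaze estimator only sees source-domain-like inputs.
import torch
import torch.nn as nn

IMG_SIZE, LATENT_DIM = 64, 128
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * IMG_SIZE * IMG_SIZE, LATENT_DIM))
generator = nn.Sequential(nn.Linear(LATENT_DIM, 3 * IMG_SIZE * IMG_SIZE), nn.Tanh())

def shift_to_source_domain(target_image):
    """Encode a target-domain image and decode it with the source-domain generator."""
    with torch.no_grad():
        w = encoder(target_image)                       # latent code in the generator's space
        return generator(w).view(-1, 3, IMG_SIZE, IMG_SIZE)

shifted = shift_to_source_domain(torch.randn(1, 3, IMG_SIZE, IMG_SIZE))
print(shifted.shape)                                    # torch.Size([1, 3, 64, 64])
```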
- RAZE: Region Guided Self-Supervised Gaze Representation Learning [5.919214040221055]
RAZE is a Region guided self-supervised gAZE representation learning framework that leverages non-annotated facial image data.
Ize-Net is a capsule-layer-based CNN architecture that can efficiently capture rich eye representations.
arXiv Detail & Related papers (2022-08-04T06:23:49Z)
- CUDA-GR: Controllable Unsupervised Domain Adaptation for Gaze Redirection [3.0141238193080295]
The aim of gaze redirection is to manipulate the gaze in an image to the desired direction.
Advances in generative adversarial networks have shown excellent results in generating photo-realistic images.
To enable such fine-grained control, however, one needs ground-truth annotations for the training data, which can be very expensive to obtain.
arXiv Detail & Related papers (2021-06-21T04:39:42Z)
- Self-Learning Transformations for Improving Gaze and Head Redirection [49.61091281780071]
We propose a novel generative model for face images that is capable of producing high-quality images under fine-grained control over eye gaze and head orientation angles.
This requires disentangling many appearance-related factors, including not only gaze and head orientation but also lighting, hue, etc.
We show that explicitly disentangling task-irrelevant factors results in more accurate modelling of gaze and head orientation.
arXiv Detail & Related papers (2020-10-23T11:18:37Z)
- Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild [82.42401132933462]
We present a solution that works without the need for precise annotations of the gaze angle and the head pose.
Our method consists of three novel modules: the Gaze Correction module (GCM), the Gaze Animation module (GAM), and the Pretrained Autoencoder module (PAM).
arXiv Detail & Related papers (2020-08-09T23:14:16Z)
- On Leveraging Pretrained GANs for Generation with Limited Data [83.32972353800633]
Generative adversarial networks (GANs) can generate highly realistic images that are often indistinguishable (by humans) from real images.
Most images so generated are not contained in a training dataset, suggesting potential for augmenting training sets with GAN-generated data.
We leverage existing GAN models pretrained on large-scale datasets to introduce additional knowledge, following the concept of transfer learning.
An extensive set of experiments is presented to demonstrate the effectiveness of the proposed techniques on generation with limited data.
arXiv Detail & Related papers (2020-02-26T21:53:36Z)
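A common way to realize the transfer-learning idea above is to load a generator pretrained on a large-scale dataset and fine-tune only a subset of its parameters on the small target set. The sketch below shows that freezing pattern on a placeholder generator; the layer split and hyperparameters are assumptions, not the paper's recipe.
```python
# Sketch of transfer learning from a pretrained generator: freeze most parameters and
# fine-tune only the final layer on the small target dataset. The network is a placeholder;
# which parameters the paper actually adapts differs.
import torch.nn as nn
from torch.optim import Adam

pretrained_generator = nn.Sequential(      # stand-in for a generator pretrained on a
    nn.Linear(128, 256), nn.ReLU(),        # large-scale dataset
    nn.Linear(256, 512), nn.ReLU(),
    nn.Linear(512, 3 * 64 * 64), nn.Tanh(),
)

for p in pretrained_generator.parameters():           # freeze everything ...
    p.requires_grad = False
for p in pretrained_generator[-2].parameters():       # ... then unfreeze the last Linear layer
    p.requires_grad = True

trainable = [p for p in pretrained_generator.parameters() if p.requires_grad]
optimizer = Adam(trainable, lr=1e-4)                  # fine-tune on the limited target data
print(sum(p.numel() for p in trainable))              # number of adapted parameters
```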
This list is automatically generated from the titles and abstracts of the papers on this site.