EyeBAG: Accurate Control of Eye Blink and Gaze Based on Data
Augmentation Leveraging Style Mixing
- URL: http://arxiv.org/abs/2306.17391v1
- Date: Fri, 30 Jun 2023 03:49:23 GMT
- Title: EyeBAG: Accurate Control of Eye Blink and Gaze Based on Data
Augmentation Leveraging Style Mixing
- Authors: Bryan S. Kim, Jeong Young Jeong, Wonjong Ryu
- Abstract summary: We introduce a novel framework consisting of two distinct modules: a blink control module and a gaze redirection module.
We show that our framework produces eye-controlled images of high quality, and demonstrate how it can be used to improve the performance of downstream tasks.
- Score: 0.483420384410068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent developments in generative models have enabled the generation of
photo-realistic human face images, and downstream tasks utilizing face
generation technology have advanced accordingly. However, models for downstream
tasks are yet substandard at eye control (e.g. eye blink, gaze redirection). To
overcome such eye control problems, we introduce a novel framework consisting
of two distinct modules: a blink control module and a gaze redirection module.
We also propose a novel data augmentation method to train each module,
leveraging style mixing to obtain images with desired features. We show that
our framework produces eye-controlled images of high quality, and demonstrate
how it can be used to improve the performance of downstream tasks.
Related papers
- CAR: Controllable Autoregressive Modeling for Visual Generation [100.33455832783416]
Controllable AutoRegressive Modeling (CAR) is a novel, plug-and-play framework that integrates conditional control into multi-scale latent variable modeling.
CAR progressively refines and captures control representations, which are injected into each autoregressive step of the pre-trained model to guide the generation process.
Our approach demonstrates excellent controllability across various types of conditions and delivers higher image quality compared to previous methods.
arXiv Detail & Related papers (2024-10-07T00:55:42Z)
- ControlVAR: Exploring Controllable Visual Autoregressive Modeling [48.66209303617063]
Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs).
Challenges such as expensive computational cost, high inference latency, and difficulties of integration with large language models (LLMs) have necessitated exploring alternatives to DMs.
This paper introduces ControlVAR, a novel framework that explores pixel-level controls in visual autoregressive modeling for flexible and efficient conditional generation.
arXiv Detail & Related papers (2024-02-26T05:08:40Z)
- Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion [35.21106030549071]
Diffusion Probabilistic Models (DPMs) are a dominant force in text-to-image generation tasks.
We propose an alternative view of state-of-the-art DPMs as a way of inverting advanced Vision-Language Models (VLMs).
By directly optimizing images with the supervision of discriminative VLMs, the proposed method can potentially achieve a better text-image alignment.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
PanoGAN is a novel adversarial feedback GAN framework.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-23T11:18:37Z)
- Self-Learning Transformations for Improving Gaze and Head Redirection [49.61091281780071]
We propose a novel generative model for images of faces that is capable of producing high-quality images under fine-grained control over eye gaze and head orientation angles.
This requires disentangling many appearance-related factors, including gaze and head orientation but also lighting, hue, etc.
We show that explicitly disentangling task-irrelevant factors results in more accurate modelling of gaze and head orientation.
arXiv Detail & Related papers (2020-06-18T14:22:54Z)
- Towards a Neural Graphics Pipeline for Controllable Image Generation [96.11791992084551]
We present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models.
NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation.
We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes.
arXiv Detail & Related papers (2020-01-09T10:37:17Z)
- Towards Coding for Human and Machine Vision: A Scalable Image Coding Approach [104.02201472370801]
We come up with a novel image coding framework by leveraging both the compressive and the generative models.
By introducing advanced generative models, we train a flexible network to reconstruct images from compact feature representations and the reference pixels.
Experimental results demonstrate the superiority of our framework in both human visual quality and facial landmark detection.
arXiv Detail & Related papers (2020-01-09T10:37:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.