Text and Image Guided 3D Avatar Generation and Manipulation
- URL: http://arxiv.org/abs/2202.06079v1
- Date: Sat, 12 Feb 2022 14:37:29 GMT
- Title: Text and Image Guided 3D Avatar Generation and Manipulation
- Authors: Zehranaz Canfes, M. Furkan Atasoy, Alara Dirik, Pinar Yanardag
- Abstract summary: We propose a novel 3D manipulation method that can manipulate both the shape and texture of the model using text- or image-based prompts such as 'a young face' or 'a surprised face'.
Our method requires only 5 minutes per manipulation, and we demonstrate the effectiveness of our approach with extensive results and comparisons.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The manipulation of latent space has recently become an interesting topic in the field of generative models. Recent research shows that latent directions can be used to manipulate images towards certain attributes. However, controlling the generation process of 3D generative models remains a challenge. In this work, we propose a novel 3D manipulation method that can manipulate both the shape and texture of the model using text- or image-based prompts such as 'a young face' or 'a surprised face'. We leverage the power of the Contrastive Language-Image Pre-training (CLIP) model and a pre-trained 3D GAN model designed to generate face avatars, and create a fully differentiable rendering pipeline to manipulate meshes. More specifically, our method takes an input latent code and modifies it such that the target attribute specified by a text or image prompt is present or enhanced, while leaving other attributes largely unaffected. Our method requires only 5 minutes per manipulation, and we demonstrate the effectiveness of our approach with extensive results and comparisons.
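The optimization loop described in the abstract can be pictured with a short sketch: a latent code is updated by gradient descent so that CLIP's embedding of the rendered face moves toward the prompt, while a regularizer keeps the code near its starting point so other attributes stay largely unaffected. This is a minimal illustration under stated assumptions, not the authors' released code: the `generator` and `render` stand-ins below replace the paper's pre-trained 3D GAN and differentiable mesh renderer, and proper CLIP input normalization is omitted for brevity.

```python
"""Minimal sketch of CLIP-guided latent manipulation (not the authors' code)."""
import torch
import clip  # openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

LATENT_DIM = 256  # hypothetical latent size

# Toy stand-in: maps a latent code straight to a 224x224 RGB "rendering".
# The real pipeline would generate a mesh plus texture from a 3D GAN and
# rasterize it with a differentiable renderer.
generator = torch.nn.Linear(LATENT_DIM, 3 * 224 * 224).to(device)

def render(latent: torch.Tensor) -> torch.Tensor:
    return generator(latent).view(1, 3, 224, 224).sigmoid()

# Encode the target text prompt once; image prompts would use encode_image.
tokens = clip.tokenize(["a surprised face"]).to(device)
with torch.no_grad():
    text_feat = clip_model.encode_text(tokens).float()
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

initial_latent = torch.randn(1, LATENT_DIM, device=device)
w = initial_latent.clone().requires_grad_(True)
opt = torch.optim.Adam([w], lr=0.01)

for step in range(200):
    image = render(w)
    img_feat = clip_model.encode_image(image).float()
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    clip_loss = 1 - (img_feat * text_feat).sum()  # pull render toward the prompt
    id_loss = (w - initial_latent).pow(2).mean()  # stay near the source latent
    loss = clip_loss + 0.1 * id_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The L2 term on the latent code is one simple way to keep untargeted attributes close to the original; the paper's actual regularization and loss weights may differ.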
Related papers
- Revealing Directions for Text-guided 3D Face Editing [52.85632020601518]
3D face editing is a significant task in multimedia, aimed at manipulating 3D face models via various control signals.
We present Face Clan, a text-general approach for generating and manipulating 3D faces based on arbitrary attribute descriptions.
Our approach offers precisely controllable manipulation, allowing users to intuitively customize regions of interest with a text description.
arXiv Detail & Related papers (2024-10-07T12:04:39Z) - ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z) - XAGen: 3D Expressive Human Avatars Generation [76.69560679209171]
XAGen is the first 3D generative model for human avatars capable of expressive control over body, face, and hands.
We propose a multi-part rendering technique that disentangles the synthesis of body, face, and hands.
Experiments show that XAGen surpasses state-of-the-art methods in terms of realism, diversity, and expressive control abilities.
arXiv Detail & Related papers (2023-11-22T18:30:42Z) - Articulated 3D Head Avatar Generation using Text-to-Image Diffusion
Models [107.84324544272481]
The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education.
Recent work on text-guided 3D object generation has shown great promise in addressing these needs.
We show that our diffusion-based articulated head avatars outperform state-of-the-art approaches for this task.
arXiv Detail & Related papers (2023-07-10T19:15:32Z) - Single-Shot Implicit Morphable Faces with Consistent Texture
Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z) - ClipFace: Text-guided Editing of Textured 3D Morphable Models [33.83015491013442]
We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable models of faces.
We employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces.
Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model.
arXiv Detail & Related papers (2022-12-02T19:01:08Z) - Controllable Face Manipulation and UV Map Generation by Self-supervised
Learning [20.10160338724354]
Recent methods achieve explicit control over 2D images by combining a 2D generative model with a 3DMM.
Because texture reconstruction by the 3DMM lacks realism and clarity, there is a domain gap between synthetic images and images rendered from the 3DMM.
In this study, we propose to explicitly edit the latent space of the pretrained StyleGAN by controlling the parameters of the 3DMM.
arXiv Detail & Related papers (2022-09-24T16:49:25Z) - Controllable 3D Generative Adversarial Face Model via Disentangling
Shape and Appearance [63.13801759915835]
3D face modeling has been an active area of research in computer vision and computer graphics.
This paper proposes a new 3D face generative model that can decouple identity and expression.
arXiv Detail & Related papers (2022-08-30T13:40:48Z) - Text to Mesh Without 3D Supervision Using Limit Subdivision [13.358081015190255]
We present a technique for zero-shot generation of a 3D model using only a target text prompt.
We rely on a pre-trained CLIP model that compares the input text prompt with differentiably rendered images of our 3D model.
arXiv Detail & Related papers (2022-03-24T20:36:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.