ClipFace: Text-guided Editing of Textured 3D Morphable Models
- URL: http://arxiv.org/abs/2212.01406v2
- Date: Mon, 24 Apr 2023 11:51:40 GMT
- Title: ClipFace: Text-guided Editing of Textured 3D Morphable Models
- Authors: Shivangi Aneja, Justus Thies, Angela Dai, Matthias Nießner
- Abstract summary: We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable models of faces.
We employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces.
Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model.
- Score: 33.83015491013442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose ClipFace, a novel self-supervised approach for text-guided editing
of textured 3D morphable models of faces. Specifically, we employ user-friendly
language prompts to enable control of the expressions as well as appearance of
3D faces. We leverage the geometric expressiveness of 3D morphable models,
which on their own offer only limited controllability and texture expressivity, and
develop a self-supervised generative model to jointly synthesize expressive,
textured, and articulated faces in 3D. We enable high-quality texture
generation for 3D faces by adversarial self-supervised training, guided by
differentiable rendering against collections of real RGB images. Controllable
editing and manipulation are achieved via language prompts that adapt the texture and
expression of the 3D morphable model. To this end, we propose a neural network
that predicts both texture and expression latent codes of the morphable model.
Our model is trained in a self-supervised fashion by exploiting differentiable
rendering and losses based on a pre-trained CLIP model. Once trained, our model
jointly predicts face textures in UV-space, along with expression parameters to
capture both geometry and texture changes in facial expressions in a single
forward pass. We further show the applicability of our method to generate
temporally changing textures for a given animation sequence.
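To make the training signal concrete, the CLIP-based loss described above (comparing differentiably rendered faces against a language prompt in CLIP embedding space) can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the prompt, the random stand-in for the renderer output, and the exact loss form are assumptions.

```python
# Minimal sketch of a CLIP-guided editing loss (illustrative, not the
# authors' code). Requires torch and OpenAI's CLIP package
# (pip install git+https://github.com/openai/CLIP.git).
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
# .float() keeps weights in fp32 so gradients through the image branch are stable.
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float().eval()

# CLIP's published preprocessing statistics.
CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device)
CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device)

prompt = "a surprised face"  # hypothetical editing prompt
with torch.no_grad():
    text_emb = clip_model.encode_text(clip.tokenize([prompt]).to(device))
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

def clip_loss(rendered: torch.Tensor) -> torch.Tensor:
    """1 - cosine similarity between rendered faces (B, 3, H, W in [0, 1])
    and the text prompt."""
    x = F.interpolate(rendered, size=(224, 224), mode="bilinear",
                      align_corners=False)
    x = (x - CLIP_MEAN.view(1, 3, 1, 1)) / CLIP_STD.view(1, 3, 1, 1)
    img_emb = clip_model.encode_image(x)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    return (1.0 - (img_emb * text_emb).sum(dim=-1)).mean()

# Stand-in for a differentiable render of the textured 3DMM.
rendered = torch.rand(4, 3, 512, 512, device=device, requires_grad=True)
loss = clip_loss(rendered)
loss.backward()  # in the paper's setup, gradients would continue through the
                 # renderer into the texture and expression latent codes
```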
Related papers
- TADA! Text to Animatable Digital Avatars [57.52707683788961]
TADA takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures.
We derive an optimizable high-resolution body model from SMPL-X with 3D displacements and a texture map.
We render normals and RGB images of the generated character and exploit their latent embeddings in the SDS training process (the standard SDS gradient is recalled after this entry).
arXiv Detail & Related papers (2023-08-21T17:59:10Z)
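For reference, SDS here is score distillation sampling as introduced by DreamFusion; the notation below is the standard one from that line of work, not quoted from the TADA abstract. The gradient pushed through the differentiable renderer is

\nabla_\theta \mathcal{L}_{\mathrm{SDS}} = \mathbb{E}_{t,\epsilon}\Big[\, w(t)\,\big(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\big)\, \frac{\partial x}{\partial \theta} \Big],

where x = g(\theta) is the rendered normal or RGB image, x_t its noised version at diffusion timestep t, y the text embedding, \epsilon \sim \mathcal{N}(0, I) the injected noise, \hat{\epsilon}_\phi the frozen diffusion model's noise prediction, and w(t) a timestep-dependent weight.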
- Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models [107.84324544272481]
The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education.
Recent work on text-guided 3D object generation has shown great promise in addressing these needs.
We show that our diffusion-based articulated head avatars outperform state-of-the-art approaches for this task.
arXiv Detail & Related papers (2023-07-10T19:15:32Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision [114.56048848216254]
We present a novel framework, TAPS3D, to train a text-guided 3D shape generator with pseudo captions.
Based on rendered 2D images, we retrieve relevant words from the CLIP vocabulary and construct pseudo captions using templates.
Our constructed captions provide high-level semantic supervision for the generated 3D shapes (a sketch of the word-retrieval step follows this entry).
arXiv Detail & Related papers (2023-03-23T13:53:16Z)
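The word-retrieval step referenced in the TAPS3D entry above can be sketched as follows. This is a hypothetical illustration: the candidate vocabulary, template, and top-k scoring are placeholders, not TAPS3D's actual pipeline.

```python
# Hypothetical sketch of CLIP-based word retrieval for pseudo captions.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

# A small candidate list stands in for the CLIP vocabulary.
vocab = ["chair", "wooden", "red", "armchair", "metal", "office"]

@torch.no_grad()
def pseudo_caption(image: Image.Image, top_k: int = 2,
                   template: str = "a photo of a {} {}") -> str:
    """Retrieve the words closest to a rendered view and fill a template."""
    img = preprocess(image).unsqueeze(0).to(device)
    img_emb = model.encode_image(img)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

    word_emb = model.encode_text(clip.tokenize(vocab).to(device))
    word_emb = word_emb / word_emb.norm(dim=-1, keepdim=True)

    scores = (img_emb @ word_emb.T).squeeze(0)  # cosine similarities
    best = scores.topk(top_k).indices.tolist()
    return template.format(*(vocab[i] for i in best))
```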
- CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields [52.14985242487535]
We propose a new conditional 3D face synthesis framework, which enables 3D controllability over generated face images.
At its core is a conditional Generative Occupancy Field (cGOF++) that effectively enforces the shape of the generated face to conform to a given 3D Morphable Model (3DMM) mesh.
Experiments validate the effectiveness of the proposed method and show more precise 3D controllability than state-of-the-art 2D-based controllable face synthesis methods.
arXiv Detail & Related papers (2022-11-23T19:02:50Z)
- Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars [36.4402388864691]
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery.
Recent efforts incorporate 3D Morphable Face Model (3DMM) to describe deformation in generative radiance fields either explicitly or implicitly.
We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.
arXiv Detail & Related papers (2022-11-21T06:40:46Z)
- Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance [63.13801759915835]
3D face modeling has been an active area of research in computer vision and computer graphics.
This paper proposes a new 3D face generative model that can decouple identity and expression.
arXiv Detail & Related papers (2022-08-30T13:40:48Z)
- Text to Mesh Without 3D Supervision Using Limit Subdivision [13.358081015190255]
We present a technique for zero-shot generation of a 3D model using only a target text prompt.
We rely on a pre-trained CLIP model that compares the input text prompt with differentiably rendered images of our 3D model.
arXiv Detail & Related papers (2022-03-24T20:36:28Z)
- Text and Image Guided 3D Avatar Generation and Manipulation [0.0]
We propose a novel 3D manipulation method that can manipulate both the shape and texture of the model using text- or image-based prompts such as 'a young face' or 'a surprised face' (a sketch of the image-prompt variant follows this entry).
Our method requires only 5 minutes per manipulation, and we demonstrate the effectiveness of our approach with extensive results and comparisons.
arXiv Detail & Related papers (2022-02-12T14:37:29Z)
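Since CLIP embeds images and text in the same space, the image-based prompts mentioned in the entry above can reuse the same similarity loss as a text prompt. A minimal, hypothetical sketch (the file path is a placeholder):

```python
# Swapping an image prompt for a text prompt (illustrative sketch).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

target = preprocess(Image.open("young_face.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    target_emb = model.encode_image(target)
    target_emb = target_emb / target_emb.norm(dim=-1, keepdim=True)
# target_emb can now stand in for the text embedding in a CLIP similarity
# loss like the one sketched after the ClipFace abstract above.
```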