StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation
- URL: http://arxiv.org/abs/2508.11203v1
- Date: Fri, 15 Aug 2025 04:29:46 GMT
- Authors: Seungmi Lee, Kwan Yun, Junyong Noh
- Abstract summary: StyleMM is a framework that can construct a stylized 3D Morphable Model (3DMM) based on user-defined text descriptions. The approach fine-tunes a mesh deformation network and a texture generator using stylized facial images generated via text-guided image-to-image (i2i) translation. It outperforms state-of-the-art methods in terms of identity-level facial diversity and stylization capability.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce StyleMM, a novel framework that can construct a stylized 3D Morphable Model (3DMM) based on user-defined text descriptions specifying a target style. Building upon a pre-trained mesh deformation network and a texture generator for original 3DMM-based realistic human faces, our approach fine-tunes these models using stylized facial images generated via text-guided image-to-image (i2i) translation with a diffusion model, which serve as stylization targets for the rendered mesh. To prevent undesired changes in identity, facial alignment, or expressions during i2i translation, we introduce a stylization method that explicitly preserves the facial attributes of the source image. By maintaining these critical attributes during image stylization, the proposed approach ensures consistent 3D style transfer across the 3DMM parameter space through image-based training. Once trained, StyleMM enables feed-forward generation of stylized face meshes with explicit control over shape, expression, and texture parameters, producing meshes with consistent vertex connectivity and animatability. Quantitative and qualitative evaluations demonstrate that our approach outperforms state-of-the-art methods in terms of identity-level facial diversity and stylization capability. The code and videos are available at [kwanyun.github.io/stylemm_page](kwanyun.github.io/stylemm_page).
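The abstract's core training idea, using stylized i2i translations of the model's own renders as fixed targets for fine-tuning the texture/deformation parameters, can be caricatured in a tiny numerical sketch. Everything here is a stand-in: the linear `render`, the affine `stylize`, and the gradient step are illustrative toys, not the authors' renderer, diffusion model, or losses.

```python
import numpy as np

def render(shape, texture):
    # Stand-in for a differentiable renderer: maps 3DMM-style
    # parameters to an "image" (here, just their sum).
    return shape + texture

def stylize(image):
    # Stand-in for attribute-preserving i2i translation: keeps the
    # facial layout (the values) but shifts appearance toward a style.
    return 0.5 * image + 0.3

def fine_tune(shape, texture, lr=0.2, steps=100):
    # Freeze the stylization target computed from the initial render,
    # then update the texture parameters so new renders reproduce it,
    # mimicking image-based training against i2i outputs.
    target = stylize(render(shape, texture))
    for _ in range(steps):
        residual = render(shape, texture) - target
        texture = texture - lr * 2.0 * residual  # grad of ||residual||^2
    return texture

shape = np.array([0.1, -0.2, 0.4])
texture = np.zeros(3)
new_texture = fine_tune(shape, texture)
```

After fine-tuning, rendering the same shape with the updated texture reproduces the stylized target, which is the sense in which image-space supervision transfers style into the parametric model.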
Related papers
- StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance [50.207322685527394]
StyleSculptor is a training-free approach for generating style-guided 3D assets from a content image and one or more style images. It achieves style-guided 3D generation in a zero-shot manner, enabling fine-grained 3D style control. In experiments, StyleSculptor outperforms existing baseline methods in producing high-fidelity 3D assets.
arXiv Detail & Related papers (2025-09-16T17:55:20Z) - Improved 3D Scene Stylization via Text-Guided Generative Image Editing with Region-Based Control [47.14550252881733]
We introduce techniques that enhance the quality of 3D stylization while maintaining view consistency and providing optional region-controlled style transfer. Our method achieves stylization by re-training an initial 3D representation using stylized multi-view 2D images of the source views. We propose Multi-Region Importance-Weighted Sliced Wasserstein Distance Loss, allowing styles to be applied to distinct image regions using segmentation masks from off-the-shelf models.
arXiv Detail & Related papers (2025-09-04T15:01:01Z) - StyleTex: Style Image-Guided Texture Generation for 3D Models [8.764938886974482]
Style-guided texture generation aims to generate a texture that is harmonious with both the style of the reference image and the geometry of the input mesh.
We introduce StyleTex, an innovative diffusion-model-based framework for creating stylized textures for 3D models.
arXiv Detail & Related papers (2024-11-01T06:57:04Z) - Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation [14.079043195485601]
We present a method to generate 3D objects in a desired style. Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model.
arXiv Detail & Related papers (2024-06-05T16:27:34Z) - LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example [5.999050119438177]
We propose a method that can produce a highly stylized 3D face model with a desired topology.
Our method trains a surface deformation network with a 3DMM and translates its domain to the target style using differentiable mesh rendering and directional CLIP losses, so that the stylized 3D face mesh mimics the style of the target.
arXiv Detail & Related papers (2024-03-22T14:20:54Z) - ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z) - 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization poses a fundamental challenge to the multimedia and graphics community.
We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z) - Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z) - DreamStone: Image as Stepping Stone for Text-Guided 3D Shape Generation [105.97545053660619]
We present a new text-guided 3D shape generation approach DreamStone.
It uses images as a stepping stone to bridge the gap between text and shape modalities for generating 3D shapes without requiring paired text and 3D data.
Our approach is generic, flexible, and scalable, and it can be easily integrated with various SVR models to expand the generative space and improve the generative fidelity.
arXiv Detail & Related papers (2023-03-24T03:56:23Z) - ClipFace: Text-guided Editing of Textured 3D Morphable Models [33.83015491013442]
We propose ClipFace, a novel self-supervised approach for text-guided editing of textured 3D morphable models of faces.
We employ user-friendly language prompts to enable control of the expressions as well as appearance of 3D faces.
Our model is trained in a self-supervised fashion by exploiting differentiable rendering and losses based on a pre-trained CLIP model.
arXiv Detail & Related papers (2022-12-02T19:01:08Z) - Exemplar-Based 3D Portrait Stylization [23.585334925548064]
We present the first framework for one-shot 3D portrait style transfer.
It can generate 3D face models with both the geometry exaggerated and the texture stylized.
Our method achieves robustly good results on different artistic styles and outperforms existing methods.
arXiv Detail & Related papers (2021-04-29T17:59:54Z) - StyleRig: Rigging StyleGAN for 3D Control over Portrait Images [81.43265493604302]
StyleGAN generates portrait images of faces with eyes, teeth, hair, and context (neck, shoulders, background).
StyleGAN lacks a rig-like control over semantic face parameters that are interpretable in 3D, such as face pose, expressions, and scene illumination.
We present the first method to provide a face rig-like control over a pretrained and fixed StyleGAN via a 3DMM.
arXiv Detail & Related papers (2020-03-31T21:20:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.