ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars
- URL: http://arxiv.org/abs/2505.10072v2
- Date: Thu, 24 Jul 2025 11:15:16 GMT
- Title: ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars
- Authors: Rui-Yang Ju, Sheng-Yen Huang, Yi-Ping Hung
- Abstract summary: Toonify, a StyleGAN-based method, has become widely used for facial image stylization. We propose an efficient two-stage framework, ToonifyGB, to extend Toonify for diverse stylized 3D head avatars.
- Score: 0.916825397273032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The introduction of 3D Gaussian blendshapes has enabled the real-time reconstruction of animatable head avatars from monocular video. Toonify, a StyleGAN-based method, has become widely used for facial image stylization. To extend Toonify for synthesizing diverse stylized 3D head avatars using Gaussian blendshapes, we propose an efficient two-stage framework, ToonifyGB. In Stage 1 (stylized video generation), we adopt an improved StyleGAN to generate the stylized video from the input video frames, which overcomes the limitation of cropping aligned faces at a fixed resolution as preprocessing for normal StyleGAN. This process provides a more stable stylized video, which enables Gaussian blendshapes to better capture the high-frequency details of the video frames, facilitating the synthesis of high-quality animations in the next stage. In Stage 2 (Gaussian blendshapes synthesis), our method learns a stylized neutral head model and a set of expression blendshapes from the generated stylized video. By combining the neutral head model with expression blendshapes, ToonifyGB can efficiently render stylized avatars with arbitrary expressions. We validate the effectiveness of ToonifyGB on benchmark datasets using two representative styles: Arcane and Pixar.
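The core operation in Stage 2 is the standard linear blendshape model: an avatar for an arbitrary expression is obtained by adding expression-weighted offsets to the neutral head model. A minimal sketch of that combination (hypothetical names and shapes; the paper's actual per-Gaussian parameterization, including positions, rotations, scales, opacities, and appearance coefficients, is not specified here):

```python
import numpy as np

def blend_avatar(neutral, blendshapes, weights):
    """Linear blendshape combination: neutral + sum_i w_i * (B_i - neutral).

    neutral:     (N, D) per-Gaussian parameters of the neutral head model
    blendshapes: (K, N, D) one full parameter set per expression basis
    weights:     (K,) expression coefficients (e.g., from a face tracker)
    """
    offsets = blendshapes - neutral[None, :, :]       # per-expression deltas
    return neutral + np.tensordot(weights, offsets, axes=1)

# Toy example: 4 Gaussians with 3 parameters each, 2 expression bases
neutral = np.zeros((4, 3))
blendshapes = np.stack([np.ones((4, 3)), 2.0 * np.ones((4, 3))])
avatar = blend_avatar(neutral, blendshapes, np.array([0.5, 0.25]))
```

Because the combination is a single weighted sum over precomputed offsets, new expressions can be rendered in real time once the neutral model and blendshapes have been learned.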
Related papers
- CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting [0.42881773214459123]
We introduce CLIPGaussians, the first unified style transfer framework that supports text- and image-guided stylization across multiple modalities. Our method operates directly on Gaussian primitives and integrates into existing GS pipelines as a plug-in module. We demonstrate superior style fidelity and consistency across all tasks, validating CLIPGaussians as a universal and efficient solution for multimodal style transfer.
arXiv Detail & Related papers (2025-05-28T20:41:24Z) - Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance [69.9745497000557]
We introduce Arc2Avatar, the first SDS-based method utilizing a human face foundation model as guidance with just a single image as input. Our avatars maintain a dense correspondence with a human face mesh template, allowing blendshape-based expression generation.
arXiv Detail & Related papers (2025-01-09T17:04:33Z) - Generating Editable Head Avatars with 3D Gaussian GANs [57.51487984425395]
Traditional 3D-aware generative adversarial networks (GANs) achieve photorealistic and view-consistent 3D head synthesis. We propose a novel approach that enhances the editability and animation control of 3D head avatars by incorporating 3D Gaussian Splatting (3DGS) as an explicit 3D representation. Our approach delivers high-quality 3D-aware synthesis with state-of-the-art controllability.
arXiv Detail & Related papers (2024-12-26T10:10:03Z) - G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z) - 3D Gaussian Blendshapes for Head Avatar Animation [31.488663463060416]
We introduce 3D Gaussian blendshapes for modeling photorealistic head avatars.
The avatar model of an arbitrary expression can be effectively generated by combining the neutral model and expression blendshapes.
High-fidelity head avatar animations can be synthesized in real time using Gaussian splatting.
arXiv Detail & Related papers (2024-04-30T09:45:41Z) - Hybrid Explicit Representation for Ultra-Realistic Head Avatars [55.829497543262214]
We introduce a novel approach to creating ultra-realistic head avatars and rendering them in real-time. A UV-mapped 3D mesh is utilized to capture sharp and rich textures on smooth surfaces, while 3D Gaussian Splatting is employed to represent complex geometric structures. Experiments show that our modeled results exceed those of state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-18T04:01:26Z) - StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting [141.05924680451804]
StyleGaussian is a novel 3D style transfer technique.
It allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps).
arXiv Detail & Related papers (2024-03-12T16:44:52Z) - SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting [26.849406891462557]
We present SplattingAvatar, a hybrid 3D representation of human avatars with Gaussian Splatting embedded on a triangle mesh.
SplattingAvatar renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.
arXiv Detail & Related papers (2024-03-08T06:28:09Z) - GaussianStyle: Gaussian Head Avatar via StyleGAN [64.85782838199427]
We propose a novel framework that integrates the volumetric strengths of 3DGS with the powerful implicit representation of StyleGAN.
We show that our method achieves state-of-the-art performance in reenactment, novel view synthesis, and animation.
arXiv Detail & Related papers (2024-02-01T18:14:42Z) - Learning Naturally Aggregated Appearance for Efficient 3D Editing [90.57414218888536]
We learn the color field as an explicit 2D appearance aggregation, also called canonical image. We complement the canonical image with a projection field that maps 3D points onto 2D pixels for texture query. Our approach demonstrates remarkable efficiency by being at least 20 times faster per edit compared to existing NeRF-based editing methods.
arXiv Detail & Related papers (2023-12-11T18:59:31Z) - GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
We present an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.
GaussianAvatar is validated on both the public dataset and our collected dataset.
arXiv Detail & Related papers (2023-12-04T18:55:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.