Related papers: Disentangled Clothed Avatar Generation with Layered Representation

Disentangled Clothed Avatar Generation with Layered Representation

URL: http://arxiv.org/abs/2501.04631v1
Date: Wed, 08 Jan 2025 17:27:27 GMT
Title: Disentangled Clothed Avatar Generation with Layered Representation
Authors: Weitian Zhang, Sijing Wu, Manwen Liao, Yichao Yan,
Abstract summary: Clothed avatar generation has wide applications in virtual and augmented reality, filmmaking, and more.<n>Previous methods have achieved success in generating diverse digital avatars, however, generating avatars with disentangled components has long been a challenge.<n>We propose LayerAvatar, the first feed-forward diffusion-based method for generating component-disentangled clothed avatars.
Score: 5.775559930050691
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Clothed avatar generation has wide applications in virtual and augmented reality, filmmaking, and more. Previous methods have achieved success in generating diverse digital avatars, however, generating avatars with disentangled components (\eg, body, hair, and clothes) has long been a challenge. In this paper, we propose LayerAvatar, the first feed-forward diffusion-based method for generating component-disentangled clothed avatars. To achieve this, we first propose a layered UV feature plane representation, where components are distributed in different layers of the Gaussian-based UV feature plane with corresponding semantic labels. This representation supports high-resolution and real-time rendering, as well as expressive animation including controllable gestures and facial expressions. Based on the well-designed representation, we train a single-stage diffusion model and introduce constrain terms to address the severe occlusion problem of the innermost human body layer. Extensive experiments demonstrate the impressive performances of our method in generating disentangled clothed avatars, and we further explore its applications in component transfer. The project page is available at: https://olivia23333.github.io/LayerAvatar/

Related papers

EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework.<n>EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures.<n>This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z)
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images [74.86864398919467]
We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images. We learn a universal prior from over a thousand clothed humans to achieve instant feedforward generation and zero-shot generalization. Our method generates more authentic reconstruction and animation than state-of-the-arts, and can be directly generalized to inputs from casually taken phone photos.
arXiv Detail & Related papers (2025-03-24T23:20:47Z)
Multimodal Generation of Animatable 3D Human Models with AvatarForge [67.31920821192323]
AvatarForge is a framework for generating animatable 3D human avatars from text or image inputs using AI-driven procedural generation. Our evaluations show that AvatarForge outperforms state-of-the-art methods in both text- and image-to-avatar generation.
arXiv Detail & Related papers (2025-03-11T08:29:18Z)
Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment. We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images. We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
DivAvatar: Diverse 3D Avatar Generation with a Single Prompt [95.9978722953278]
DivAvatar is a framework that generates diverse avatars from a single text prompt. It has two key designs that help achieve generation diversity and visual quality. Extensive experiments show that DivAvatar is highly versatile in generating avatars of diverse appearances.
arXiv Detail & Related papers (2024-02-27T08:10:31Z)
AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text [71.09533176800707]
AvatarStudio is a coarse-to-fine generative model that generates explicit textured 3D meshes for animatable human avatars. By effectively leveraging the synergy between the articulated mesh representation and the DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars.
arXiv Detail & Related papers (2023-11-29T18:59:32Z)
MagicAvatar: Multimodal Avatar Generation and Animation [70.55750617502696]
MagicAvatar is a framework for multimodal video generation and animation of human avatars. It disentangles avatar video generation into two stages: multimodal-to-motion and motion-to-video generation. We demonstrate the flexibility of MagicAvatar through various applications, including text-guided and video-guided avatar generation.
arXiv Detail & Related papers (2023-08-28T17:56:18Z)
AvatarFusion: Zero-shot Generation of Clothing-Decoupled 3D Avatars Using 2D Diffusion [34.609403685504944]
We present AvatarFusion, a framework for zero-shot text-to-avatar generation. We use a latent diffusion model to provide pixel-level guidance for generating human-realistic avatars. We also introduce a novel optimization method, called Pixel-Semantics Difference-Sampling (PS-DS), which semantically separates the generation of body and clothes.
arXiv Detail & Related papers (2023-07-13T02:19:56Z)
AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images. Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models. We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
PointAvatar: Deformable Point-based Head Avatars from Videos [103.43941945044294]
PointAvatar is a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading. We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources.
arXiv Detail & Related papers (2022-12-16T10:05:31Z)
Explicit Clothing Modeling for an Animatable Full-Body Avatar [21.451440299450592]
We build an animatable clothed body avatar with an explicit representation of the clothing on the upper body from multi-view captured videos. To learn the interaction between the body dynamics and clothing states, we use a temporal convolution network to predict the clothing latent code. We show photorealistic animation output for three different actors, and demonstrate the advantage of our clothed-body avatars over single-layer avatars.
arXiv Detail & Related papers (2021-06-28T17:58:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.