MagicAvatar: Multimodal Avatar Generation and Animation
- URL: http://arxiv.org/abs/2308.14748v1
- Date: Mon, 28 Aug 2023 17:56:18 GMT
- Title: MagicAvatar: Multimodal Avatar Generation and Animation
- Authors: Jianfeng Zhang, Hanshu Yan, Zhongcong Xu, Jiashi Feng, and Jun Hao Liew
- Abstract summary: MagicAvatar is a framework for multimodal video generation and animation of human avatars.
It disentangles avatar video generation into two stages: multimodal-to-motion and motion-to-video generation.
We demonstrate the flexibility of MagicAvatar through various applications, including text-guided and video-guided avatar generation.
- Score: 70.55750617502696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This report presents MagicAvatar, a framework for multimodal video generation
and animation of human avatars. Unlike most existing methods that generate
avatar-centric videos directly from multimodal inputs (e.g., text prompts),
MagicAvatar explicitly disentangles avatar video generation into two stages:
(1) multimodal-to-motion and (2) motion-to-video generation. The first stage
translates the multimodal inputs into motion/control signals (e.g., human
pose, depth, DensePose); while the second stage generates avatar-centric video
guided by these motion signals. Additionally, MagicAvatar supports avatar
animation by simply providing a few images of the target person. This
capability enables the animation of the provided human identity according to
the specific motion derived from the first stage. We demonstrate the
flexibility of MagicAvatar through various applications, including text-guided
and video-guided avatar generation, as well as multimodal avatar animation.
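
The two-stage decomposition can be made concrete with a short sketch. Everything below is illustrative: the class names (MultimodalToMotion, MotionToVideo), the MotionSignals container, and the stub outputs are assumptions for exposition, not the authors' actual interface.

```python
# A minimal, hypothetical sketch of the two-stage pipeline described above.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class MotionSignals:
    """Per-frame control signals produced by stage 1 (assumed layout)."""
    poses: List[str]      # e.g., human pose maps
    depth: List[str]      # per-frame depth maps
    densepose: List[str]  # DensePose IUV maps

class MultimodalToMotion:
    """Stage 1: translate a multimodal input (text prompt or source video)
    into motion/control signals."""
    def __call__(self, text: Optional[str] = None,
                 video: Optional[str] = None) -> MotionSignals:
        # A real implementation would run a multimodal-to-motion model here;
        # these stubs stand in for its per-frame outputs.
        n_frames = 16
        return MotionSignals(
            poses=[f"pose_{i}" for i in range(n_frames)],
            depth=[f"depth_{i}" for i in range(n_frames)],
            densepose=[f"iuv_{i}" for i in range(n_frames)],
        )

class MotionToVideo:
    """Stage 2: synthesize an avatar-centric video guided by the motion
    signals, optionally personalized with a few images of a target person."""
    def __call__(self, motion: MotionSignals,
                 identity_images: Optional[List[str]] = None) -> List[str]:
        subject = "generic avatar" if identity_images is None else "target person"
        return [f"frame_{i}: {subject} at {p}" for i, p in enumerate(motion.poses)]

# Text-guided generation, then animation of a provided identity.
stage1, stage2 = MultimodalToMotion(), MotionToVideo()
motion = stage1(text="a person dancing on the beach")
generated = stage2(motion)                               # avatar generation
animated = stage2(motion, identity_images=["id_0.png"])  # avatar animation
```

The point of the split is that stage 2 sees only motion signals, so text-guided and video-guided generation differ only in how stage 1 is conditioned, while personalizing to a given identity touches only stage 2.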
Related papers
- EgoAvatar: Egocentric View-Driven and Photorealistic Full-body Avatars [56.56236652774294]
We propose a person-specific egocentric telepresence approach, which jointly models the photoreal digital avatar while also driving it from a single egocentric video.
Our experiments demonstrate a clear step towards egocentric and photoreal telepresence as our method outperforms baselines as well as competing methods.
arXiv Detail & Related papers (2024-09-22T22:50:27Z)
- InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation [39.235962838952624]
In this paper, we propose a novel text-guided approach for generating emotionally expressive 2D avatars.
Our framework, named InstructAvatar, leverages a natural language interface to control the emotion as well as the facial motion of avatars.
Experimental results demonstrate that InstructAvatar produces results that align well with both conditions.
arXiv Detail & Related papers (2024-05-24T17:53:54Z)
- Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion [39.456643736018435]
We propose a novel agent-based approach named Motion Avatar, which allows for the automatic generation of high-quality customizable human and animal avatars.
Second, we introduce an LLM planner that coordinates both motion and avatar generation, turning discriminative planning into a customizable Q&A process.
Finally, we present an animal motion dataset named Zoo-300K, comprising approximately 300,000 text-motion pairs across 65 animal categories.
arXiv Detail & Related papers (2024-05-18T13:21:14Z)
- AniArtAvatar: Animatable 3D Art Avatar from a Single Image [0.0]
We present a novel approach for generating animatable 3D-aware art avatars from a single image.
We use a view-conditioned 2D diffusion model to synthesize multi-view images from a single art portrait with a neutral expression.
For avatar animation, we extract control points, transfer the motion with these points, and deform the implicit canonical space.
arXiv Detail & Related papers (2024-03-26T12:08:04Z)
- DivAvatar: Diverse 3D Avatar Generation with a Single Prompt [95.9978722953278]
DivAvatar is a framework that generates diverse avatars from a single text prompt.
It has two key designs that help achieve generation diversity and visual quality.
Extensive experiments show that DivAvatar is highly versatile in generating avatars of diverse appearances.
arXiv Detail & Related papers (2024-02-27T08:10:31Z)
- AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text [71.09533176800707]
AvatarStudio is a coarse-to-fine generative model that generates explicit textured 3D meshes for animatable human avatars.
By effectively leveraging the synergy between the articulated mesh representation and the DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars.
arXiv Detail & Related papers (2023-11-29T18:59:32Z)
- GAIA: Zero-shot Talking Avatar Generation [64.78978434650416]
We introduce GAIA (Generative AI for Avatar), which eliminates the domain priors in talking avatar generation.
GAIA outperforms previous baseline models in naturalness, diversity, lip-sync quality, and visual quality.
It is general and enables different applications like controllable talking avatar generation and text-instructed avatar generation.
arXiv Detail & Related papers (2023-11-26T08:04:43Z)
- Physics-based Motion Retargeting from Sparse Inputs [73.94570049637717]
Commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data of the user's pose.
We introduce a method to retarget motions in real-time from sparse human sensor data to characters of various morphologies.
We show that the avatar poses often match the user surprisingly well, despite no lower-body sensor information being available.
arXiv Detail & Related papers (2023-07-04T21:57:05Z)
- SwiftAvatar: Efficient Auto-Creation of Parameterized Stylized Character on Arbitrary Avatar Engines [34.645129752596915]
We propose SwiftAvatar, a novel avatar auto-creation framework.
We synthesize as much high-quality data as possible, consisting of avatar vectors and their corresponding realistic faces.
Our experiments demonstrate the effectiveness and efficiency of SwiftAvatar on two different avatar engines.
arXiv Detail & Related papers (2023-01-19T16:14:28Z)