InfiniHuman: Infinite 3D Human Creation with Precise Control
- URL: http://arxiv.org/abs/2510.11650v1
- Date: Mon, 13 Oct 2025 17:29:55 GMT
- Title: InfiniHuman: Infinite 3D Human Creation with Precise Control
- Authors: Yuxuan Xue, Xianghui Xie, Margaret Kostyrko, Gerard Pons-Moll,
- Abstract summary: InfiniHuman is a framework that distills existing foundation models to produce richly annotated human data.<n>InfiniHumanData contains 111K identities spanning unprecedented diversity.<n>InfiniHumanGen enables fast, realistic, and precisely controllable avatar generation.
- Score: 25.397126937363012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating realistic and controllable 3D human avatars is a long-standing challenge, particularly when covering broad attribute ranges such as ethnicity, age, clothing styles, and detailed body shapes. Capturing and annotating large-scale human datasets for training generative models is prohibitively expensive and limited in scale and diversity. The central question we address in this paper is: Can existing foundation models be distilled to generate theoretically unbounded, richly annotated 3D human data? We introduce InfiniHuman, a framework that synergistically distills these models to produce richly annotated human data at minimal cost and with theoretically unlimited scalability. We propose InfiniHumanData, a fully automatic pipeline that leverages vision-language and image generation models to create a large-scale multi-modal dataset. User study shows our automatically generated identities are undistinguishable from scan renderings. InfiniHumanData contains 111K identities spanning unprecedented diversity. Each identity is annotated with multi-granularity text descriptions, multi-view RGB images, detailed clothing images, and SMPL body-shape parameters. Building on this dataset, we propose InfiniHumanGen, a diffusion-based generative pipeline conditioned on text, body shape, and clothing assets. InfiniHumanGen enables fast, realistic, and precisely controllable avatar generation. Extensive experiments demonstrate significant improvements over state-of-the-art methods in visual quality, generation speed, and controllability. Our approach enables high-quality avatar generation with fine-grained control at effectively unbounded scale through a practical and affordable solution. We will publicly release the automatic data generation pipeline, the comprehensive InfiniHumanData dataset, and the InfiniHumanGen models at https://yuxuan-xue.com/infini-human.
Related papers
- HuGeDiff: 3D Human Generation via Diffusion with Gaussian Splatting [33.9893684177763]
Current methods struggle with fine detail, accurate rendering of hands and faces, human realism, and controlability over appearance.<n>We present a weakly supervised pipeline that tries to address these challenges.<n>We demonstrate orders-of-magnitude speed-ups in 3D human generation compared to the state-of-the-art approaches.
arXiv Detail & Related papers (2025-06-04T18:11:23Z) - MVHumanNet++: A Large-scale Dataset of Multi-view Daily Dressing Human Captures with Richer Annotations for 3D Human Digitization [36.46025784260418]
We present MVHumanNet++, a dataset that comprises multi-view human action sequences of 4,500 human identities.<n>Our dataset contains 9,000 daily outfits, 60,000 motion sequences and 645 million frames with extensive annotations.
arXiv Detail & Related papers (2025-05-03T15:02:34Z) - SIGMAN:Scaling 3D Human Gaussian Generation with Millions of Assets [72.26350984924129]
We propose a latent space generation paradigm for 3D human digitization.<n>We transform the ill-posed low-to-high-dimensional mapping problem into a learnable distribution shift.<n>We employ the multi-view optimization approach combined with synthetic data to construct the HGS-1M dataset.
arXiv Detail & Related papers (2025-04-09T15:38:18Z) - Multimodal Generation of Animatable 3D Human Models with AvatarForge [67.31920821192323]
AvatarForge is a framework for generating animatable 3D human avatars from text or image inputs using AI-driven procedural generation.<n>Our evaluations show that AvatarForge outperforms state-of-the-art methods in both text- and image-to-avatar generation.
arXiv Detail & Related papers (2025-03-11T08:29:18Z) - HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, increasing performance by more than 15% PSNR.
arXiv Detail & Related papers (2024-05-30T05:43:09Z) - 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models [52.96248836582542]
We propose an effective approach based on recent diffusion models, termed HumanWild, which can effortlessly generate human images and corresponding 3D mesh annotations.
By exclusively employing generative models, we generate large-scale in-the-wild human images and high-quality annotations, eliminating the need for real-world data collection.
arXiv Detail & Related papers (2024-03-17T06:31:16Z) - CapHuman: Capture Your Moments in Parallel Universes [60.06408546134581]
We present a new framework named CapHuman.
CapHuman encodes identity features and then learns to align them into the latent space.
We introduce the 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner.
arXiv Detail & Related papers (2024-02-01T14:41:59Z) - MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human
Captures [44.172804112944625]
We present MVHumanNet, a dataset that comprises multi-view human action sequences of 4,500 human identities.
Our dataset contains 9,000 daily outfits, 60,000 motion sequences and 645 million extensive annotations, including human masks, camera parameters, 2D and 3D keypoints, SMPL/SMPLX parameters, and corresponding textual descriptions.
arXiv Detail & Related papers (2023-12-05T18:50:12Z) - gDNA: Towards Generative Detailed Neural Avatars [94.9804106939663]
We show that our model is able to generate natural human avatars wearing diverse and detailed clothing.
Our method can be used on the task of fitting human models to raw scans, outperforming the previous state-of-the-art.
arXiv Detail & Related papers (2022-01-11T18:46:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.