Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in
Challenging Domains
- URL: http://arxiv.org/abs/2303.09541v2
- Date: Sun, 31 Dec 2023 00:17:33 GMT
- Title: Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in
Challenging Domains
- Authors: Zhenzhen Weng, Laura Bravo-Sánchez, Serena Yeung-Levy
- Abstract summary: We propose a text-conditioned method that generates photo-realistic images with plausible posed humans by injecting prior knowledge about human body structure.
Our generated images are accompanied by 3D meshes that serve as ground truths for improving Human Mesh Recovery tasks.
- Score: 2.7624021966289605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent text-to-image generative models have exhibited remarkable abilities in
generating high-fidelity and photo-realistic images. However, despite the
visually impressive results, these models often struggle to preserve plausible
human structure in the generations. For this reason, while generative models
have shown promising results in aiding downstream image recognition tasks by
generating large volumes of synthetic data, they are not suitable for improving
downstream human pose perception and understanding. In this work, we propose a
Diffusion model with Human Pose Correction (Diffusion-HPC), a text-conditioned
method that generates photo-realistic images with plausible posed humans by
injecting prior knowledge about human body structure. Our generated images are
accompanied by 3D meshes that serve as ground truths for improving Human Mesh
Recovery tasks, where a shortage of 3D training data has long been an issue.
Furthermore, we show that Diffusion-HPC effectively improves the realism of
human generations under varying conditioning strategies.
Related papers
- Detecting Human Artifacts from Text-to-Image Models [16.261759535724778]
This dataset contains images of human bodies.
It includes poorly generated human bodies with distorted or missing body parts.
arXiv Detail & Related papers (2024-11-21T05:02:13Z)
- PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion [43.850899288337025]
PSHuman is a novel framework that explicitly reconstructs human meshes utilizing priors from the multiview diffusion model.
It is found that directly applying multiview diffusion on single-view human images leads to severe geometric distortions.
To enhance cross-view body shape consistency of varied human poses, we condition the generative model on parametric models like SMPL-X.
arXiv Detail & Related papers (2024-09-16T10:13:06Z)
- Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback [5.9726297901501475]
We introduce a novel approach tailored specifically for human image generation utilizing Direct Preference Optimization (DPO).
Specifically, we introduce an efficient method for constructing a specialized DPO dataset for training human image generation models without the need for costly human feedback.
Our method demonstrates its versatility and effectiveness in generating human images, including personalized text-to-image generation.
arXiv Detail & Related papers (2024-05-30T16:18:05Z)
- 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models [52.96248836582542]
We propose an effective approach based on recent diffusion models, termed HumanWild, which can effortlessly generate human images and corresponding 3D mesh annotations.
By exclusively employing generative models, we generate large-scale in-the-wild human images and high-quality annotations, eliminating the need for real-world data collection.
arXiv Detail & Related papers (2024-03-17T06:31:16Z)
- InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation [61.62346472443454]
InceptionHuman is a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities to generate photorealistic 3D humans.
InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space.
arXiv Detail & Related papers (2023-11-27T15:49:41Z)
- HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion [114.15397904945185]
We propose a unified framework, HyperHuman, that generates in-the-wild human images of high realism and diverse layouts.
Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network.
Our framework yields state-of-the-art performance, generating hyper-realistic human images under diverse scenarios.
arXiv Detail & Related papers (2023-10-12T17:59:34Z)
- Exploring the Robustness of Human Parsers Towards Common Corruptions [99.89886010550836]
We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
arXiv Detail & Related papers (2023-09-02T13:32:14Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- Brain Imaging Generation with Latent Diffusion Models [2.200720122706913]
In this study, we explore using Latent Diffusion Models to generate synthetic images from high-resolution 3D brain images.
We found that our models created realistic data, and we could use the conditioning variables to control the data generation effectively.
arXiv Detail & Related papers (2022-09-15T09:16:21Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.