Towards Better Adversarial Synthesis of Human Images from Text
- URL: http://arxiv.org/abs/2107.01869v1
- Date: Mon, 5 Jul 2021 08:47:51 GMT
- Title: Towards Better Adversarial Synthesis of Human Images from Text
- Authors: Rania Briq, Pratika Kochar, Juergen Gall
- Abstract summary: The model's performance is evaluated on the COCO dataset.
We show how using such a shape as input to image synthesis frameworks helps constrain the network to synthesize humans with realistic body shapes.
- Score: 19.743502366461982
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes an approach that generates multiple 3D human meshes from text. The human shapes are represented by 3D meshes based on the SMPL model. The model's performance is evaluated on the COCO dataset, which contains challenging human shapes and intricate interactions between individuals. The model is able to capture the dynamics of the scene and the interactions between individuals based on text. We further show how using such a shape as input to image synthesis frameworks helps constrain the network to synthesize humans with realistic body shapes.
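The SMPL model parameterizes a human body mesh with a small set of shape and pose coefficients, so a text-conditioned generator only needs to predict those coefficients rather than raw geometry. Below is a minimal sketch of the parameter-to-mesh step, assuming the `smplx` Python package and separately downloaded SMPL model files under ./models; the text-conditioned generator that would predict these parameters is the paper's contribution and is not reproduced here.

```python
# Minimal sketch: decoding SMPL parameters into a 3D human mesh.
# Assumes `pip install smplx torch` and SMPL model files placed under
# ./models (downloaded separately from the SMPL website).
import torch
import smplx

model = smplx.create(model_path="models", model_type="smpl", gender="neutral")

betas = torch.zeros(1, 10)         # shape coefficients (body identity)
body_pose = torch.zeros(1, 69)     # 23 body joints x 3 axis-angle values
global_orient = torch.zeros(1, 3)  # root orientation of the body

output = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
vertices = output.vertices         # (1, 6890, 3) mesh vertices
faces = model.faces                # fixed SMPL triangle topology
print(vertices.shape)
```

In a pipeline like the one described, a generator would predict one such (betas, pose) set per person in the text prompt, and the resulting meshes would then be passed to the image synthesis network as a shape constraint.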
Related papers
- FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z) - DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors [4.697267141773321]
We present DreamHOI, a novel method for zero-shot synthesis of human-object interactions (HOIs).
We leverage text-to-image diffusion models trained on billions of image-caption pairs to generate realistic HOIs.
We validate our approach through extensive experiments, demonstrating its effectiveness in generating realistic HOIs.
arXiv Detail & Related papers (2024-09-12T17:59:49Z) - Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance [25.346255905155424]
We introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework.
By representing the 3D human parametric model as the motion guidance, we can perform parametric shape alignment of the human body between the reference image and the source video motion.
Our approach also exhibits superior generalization capabilities on the proposed in-the-wild dataset.
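A hedged sketch of what such parametric shape alignment can look like with SMPL-style parameters: the shape coefficients come from a fit to the reference image, while the pose sequence comes from the driving video, so the rendered guidance keeps the reference identity. The placeholder tensors below stand in for fitted values, and the `smplx` usage here is an illustration, not Champ's actual code.

```python
# Illustrative sketch of parametric shape alignment (placeholder values).
import torch
import smplx

model = smplx.create(model_path="models", model_type="smpl")

betas_ref = torch.randn(1, 10) * 0.1  # placeholder: betas fitted to the reference image
pose_seq = torch.zeros(16, 69)        # placeholder: poses estimated from 16 video frames
orient_seq = torch.zeros(16, 3)       # placeholder: per-frame root orientation

out = model(
    betas=betas_ref.expand(16, -1),   # reference body shape...
    body_pose=pose_seq,               # ...driven by the source-video motion
    global_orient=orient_seq,
)
# out.vertices (16, 6890, 3) can be rasterized into depth/normal/semantic
# maps that serve as motion guidance for the latent diffusion model.
```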
arXiv Detail & Related papers (2024-03-21T18:52:58Z) - 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models [52.96248836582542]
We propose an effective approach based on recent diffusion models, termed HumanWild, which can effortlessly generate human images and corresponding 3D mesh annotations.
By exclusively employing generative models, we generate large-scale in-the-wild human images and high-quality annotations, eliminating the need for real-world data collection.
arXiv Detail & Related papers (2024-03-17T06:31:16Z) - Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human is an explicit model for realistic dynamic human avatars that requires significantly fewer training views and images.
Our avatars are learned without additional annotations such as Splat masks, can be trained against variable backgrounds, and infer full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z) - Learning Dense Correspondence from Synthetic Environments [27.841736037738286]
Existing methods map manually labelled human pixels in real 2D images onto the 3D surface, which is prone to human error.
We propose to solve the problem of data scarcity by training 2D-3D human mapping algorithms using automatically generated synthetic data.
arXiv Detail & Related papers (2022-03-24T08:13:26Z) - LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
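As a generic illustration (not the authors' exact architecture), a shape-and-pose disentangled implicit body model can be written as an MLP that maps a 3D query point plus two separate latent codes to a signed distance; because the whole function is differentiable, either latent can be optimized independently against raw, non-watertight scans.

```python
# Generic sketch of a disentangled neural implicit body model.
# The architecture and latent sizes are assumptions for illustration.
import torch
import torch.nn as nn

class DisentangledSDF(nn.Module):
    def __init__(self, shape_dim=64, pose_dim=64, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + shape_dim + pose_dim, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),  # signed distance to the body surface
        )

    def forward(self, x, z_shape, z_pose):
        # x: (N, 3) query points; the latents are broadcast to every point.
        z = torch.cat([z_shape, z_pose], dim=-1).expand(x.shape[0], -1)
        return self.net(torch.cat([x, z], dim=-1))

sdf = DisentangledSDF()
points = torch.rand(1024, 3) * 2 - 1  # query points in a [-1, 1]^3 box
dist = sdf(points, torch.randn(1, 64), torch.randn(1, 64))
print(dist.shape)  # torch.Size([1024, 1])
```

Keeping the shape and pose codes separate is what allows repose-without-reshape edits: one code can be frozen while the other is fine-tuned on new observations.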
arXiv Detail & Related papers (2021-11-30T04:10:57Z) - Creating and Reenacting Controllable 3D Humans with Differentiable Rendering [3.079885946230076]
This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors.
Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure.
By taking advantage of both differentiable rendering and the 3D parametric model, our method is fully controllable.
arXiv Detail & Related papers (2021-10-22T12:40:09Z) - Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z) - Learning Transferable Kinematic Dictionary for 3D Human Pose and Shape Reconstruction [15.586347115568973]
We propose a kinematic dictionary, which explicitly regularizes the solution space of relative 3D rotations of human joints.
Our method achieves end-to-end 3D reconstruction without the need for shape annotations during network training.
The proposed method achieves competitive results on large-scale datasets including Human3.6M, MPI-INF-3DHP, and LSP.
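One way to picture such a dictionary, with the formulation below being an assumption rather than the paper's exact method: each joint's relative rotation is restricted to a convex combination of learned dictionary atoms, so unconstrained network outputs are mapped into a plausible subspace of joint rotations.

```python
# Illustrative sketch of a kinematic-dictionary constraint (assumed form).
import torch

K, J = 32, 23                      # atoms per joint, SMPL-style body joints
dictionary = torch.randn(J, K, 3)  # learned axis-angle atoms, one bank per joint

def constrained_pose(logits):
    # logits: (J, K) unconstrained scores predicted by a network.
    w = torch.softmax(logits, dim=-1)                 # convex weights per joint
    return torch.einsum("jk,jkd->jd", w, dictionary)  # (J, 3) axis-angle pose

pose = constrained_pose(torch.randn(J, K))
print(pose.shape)  # torch.Size([23, 3])
```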
arXiv Detail & Related papers (2021-04-02T09:24:29Z) - S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling [103.65625425020129]
We represent the pedestrian's shape, pose and skinning weights as neural implicit functions that are directly learned from data.
We demonstrate the effectiveness of our approach on various datasets and show that our reconstructions outperform existing state-of-the-art methods.
arXiv Detail & Related papers (2021-01-17T02:16:56Z)