Human Image Generation: A Comprehensive Survey
- URL: http://arxiv.org/abs/2212.08896v3
- Date: Fri, 24 May 2024 03:33:47 GMT
- Title: Human Image Generation: A Comprehensive Survey
- Authors: Zhen Jia, Zhang Zhang, Liang Wang, Tieniu Tan
- Abstract summary: In this paper, we divide human image generation techniques into three paradigms, i.e., data-driven methods, knowledge-guided methods and hybrid methods.
The advantages and characteristics of different methods are summarized in terms of model architectures.
Given their wide application potential, the typical downstream uses of synthesized human images are also covered.
- Score: 44.204029557298476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image and video synthesis has become a blooming topic in the computer vision and machine learning communities along with the development of deep generative models, due to its great academic and application value. Because humans are among the most commonly seen object categories in daily life, many researchers have devoted themselves to synthesizing high-fidelity human images, with a large number of studies based on various models, task settings and applications. Thus, it is necessary to give a comprehensive overview of the various methods for human image generation. In this paper, we divide human image generation techniques into three paradigms, i.e., data-driven methods, knowledge-guided methods and hybrid methods. For each paradigm, the most representative models and the corresponding variants are presented, where the advantages and characteristics of different methods are summarized in terms of model architectures. Besides, the main public human image datasets and evaluation metrics in the literature are summarized. Furthermore, given their wide application potential, the typical downstream uses of synthesized human images are covered. Finally, the challenges and potential opportunities of human image generation are discussed to shed light on future research.
Related papers
- Single Image, Any Face: Generalisable 3D Face Generation [59.9369171926757]
We propose a novel model, Gen3D-Face, which generates 3D human faces with unconstrained single image input.
To the best of our knowledge, this is the first attempt and benchmark for creating photorealistic 3D human face avatars from single images.
arXiv Detail & Related papers (2024-09-25T14:56:37Z)
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors [47.62426718293504]
HumanSplat predicts the 3D Gaussian Splatting properties of any human from a single input image.
HumanSplat surpasses existing state-of-the-art methods in achieving photorealistic novel-view synthesis.
arXiv Detail & Related papers (2024-06-18T10:05:33Z)
- Multi Positive Contrastive Learning with Pose-Consistent Generated Images [0.873811641236639]
We propose the generation of visually distinct images with identical human poses.
We then propose a novel multi-positive contrastive learning approach, which optimally utilizes the previously generated images.
Despite using less than 1% of the data used by the current state-of-the-art method, GenPoCCL captures structural features of the human body more effectively.
arXiv Detail & Related papers (2024-04-04T07:26:26Z)
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
- HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion [114.15397904945185]
We propose a unified framework, HyperHuman, that generates in-the-wild human images of high realism and diverse layouts.
Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network.
Our framework yields state-of-the-art performance, generating hyper-realistic human images under diverse scenarios.
arXiv Detail & Related papers (2023-10-12T17:59:34Z)
- Limitations of Face Image Generation [12.11955119100926]
We study the efficacy and shortcomings of generative models in the context of face generation.
We identify several limitations of face image generation that include faithfulness to the text prompt, demographic disparities, and distributional shifts.
We present an analytical model that provides insights into how training data selection contributes to the performance of generative models.
arXiv Detail & Related papers (2023-09-13T19:33:26Z)
- Image Synthesis with Adversarial Networks: a Comprehensive Survey and Case Studies [41.00383742615389]
Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing.
GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples.
Given the rapid development of GANs, in this survey we provide a comprehensive review of adversarial models for image synthesis.
arXiv Detail & Related papers (2020-12-26T13:30:42Z)