StyleGAN-Human: A Data-Centric Odyssey of Human Generation
- URL: http://arxiv.org/abs/2204.11823v1
- Date: Mon, 25 Apr 2022 17:55:08 GMT
- Title: StyleGAN-Human: A Data-Centric Odyssey of Human Generation
- Authors: Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
- Abstract summary: This work takes a data-centric perspective and investigates multiple critical aspects in "data engineering"
We collect and annotate a large-scale human image dataset with over 230K samples capturing diverse poses and textures.
We rigorously investigate three essential factors in data engineering for StyleGAN-based human generation, namely data size, data distribution, and data alignment.
- Score: 96.7080874757475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unconditional human image generation is an important task in vision and
graphics, which enables various applications in the creative industry. Existing
studies in this field mainly focus on "network engineering" such as designing
new components and objective functions. This work takes a data-centric
perspective and investigates multiple critical aspects in "data engineering",
which we believe would complement the current practice. To facilitate a
comprehensive study, we collect and annotate a large-scale human image dataset
with over 230K samples capturing diverse poses and textures. Equipped with this
large dataset, we rigorously investigate three essential factors in data
engineering for StyleGAN-based human generation, namely data size, data
distribution, and data alignment. Extensive experiments reveal several valuable
observations w.r.t. these aspects: 1) Large-scale data, more than 40K images,
are needed to train a high-fidelity unconditional human generation model with
vanilla StyleGAN. 2) A balanced training set helps improve the generation
quality with rare face poses compared to the long-tailed counterpart, whereas
simply balancing the clothing texture distribution does not effectively bring
an improvement. 3) Human GAN models with body centers for alignment outperform
models trained using face centers or pelvis points as alignment anchors. In
addition, a model zoo and human editing applications are demonstrated to
facilitate future research in the community.
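The third finding above (body-center alignment beating face-center or pelvis anchors) amounts to a preprocessing choice. Below is a minimal sketch of center-based alignment, assuming pose keypoints are available as an (N, 2) array of (x, y) coordinates; `align_by_body_center` is a hypothetical helper for illustration, not the paper's released pipeline.

```python
import numpy as np

def align_by_body_center(image: np.ndarray,
                         keypoints: np.ndarray,
                         out_size: int = 256) -> np.ndarray:
    """Crop a square region centered on the body center.

    The body center is taken as the mean of the (N, 2) keypoint
    array; swapping in a face keypoint or the pelvis joint gives
    the alternative anchors the paper compares against.
    """
    cx, cy = keypoints.mean(axis=0)
    half = out_size // 2
    # Pad by half the crop size on every side so the crop window
    # never leaves the image bounds, even near the border.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="edge")
    # After padding, the anchor point (cx, cy) sits at (cx+half, cy+half),
    # so slicing from (cy, cx) yields a crop centered on the anchor.
    x0, y0 = int(round(cx)), int(round(cy))
    return padded[y0:y0 + out_size, x0:x0 + out_size]
```

The same function can be reused for the paper's comparison simply by changing which keypoints are averaged to form the anchor.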
Related papers
- Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models [0.65268245109828]
We introduce the notion of contextual diversity for active learning CDAL.
We propose a data repair algorithm to curate contextually fair data to reduce model bias.
We are working on developing an image retrieval system for wildlife camera-trap images and a reliable warning system for poor-quality rural roads.
arXiv Detail & Related papers (2024-11-04T09:43:33Z)
- A Model Generalization Study in Localizing Indoor Cows with COw LOcalization (COLO) dataset [0.0]
This study investigates the generalization capabilities of YOLOv8 and YOLOv9 models for cow detection in indoor free-stall barn settings.
We explore three key hypotheses: (1) Model generalization is equally influenced by changes in lighting conditions and camera angles; (2) Higher model complexity guarantees better generalization performance; (3) Fine-tuning with custom initial weights trained on relevant tasks always brings advantages to detection tasks.
arXiv Detail & Related papers (2024-07-29T18:49:58Z)
- 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models [52.96248836582542]
We propose an effective approach based on recent diffusion models, termed HumanWild, which can effortlessly generate human images and corresponding 3D mesh annotations.
By exclusively employing generative models, we generate large-scale in-the-wild human images and high-quality annotations, eliminating the need for real-world data collection.
arXiv Detail & Related papers (2024-03-17T06:31:16Z)
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
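The data-perturbation category can be illustrated with a minimal sketch: perturbation methods modify an existing image without synthesizing new content. The `perturb` function below is a hypothetical example combining two common perturbations (horizontal flip and brightness jitter), not a specific method from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(image: np.ndarray) -> np.ndarray:
    """Apply simple data-perturbation augmentations.

    Randomly flips the image horizontally and scales its
    brightness; the original content is altered in place rather
    than generated anew, which is what distinguishes perturbation
    from the data-generation category.
    """
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1]            # random horizontal flip
    out *= rng.uniform(0.8, 1.2)      # brightness jitter
    return np.clip(out, 0, 255).astype(np.uint8)
```

Data-generation methods, by contrast, would produce new samples (e.g. via a generative model) rather than transforming existing ones.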
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
- UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation [59.77275587857252]
A holistic human dataset inevitably has insufficient and low-resolution information on local parts.
We propose to use multi-source datasets with various resolution images to jointly learn a high-resolution human generative model.
arXiv Detail & Related papers (2023-09-25T17:58:46Z)
- SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling [93.60731530276911]
We introduce a new synthetic dataset, SynBody, with three appealing features.
The dataset comprises 1.2M images with corresponding accurate 3D annotations, covering 10,000 human body models, 1,187 actions, and various viewpoints.
arXiv Detail & Related papers (2023-03-30T13:30:12Z)
- Recovering 3D Human Mesh from Monocular Images: A Survey [49.00136388529404]
Estimating human pose and shape from monocular images is a long-standing problem in computer vision.
This survey focuses on the task of monocular 3D human mesh recovery.
arXiv Detail & Related papers (2022-03-03T18:56:08Z)
- Methodology for Building Synthetic Datasets with Virtual Humans [1.5556923898855324]
Large datasets can be used for improved, targeted training of deep neural networks.
In particular, we make use of a 3D morphable face model for the rendering of multiple 2D images across a dataset of 100 synthetic identities.
arXiv Detail & Related papers (2020-06-21T10:29:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.