From Real to Synthetic and Back: Synthesizing Training Data for
Multi-Person Scene Understanding
- URL: http://arxiv.org/abs/2006.02110v1
- Date: Wed, 3 Jun 2020 09:02:06 GMT
- Title: From Real to Synthetic and Back: Synthesizing Training Data for
Multi-Person Scene Understanding
- Authors: Igor Kviatkovsky, Nadav Bhonker and Gerard Medioni
- Abstract summary: We present a method for synthesizing natural-looking images of multiple people interacting in a specific scenario.
These images benefit from the advantages of synthetic data: being fully controllable and fully annotated with any type of standard or custom-defined ground truth.
To reduce the synthetic-to-real domain gap, we introduce a pipeline consisting of the following steps.
- Score: 0.7519872646378835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method for synthesizing natural-looking images of multiple
people interacting in a specific scenario. These images benefit from the
advantages of synthetic data: being fully controllable and fully annotated with
any type of standard or custom-defined ground truth. To reduce the
synthetic-to-real domain gap, we introduce a pipeline consisting of the
following steps: 1) we render scenes in a context modeled after the real world,
2) we train a human parsing model on the synthetic images, 3) we use the model
to estimate segmentation maps for real images, 4) we train a conditional
generative adversarial network (cGAN) to learn the inverse mapping -- from a
segmentation map to a real image, and 5) given new synthetic segmentation maps,
we use the cGAN to generate realistic images. An illustration of our pipeline
is presented in Figure 2. We use the generated data to train a multi-task model
on the challenging tasks of UV mapping and dense depth estimation. We
demonstrate the value of the data generation and the trained model, both
quantitatively and qualitatively on the CMU Panoptic Dataset.
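To make step 4 concrete, below is a minimal, hedged PyTorch sketch of a pix2pix-style cGAN update that learns the inverse mapping from a segmentation map to an image. The toy architectures, the channel counts (21 segmentation classes assumed) and the L1 weight are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of step 4: train a cGAN to map segmentation maps to images.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Toy encoder-decoder; the paper would use a full image-to-image network."""
    def __init__(self, seg_channels=21, img_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(seg_channels, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, img_channels, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, seg):
        return self.net(seg)

class Discriminator(nn.Module):
    """PatchGAN-style critic conditioned on the segmentation map."""
    def __init__(self, seg_channels=21, img_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(seg_channels + img_channels, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=1, padding=1),
        )
    def forward(self, seg, img):
        return self.net(torch.cat([seg, img], dim=1))

def train_step(G, D, opt_g, opt_d, seg, real, l1_weight=100.0):
    bce = nn.BCEWithLogitsLoss()
    # Discriminator: real (seg, image) pairs vs. generated pairs.
    fake = G(seg).detach()
    d_real, d_fake = D(seg, real), D(seg, fake)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool D while staying close to the real image (L1 term).
    fake = G(seg)
    d_fake = D(seg, fake)
    loss_g = bce(d_fake, torch.ones_like(d_fake)) + l1_weight * (fake - real).abs().mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

In step 5 of the pipeline, new synthetic segmentation maps are simply passed through the trained generator to produce realistic training images.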
Related papers
- The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better [39.57368843211441]
Every synthetic image ultimately originates from the upstream data used to train the generator.
We compare finetuning on task-relevant, targeted synthetic data generated by Stable Diffusion against finetuning on targeted real images retrieved directly from LAION-2B.
We find that retrieved real images outperform the targeted synthetic data; our analysis suggests this underperformance is partially due to generator artifacts and inaccurate task-relevant visual details in the synthetic images.
arXiv Detail & Related papers (2024-06-07T18:04:21Z)
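For contrast with generation, targeted retrieval of the kind studied above can be approximated with off-the-shelf CLIP similarity scoring. A hedged sketch, assuming a local pool of candidate PIL images rather than the paper's LAION-2B index:

```python
# Rank a pool of real images by CLIP similarity to a task description.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def retrieve(task_description, candidate_images, top_k=100):
    inputs = processor(text=[task_description], images=candidate_images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    scores = out.logits_per_text[0]  # similarity of the text to each image
    top = scores.topk(min(top_k, len(candidate_images))).indices
    return [candidate_images[i] for i in top]
```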
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which first fine-tunes a pre-trained model on synthetic images to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
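A hedged sketch of the bridged-transfer recipe described above: a pre-trained backbone is first fine-tuned on synthetic images, then on the real target set. The model choice, epoch counts and learning rates are illustrative assumptions.

```python
# Two-stage fine-tuning: synthetic "bridge" stage, then real target stage.
import torch
import torch.nn as nn
from torchvision import models

def finetune(model, loader, epochs, lr):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    return model

def bridged_transfer(synthetic_loader, real_loader, num_classes):
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    finetune(model, synthetic_loader, epochs=5, lr=1e-4)  # bridge on synthetic
    finetune(model, real_loader, epochs=10, lr=1e-5)      # adapt to real target
    return model
```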
- Joint one-sided synthetic unpaired image translation and segmentation for colorectal cancer prevention [16.356954231068077]
We produce realistic synthetic images using a combination of 3D technologies and generative adversarial networks.
We propose CUT-seg, a joint training scheme in which a segmentation model and a generative model are trained together to produce realistic images.
As part of this study, we release Synth-Colon, an entirely synthetic dataset of 20,000 realistic colon images.
arXiv Detail & Related papers (2023-07-20T22:09:04Z)
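A hedged sketch of the joint objective behind CUT-seg-style training: the translated image must both fool a discriminator and remain segmentable against the synthetic ground truth. The contrastive (CUT) term of the actual method is omitted, and G, D, S and the loss weight are illustrative assumptions.

```python
# Joint objective: adversarial realism plus segmentation supervision.
import torch
import torch.nn as nn

bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()

def joint_step(G, D, S, synth, masks, seg_weight=1.0):
    translated = G(synth)                  # synthetic render -> realistic image
    d_fake = D(translated)
    adv_loss = bce(d_fake, torch.ones_like(d_fake))  # fool the discriminator
    seg_loss = ce(S(translated), masks)    # masks: (N, H, W) synthetic ground truth
    return adv_loss + seg_weight * seg_loss
```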
- Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z)
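Caption-driven synthesis of this kind can be sketched with the diffusers library; the checkpoint name, example caption and step count below are assumptions, not the paper's exact setup.

```python
# Generate training images from image captions used as prompts.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

captions = ["a golden retriever catching a frisbee in a park"]  # e.g. real captions
for i, caption in enumerate(captions):
    image = pipe(caption, num_inference_steps=30).images[0]
    image.save(f"synthetic_{i:05d}.png")  # pair with the caption's class label
```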
- Unsupervised Traffic Scene Generation with Synthetic 3D Scene Graphs [83.9783063609389]
We propose a method based on domain-invariant scene representation to directly synthesize traffic scene imagery without rendering.
Specifically, we rely on synthetic scene graphs as our internal representation and introduce an unsupervised neural network architecture for realistic traffic scene synthesis.
arXiv Detail & Related papers (2023-03-15T09:26:29Z)
- Synthetic Image Data for Deep Learning [0.294944680995069]
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification and semantic segmentation models.
We show how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle.
arXiv Detail & Related papers (2022-12-12T20:28:13Z)
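Domain randomization as described above amounts to sampling nuisance parameters per render; a hedged sketch in which the parameter ranges and the render function are invented for illustration:

```python
# Sample nuisance parameters per render so appearance variation reads as noise.
import random

def sample_render_params():
    return {
        "light_intensity": random.uniform(0.3, 3.0),
        "light_azimuth_deg": random.uniform(0.0, 360.0),
        "camera_distance_m": random.uniform(2.0, 10.0),
        "background_texture": random.choice(["asphalt", "grass", "concrete", "noise"]),
        "object_hue_shift": random.uniform(-0.1, 0.1),
    }

# Hypothetical usage with a renderer and CAD model not shown here:
# for i in range(100_000):
#     image, labels = render(cad_model, **sample_render_params())
```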
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation, content reconstruction, and coarse-to-fine-grained adversarial reasoning.
arXiv Detail & Related papers (2021-12-09T18:59:21Z)
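One way to read the rethought discriminator is as a shared backbone with auxiliary heads; a hedged sketch in which all layer sizes are illustrative assumptions, not the paper's architecture:

```python
# Discriminator with a shared backbone and auxiliary task heads.
import torch.nn as nn

class MultiTaskDiscriminator(nn.Module):
    def __init__(self, in_ch=3, num_classes=19):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.adv_head = nn.Conv2d(128, 1, 3, padding=1)            # real/fake per patch
        self.seg_head = nn.Conv2d(128, num_classes, 3, padding=1)  # semantic labels
        self.rec_head = nn.Conv2d(128, in_ch, 3, padding=1)        # coarse reconstruction

    def forward(self, x):
        h = self.backbone(x)
        return self.adv_head(h), self.seg_head(h), self.rec_head(h)
```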
- Fake It Till You Make It: Face analysis in the wild using synthetic data alone [9.081019005437309]
We show that it is possible to perform face-related computer vision in the wild using synthetic data alone.
We describe how to combine a procedurally-generated 3D face model with a comprehensive library of hand-crafted assets to render training images with unprecedented realism.
arXiv Detail & Related papers (2021-09-30T13:07:04Z)
- Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming [3.4788711710826083]
We propose an alternative to common data augmentation methods, applied to the problem of crop/weed segmentation in precision farming.
We create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts.
In addition to RGB data, we also take near-infrared (NIR) information into account, generating four-channel multi-spectral synthetic images.
arXiv Detail & Related papers (2020-09-12T08:49:36Z)
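The semi-artificial compositing step can be sketched as masked replacement in a four-channel array; the shapes and the provenance of the synthetic patch are assumptions:

```python
# Replace masked crop/weed pixels in a real RGB+NIR image with synthetic ones.
import numpy as np

def composite(real, synthetic, mask):
    """real, synthetic: (H, W, 4) float arrays (R, G, B, NIR);
    mask: (H, W) bool, True where crop/weed pixels should be replaced."""
    out = real.copy()
    out[mask] = synthetic[mask]
    return out
```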
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)