PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning
- URL: http://arxiv.org/abs/2308.03977v2
- Date: Wed, 13 Dec 2023 01:44:58 GMT
- Title: PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning
- Authors: Florian Bordes, Shashank Shekhar, Mark Ibrahim, Diane Bouchacourt, Pascal Vincent, Ari S. Morcos
- Abstract summary: We present a new generation of interactive environments for representation learning research that offer both controllability and realism.
We use the Unreal Engine, a powerful game engine well known in the entertainment industry, to produce PUG environments and datasets for representation learning.
- Score: 31.81199165450692
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic image datasets offer unmatched advantages for designing and evaluating deep neural networks: they make it possible to (i) render as many data samples as needed, (ii) precisely control each scene and yield granular ground truth labels (and captions), and (iii) precisely control distribution shifts between training and testing to isolate variables of interest for sound experimentation. Despite such promise, the use of synthetic image data is still limited, and often played down, mainly due to its lack of realism. Most works therefore rely on datasets of real images, which have often been scraped from public images on the internet, may have issues with regard to privacy, bias, and copyright, and offer little control over how objects precisely appear. In this work, we present a path to democratize the use of photorealistic synthetic data: we develop a new generation of interactive environments for representation learning research that offer both controllability and realism. We use the Unreal Engine, a powerful game engine well known in the entertainment industry, to produce PUG (Photorealistic Unreal Graphics) environments and datasets for representation learning. In this paper, we demonstrate the potential of PUG to enable more rigorous evaluations of vision models.
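As a concrete illustration of point (iii), the minimal sketch below measures a pretrained classifier's accuracy as a single scene factor is swept. It assumes a PUG-style Hugging Face dataset; the dataset id, split, and column names (`image`, `label`, `camera_yaw`) are illustrative assumptions, not a documented schema.

```python
# Minimal sketch: per-factor robustness evaluation on a PUG-style dataset.
# The dataset id, split, and column names ("image", "label", "camera_yaw")
# are illustrative assumptions, not a documented schema.
from collections import defaultdict

import torch
from datasets import load_dataset
from torchvision import models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model = model.to(device).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

ds = load_dataset("facebook/PUG_ImageNet", split="train")  # assumed id/split

hits, totals = defaultdict(int), defaultdict(int)
with torch.no_grad():
    for sample in ds:
        x = preprocess(sample["image"].convert("RGB")).unsqueeze(0).to(device)
        pred = model(x).argmax(dim=1).item()
        factor = sample["camera_yaw"]  # the factor being swept (assumed name)
        hits[factor] += int(pred == sample["label"])
        totals[factor] += 1

for factor in sorted(totals):
    print(f"camera_yaw={factor}: acc={hits[factor] / totals[factor]:.3f}")
```

Because every factor value is known exactly, accuracy can be disaggregated per factor, the kind of isolation that scraped real-image datasets cannot offer.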
Related papers
- FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models [14.596090302381647]
This paper studies photorealism enhancement of rendered images, leveraging the generative power of diffusion models on the controlled basis of rendering.
We introduce a novel framework to translate rendered images into their realistic counterparts, consisting of two stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG).
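The DKI and RIG stages are specific to the paper; as a generic illustration of the rendered-to-real translation setting (not the paper's method), the sketch below pushes a rendered image through an off-the-shelf diffusion image-to-image model. The checkpoint, prompt, and file paths are assumptions.

```python
# Generic rendered-to-real illustration (NOT the paper's DKI/RIG method):
# push a rendered garment image through an off-the-shelf diffusion
# image-to-image model. Checkpoint, prompt, and paths are assumptions.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

rendered = Image.open("render.png").convert("RGB")  # placeholder input path
realistic = pipe(
    prompt="a photorealistic photo of a person wearing the garment",
    image=rendered,
    strength=0.4,  # low strength keeps the rendered layout and texture
).images[0]
realistic.save("realistic.png")
```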
arXiv Detail & Related papers (2024-10-18T12:48:22Z)
- Generating Synthetic Satellite Imagery With Deep-Learning Text-to-Image Models -- Technical Challenges and Implications for Monitoring and Verification [46.42328086160106]
We explore how synthetic satellite images can be created using conditioning mechanisms.
We evaluate the results based on authenticity and state-of-the-art metrics.
We discuss implications of synthetic satellite imagery in the context of monitoring and verification.
arXiv Detail & Related papers (2024-04-11T14:00:20Z)
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
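A minimal sketch of the two-stage bridged-transfer recipe, assuming standard torchvision components; the dataset paths, backbone, and hyperparameters are placeholders rather than the paper's configuration.

```python
# Sketch of the two-stage bridged-transfer recipe: fine-tune a pretrained
# backbone on synthetic images first, then on the real target data.
# Paths, backbone, and hyperparameters are placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
synthetic = datasets.ImageFolder("data/synthetic", transform=tf)  # placeholder
real = datasets.ImageFolder("data/real", transform=tf)            # placeholder

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, len(real.classes))
loss_fn = nn.CrossEntropyLoss()

def finetune(dataset, epochs, lr):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

finetune(synthetic, epochs=5, lr=1e-3)  # stage 1: synthetic bridge
finetune(real, epochs=5, lr=1e-4)       # stage 2: real target data
```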
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
- Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs [59.12526668734703]
We introduce Composable Object Volume NeRF (COV-NeRF), an object-composable NeRF model that is the centerpiece of a real-to-sim pipeline.
COV-NeRF extracts objects from real images and composes them into new scenes, generating photorealistic renderings and many types of 2D and 3D supervision.
arXiv Detail & Related papers (2024-03-07T00:00:02Z)
- Improving the Effectiveness of Deep Generative Data [5.856292656853396]
Training a model on purely synthetic images for downstream image processing tasks results in an undesired performance drop compared to training on real data.
We propose a new taxonomy to describe factors contributing to this commonly observed phenomenon and investigate it on the popular CIFAR-10 dataset.
Our method outperforms baselines on downstream classification tasks both in case of training on synthetic only (Synthetic-to-Real) and training on a mix of real and synthetic data.
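A minimal sketch of the real-plus-synthetic mixing setup, assuming CIFAR-10 as the real source and a local folder of generated images as the synthetic source; the folder path and the one-to-one class alignment are assumptions.

```python
# Sketch of the real-plus-synthetic training mix: CIFAR-10 as the real
# source, a local folder of generated images as the synthetic source.
# Assumes the synthetic folder names sort into the same class indices
# as CIFAR-10's labels; the path is a placeholder.
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

tf = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
real = datasets.CIFAR10("data", train=True, download=True, transform=tf)
synthetic = datasets.ImageFolder("data/synthetic_cifar10", transform=tf)

mixed = ConcatDataset([real, synthetic])
loader = DataLoader(mixed, batch_size=128, shuffle=True)
# `loader` now yields shuffled batches drawn from both sources; a weighted
# sampler could control the real/synthetic ratio instead.
```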
arXiv Detail & Related papers (2023-11-07T12:57:58Z)
- Learning from synthetic data generated with GRADE [0.6982738885923204]
We present a framework for generating realistic animated dynamic environments (GRADE) for robotics research.
GRADE supports full simulation control, ROS integration, and realistic physics, within an engine that produces high-visual-fidelity images and ground-truth data.
We show that models trained using only synthetic data can generalize well to real-world images in the same application domain.
arXiv Detail & Related papers (2023-05-07T14:13:04Z)
- Synthetic Data for Object Classification in Industrial Applications [53.180678723280145]
In object classification, capturing a large number of images per object and in different conditions is not always possible.
This work explores the creation of artificial images using a game engine to cope with limited data in the training dataset.
arXiv Detail & Related papers (2022-12-09T11:43:04Z)
- ImaginaryNet: Learning Object Detectors without Real Images and Annotations [66.30908705345973]
We propose a framework to synthesize images by combining a pretrained language model with a text-to-image model.
With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish Imaginary-Supervised Object Detection.
Experiments show that ImaginaryNet can obtain about 70% of the performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data.
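A sketch of the generation stage of such a pipeline, using common public checkpoints (GPT-2 and Stable Diffusion, which are assumed stand-ins, not necessarily the paper's models); the downstream weakly supervised detector is omitted.

```python
# Sketch of the generation stage: a language model expands a class name
# into a scene description, a text-to-image model renders it, and the
# class name serves as the image-level (weak) label. GPT-2 and Stable
# Diffusion are assumed stand-ins, not necessarily the paper's models.
import torch
from diffusers import StableDiffusionPipeline
from transformers import pipeline

lm = pipeline("text-generation", model="gpt2")
t2i = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def imagine(class_name, n=4):
    samples = []
    for _ in range(n):
        prompt = lm(f"a photo of a {class_name}",
                    max_new_tokens=20)[0]["generated_text"]
        image = t2i(prompt).images[0]
        samples.append((image, class_name))  # image-level label only
    return samples

dog_samples = imagine("dog")  # feed into a weakly supervised detector
```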
arXiv Detail & Related papers (2022-10-13T10:25:22Z)
- PennSyn2Real: Training Object Recognition Models without Human Labeling [12.923677573437699]
We propose PennSyn2Real, a synthetic dataset consisting of more than 100,000 4K images of more than 20 types of micro aerial vehicles (MAVs).
The dataset can be used to generate arbitrary numbers of training images for high-level computer vision tasks such as MAV detection and classification.
We show that synthetic data generated using this framework can be directly used to train CNN models for common object recognition tasks such as detection and segmentation.
arXiv Detail & Related papers (2020-09-22T02:53:40Z)
- OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [103.54691385842314]
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes.
Our goal is to make the dataset creation process widely accessible.
This enables important applications in inverse rendering, scene understanding and robotics.
arXiv Detail & Related papers (2020-07-25T06:48:47Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
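A minimal sketch of the joint idea, assuming the classic reconstruction-as-albedo-times-shading decomposition; layer sizes and the single-image formulation are illustrative, not the paper's architecture.

```python
# Minimal sketch of joint decomposition: a shared encoder feeds two
# decoders predicting albedo and shading, and the image is reconstructed
# as their product. Layer sizes are illustrative, not the paper's.
import torch
from torch import nn

def make_decoder(out_ch):
    return nn.Sequential(
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1), nn.Sigmoid(),
    )

class IntrinsicAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.albedo_head = make_decoder(3)   # reflectance (RGB)
        self.shading_head = make_decoder(1)  # grayscale shading

    def forward(self, x):
        z = self.encoder(x)
        albedo, shading = self.albedo_head(z), self.shading_head(z)
        return albedo * shading, albedo, shading  # recon = albedo * shading

model = IntrinsicAutoencoder()
img = torch.rand(1, 3, 64, 64)
recon, albedo, shading = model(img)
loss = nn.functional.mse_loss(recon, img)  # reconstruction objective
```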
arXiv Detail & Related papers (2020-06-29T12:53:58Z)