TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments
- URL: http://arxiv.org/abs/2208.07943v1
- Date: Tue, 16 Aug 2022 20:46:08 GMT
- Title: TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments
- Authors: Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C. V.
Jawahar
- Abstract summary: This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
- Score: 84.6017003787244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-quality structured data with rich annotations are critical components in
intelligent vehicle systems dealing with road scenes. However, data curation
and annotation require intensive investments and yield low-diversity scenarios.
The recently growing interest in synthetic data raises questions about the
scope of improvement in such systems and the amount of manual work still
required to produce high volumes and variations of simulated data. This work
proposes a synthetic data generation pipeline that utilizes existing datasets,
like nuScenes, to address the difficulties and domain gaps present in simulated
datasets. We show that using annotations and visual cues from existing
datasets, we can facilitate automated multi-modal data generation, mimicking
real scene properties with high fidelity, along with mechanisms to diversify
samples in a physically meaningful way. We demonstrate improvements in mIoU
metrics by presenting qualitative and quantitative experiments with real and
synthetic data for semantic segmentation on the Cityscapes and KITTI-STEP
datasets. All relevant code and data are released on GitHub
(https://github.com/shubham1810/trove_toolkit).
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
- Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models.
Ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z)
- Exploring the Potential of AI-Generated Synthetic Datasets: A Case Study on Telematics Data with ChatGPT [0.0]
This research delves into the construction and utilization of synthetic datasets, specifically within the telematics sphere, leveraging OpenAI's powerful language model, ChatGPT.
To illustrate this data creation process, a hands-on case study is conducted, focusing on the generation of a synthetic telematics dataset.
arXiv Detail & Related papers (2023-06-23T15:15:13Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success at the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z)
- Scalable Modular Synthetic Data Generation for Advancing Aerial Autonomy [2.9005223064604078]
We introduce a scalable Aerial Synthetic Data Augmentation (ASDA) framework tailored to aerial autonomy applications.
ASDA extends a central data collection engine with two scriptable pipelines that automatically perform scene and data augmentations.
We demonstrate the effectiveness of our method in automatically generating diverse datasets.
arXiv Detail & Related papers (2022-11-10T04:37:41Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 difficulty levels and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z)
- Virtual passengers for real car solutions: synthetic datasets [2.1028463367241033]
We build a 3D scenario and set-up to resemble reality as closely as possible.
It is possible to configure and vary parameters to add randomness to the scene.
We present the process and concept of synthetic data generation in an automotive context.
arXiv Detail & Related papers (2022-05-13T10:54:39Z)
- Semi-synthesis: A fast way to produce effective datasets for stereo matching [16.602343511350252]
Close-to-real-scene texture rendering is a key factor in boosting stereo matching performance.
We propose semi-synthetic, an effective and fast way to synthesize a large amount of data with close-to-real-scene texture.
With further fine-tuning on the real dataset, we also achieve SOTA performance on Middlebury and competitive results on KITTI and ETH3D datasets.
arXiv Detail & Related papers (2021-01-26T14:34:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.