From Gaming to Research: GTA V for Synthetic Data Generation for Robotics and Navigations
- URL: http://arxiv.org/abs/2502.12303v1
- Date: Mon, 17 Feb 2025 20:22:52 GMT
- Title: From Gaming to Research: GTA V for Synthetic Data Generation for Robotics and Navigations
- Authors: Matteo Scucchia, Matteo Ferrara, Davide Maltoni
- Abstract summary: We introduce a synthetic dataset created using the virtual environment of the video game Grand Theft Auto V (GTA V).
We demonstrate that synthetic data derived from GTA V are qualitatively comparable to real-world data.
- Score: 2.7383830691749163
- Abstract: In computer vision, the development of robust algorithms capable of generalizing effectively in real-world scenarios more and more often requires large-scale datasets collected under diverse environmental conditions. However, acquiring such datasets is time-consuming, costly, and sometimes unfeasible. To address these limitations, the use of synthetic data has gained attention as a viable alternative, allowing researchers to generate vast amounts of data while simulating various environmental contexts in a controlled setting. In this study, we investigate the use of synthetic data in robotics and navigation, specifically focusing on Simultaneous Localization and Mapping (SLAM) and Visual Place Recognition (VPR). In particular, we introduce a synthetic dataset created using the virtual environment of the video game Grand Theft Auto V (GTA V), along with an algorithm designed to generate a VPR dataset, without human supervision. Through a series of experiments centered on SLAM and VPR, we demonstrate that synthetic data derived from GTA V are qualitatively comparable to real-world data. Furthermore, these synthetic data can complement or even substitute real-world data in these applications. This study sets the stage for the creation of large-scale synthetic datasets, offering a cost-effective and scalable solution for future research and development.
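The abstract describes an algorithm that builds a VPR dataset without human supervision but gives no implementation details. The sketch below is only one plausible reading of that idea, assuming each rendered frame is exported together with its ground-truth position from the game engine and that frames are grouped into "places" by spatial proximity; the function and field names are hypothetical, not taken from the paper.
```python
# Minimal sketch (not the paper's actual algorithm): build a VPR dataset by
# clustering simulator frames into "places" using ground-truth positions.
# Assumes each frame comes with an (x, y) position exported from the engine.
import math
from collections import defaultdict

def assign_places(frames, cell_size=25.0):
    """Group frames into places by snapping positions to a grid.

    frames: list of dicts like {"image": "frame_0001.png", "x": 123.4, "y": -56.7}
    cell_size: side length (in meters) of a grid cell treated as one place.
    """
    places = defaultdict(list)
    for f in frames:
        cell = (math.floor(f["x"] / cell_size), math.floor(f["y"] / cell_size))
        places[cell].append(f["image"])
    # Keep only cells seen more than once (e.g. under different traversals or
    # weather/time-of-day settings), so every place yields a positive pair.
    return {cell: imgs for cell, imgs in places.items() if len(imgs) >= 2}

# Example usage with toy data:
frames = [
    {"image": "day_0001.png", "x": 10.2, "y": 4.1},
    {"image": "night_0483.png", "x": 11.0, "y": 3.7},   # same cell -> same place
    {"image": "rain_0211.png", "x": 410.5, "y": -88.0}, # isolated -> dropped
]
print(assign_places(frames))
```
Grid snapping is only the simplest grouping rule; a distance-threshold or trajectory-based scheme would serve the same purpose of producing positive pairs across varied environmental conditions.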
Related papers
- Synthetic data augmentation for robotic mobility aids to support blind and low vision people [5.024531194389658]
Robotic mobility aids for blind and low-vision (BLV) individuals rely heavily on deep learning-based vision models.
The performance of these models is often constrained by the availability and diversity of real-world datasets.
In this study, we investigate the effectiveness of synthetic data, generated using Unreal Engine 4, for training robust vision models.
arXiv Detail & Related papers (2024-09-17T13:17:28Z)
- Best Practices and Lessons Learned on Synthetic Data [83.63271573197026]
The success of AI models relies on the availability of large, diverse, and high-quality datasets.
Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns.
arXiv Detail & Related papers (2024-04-11T06:34:17Z)
- RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications [55.24463002889]
We focus on depth data synthesis and develop a range-aware RGB-D data simulation pipeline (RaSim).
In particular, high-fidelity depth data is generated by imitating the imaging principle of real-world sensors.
RaSim can be directly applied to real-world scenarios without any finetuning and excels at downstream RGB-D perception tasks.
arXiv Detail & Related papers (2024-04-05T08:52:32Z)
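RaSim (above) is said to imitate the imaging principle of real depth sensors; its actual pipeline is not reproduced here. The snippet below is only a generic, hedged illustration of one common ingredient of depth simulation: range-dependent noise plus quantization applied to a clean rendered depth map, with made-up parameters.
```python
# Generic illustration (not RaSim's actual pipeline): degrade a clean rendered
# depth map with range-dependent noise and quantization, mimicking how real
# depth sensors lose precision at longer ranges. Parameters are illustrative.
import numpy as np

def simulate_depth_sensor(clean_depth_m, noise_coeff=0.001, quant_step_m=0.002,
                          max_range_m=10.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    depth = clean_depth_m.astype(np.float64)
    # Noise standard deviation grows roughly quadratically with range, a
    # behaviour often reported for structured-light and stereo depth sensors.
    sigma = noise_coeff * depth ** 2
    depth = depth + rng.normal(0.0, 1.0, size=depth.shape) * sigma
    # Quantize to the sensor's depth resolution and clip to its working range.
    depth = np.round(depth / quant_step_m) * quant_step_m
    depth[(depth <= 0) | (depth > max_range_m)] = 0.0  # 0 marks invalid pixels
    return depth.astype(np.float32)

# Toy usage: a synthetic ramp of depths from 0.5 m to 8 m.
clean = np.linspace(0.5, 8.0, 640 * 480).reshape(480, 640)
noisy = simulate_depth_sensor(clean)
print(noisy.shape, float(noisy.min()), float(noisy.max()))
```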
- Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative for training machine learning models.
Ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z)
- Exploring the Potential of AI-Generated Synthetic Datasets: A Case Study on Telematics Data with ChatGPT [0.0]
This research delves into the construction and utilization of synthetic datasets, specifically within the telematics sphere, leveraging OpenAI's powerful language model, ChatGPT.
To illustrate this data creation process, a hands-on case study is conducted, focusing on the generation of a synthetic telematics dataset.
arXiv Detail & Related papers (2023-06-23T15:15:13Z)
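The telematics case study above generates synthetic records with ChatGPT. A minimal sketch of that kind of workflow is shown below, assuming the OpenAI Python client (version 1.x) and an illustrative model name and column schema, neither of which is taken from the paper.
```python
# Hedged sketch of LLM-driven tabular data generation (not the paper's exact
# prompts or schema). Assumes the OpenAI Python client >= 1.0 and an
# illustrative model name; adjust both to whatever you actually use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Generate 20 rows of synthetic vehicle telematics data as CSV with the "
    "columns: trip_id, timestamp, speed_kmh, acceleration_ms2, fuel_level_pct. "
    "Values should be plausible for urban driving; do not copy real records."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative choice, not specified by the paper
    messages=[{"role": "user", "content": prompt}],
)

csv_text = response.choices[0].message.content
print(csv_text)
```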
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
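TSGM (above) is a full framework with its own API, which is not reproduced here. The snippet below only illustrates the underlying idea of generative time-series modeling in its simplest form: fit an AR(1) model to an observed series and sample a synthetic series from the fitted parameters.
```python
# Generic illustration of generative time-series modelling (not TSGM's API):
# fit an AR(1) model to a series, then sample a synthetic series from it.
import numpy as np

def fit_ar1(series):
    x, y = series[:-1], series[1:]
    phi = np.dot(x - x.mean(), y - y.mean()) / np.dot(x - x.mean(), x - x.mean())
    c = y.mean() - phi * x.mean()
    resid = y - (c + phi * x)
    return c, phi, resid.std()

def sample_ar1(c, phi, sigma, length, x0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(length)
    out[0] = x0
    for t in range(1, length):
        out[t] = c + phi * out[t - 1] + rng.normal(0.0, sigma)
    return out

rng = np.random.default_rng(0)
real = sample_ar1(0.5, 0.9, 0.1, 500, 5.0, rng)   # stand-in for an observed series
c, phi, sigma = fit_ar1(real)
synthetic = sample_ar1(c, phi, sigma, 500, real[-1], rng)
print(round(c, 3), round(phi, 3), round(sigma, 3))
```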
- DATED: Guidelines for Creating Synthetic Datasets for Engineering Design Applications [3.463438487417909]
This study proposes comprehensive guidelines for generating, annotating, and validating synthetic datasets.
The study underscores the importance of thoughtful sampling methods to ensure the appropriate size, diversity, utility, and realism of a dataset.
Overall, this paper offers valuable insights for researchers intending to create and publish synthetic datasets for engineering design.
arXiv Detail & Related papers (2023-05-15T21:00:09Z)
- Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic Data [91.52783572568214]
Synthetic data may become a dominant force in the machine learning world, promising a future where datasets can be tailored to individual needs.
We discuss which fundamental challenges the community needs to overcome for wider relevance and application of synthetic data.
arXiv Detail & Related papers (2023-04-07T16:38:40Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision [3.5694949627557846]
We release PeopleSansPeople, a human-centric synthetic data generator.
It contains simulation-ready 3D human assets, a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels.
arXiv Detail & Related papers (2021-12-17T02:33:31Z)
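The PeopleSansPeople entry above mentions COCO-format labels. For reference, a minimal example of that label structure, written as a Python dict, is shown below; the schema follows the public COCO format, but all values are invented and only two keypoints are shown for brevity.
```python
# Minimal example of the COCO-style label structure mentioned above
# (schema follows the public COCO format; all values here are invented).
coco_record = {
    "images": [
        {"id": 1, "file_name": "synthetic_000001.png", "width": 1280, "height": 720}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,                     # "person"
            "bbox": [412.0, 180.0, 96.0, 240.0],  # [x, y, width, height] in pixels
            "area": 96.0 * 240.0,
            "iscrowd": 0,
            "num_keypoints": 2,
            # COCO keypoints are stored as flat [x, y, visibility] triplets.
            "keypoints": [440, 200, 2, 470, 205, 2],
        }
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"}
    ],
}
print(len(coco_record["annotations"]), "annotation(s)")
```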
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.