SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model
- URL: http://arxiv.org/abs/2503.13952v1
- Date: Tue, 18 Mar 2025 06:41:02 GMT
- Title: SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model
- Authors: Xinqing Li, Ruiqi Song, Qingyu Xie, Ye Wu, Nanxin Zeng, Yunfeng Ai,
- Abstract summary: This paper proposes a simulator-conditioned scene generation engine based on world model.<n>By constructing a simulation system consistent with real-world scenes, simulation data and labels, which serve as the conditions for data generation in the world model, for any scenes can be collected.<n>Results show that these generated images significantly improve downstream perception models performance.
- Score: 1.3700170633913733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid advancement of autonomous driving technology, a lack of data has become a major obstacle to enhancing perception model accuracy. Researchers are now exploring controllable data generation using world models to diversify datasets. However, previous work has been limited to studying image generation quality on specific public datasets. There is still relatively little research on how to build data generation engines for real-world application scenes to achieve large-scale data generation for challenging scenes. In this paper, a simulator-conditioned scene generation engine based on world model is proposed. By constructing a simulation system consistent with real-world scenes, simulation data and labels, which serve as the conditions for data generation in the world model, for any scenes can be collected. It is a novel data generation pipeline by combining the powerful scene simulation capabilities of the simulation engine with the robust data generation capabilities of the world model. In addition, a benchmark with proportionally constructed virtual and real data, is provided for exploring the capabilities of world models in real-world scenes. Quantitative results show that these generated images significantly improve downstream perception models performance. Finally, we explored the generative performance of the world model in urban autonomous driving scenarios. All the data and code will be available at https://github.com/Li-Zn-H/SimWorld.
Related papers
- Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control [97.98560001760126]
We introduce Cosmos-Transfer, a conditional world generation model that can generate world simulations based on multiple spatial control inputs.
We conduct evaluations to analyze the proposed model and demonstrate its applications for Physical AI, including robotics2Real and autonomous vehicle data enrichment.
arXiv Detail & Related papers (2025-03-18T17:57:54Z) - DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation [54.02069690134526]
We propose DrivingSphere, a realistic and closed-loop simulation framework.
Its core idea is to build 4D world representation and generate real-life and controllable driving scenarios.
By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms.
arXiv Detail & Related papers (2024-11-18T03:00:33Z) - SimGen: Simulator-conditioned Driving Scene Generation [50.03358485083602]
We introduce a simulator-conditioned scene generation framework called SimGen.<n>SimGen learns to generate diverse driving scenes by mixing data from the simulator and the real world.<n>It achieves superior generation quality and diversity while preserving controllability based on the text prompt and the layout pulled from a simulator.
arXiv Detail & Related papers (2024-06-13T17:58:32Z) - URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images [39.0780707100513]
We present an integrated end-to-end pipeline that generates simulation scenes complete with articulated kinematic and dynamic structures from real-world images.
In doing so, our work provides both a pipeline for large-scale generation of simulation environments and an integrated system for training robust robotic control policies.
arXiv Detail & Related papers (2024-05-19T20:01:29Z) - Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z) - Learning from synthetic data generated with GRADE [0.6982738885923204]
We present a framework for generating realistic animated dynamic environments (GRADE) for robotics research.
GRADE supports full simulation control, ROS integration, realistic physics, while being in an engine that produces high visual fidelity images and ground truth data.
We show that, even training using only synthetic data, can generalize well to real-world images in the same application domain.
arXiv Detail & Related papers (2023-05-07T14:13:04Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - Hands-Up: Leveraging Synthetic Data for Hands-On-Wheel Detection [0.38233569758620045]
This work demonstrates the use of synthetic photo-realistic in-cabin data to train a Driver Monitoring System.
We show how performing error analysis and generating the missing edge-cases in our platform boosts performance.
This showcases the ability of human-centric synthetic data to generalize well to the real world.
arXiv Detail & Related papers (2022-05-31T23:34:12Z) - DriveGAN: Towards a Controllable High-Quality Neural Simulation [147.6822288981004]
We introduce a novel high-quality neural simulator referred to as DriveGAN.
DriveGAN achieves controllability by disentangling different components without supervision.
We train DriveGAN on multiple datasets, including 160 hours of real-world driving data.
arXiv Detail & Related papers (2021-04-30T15:30:05Z) - UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual
3D Environments [14.453602631430508]
We present an improved version of UnrealROX, a tool to generate synthetic data from robotic images.
Un UnrealROX+ includes new features such as generating albedo or a Python API for interacting with the virtual environment from Deep Learning frameworks.
arXiv Detail & Related papers (2021-04-23T18:45:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.