SimScale: Learning to Drive via Real-World Simulation at Scale
- URL: http://arxiv.org/abs/2511.23369v1
- Date: Fri, 28 Nov 2025 17:17:38 GMT
- Title: SimScale: Learning to Drive via Real-World Simulation at Scale
- Authors: Haochen Tian, Tianyu Li, Haochen Liu, Jiazhi Yang, Yihang Qiu, Guang Li, Junli Wang, Yinfeng Gao, Zhang Zhang, Liang Wang, Hangjun Ye, Tieniu Tan, Long Chen, Hongyang Li
- Abstract summary: We introduce a novel and scalable simulation framework capable of synthesizing massive unseen states upon existing driving logs. Our pipeline utilizes advanced neural rendering with a reactive environment to generate high-fidelity multi-view observations. We develop a pseudo-expert trajectory generation mechanism for these newly simulated states to provide action supervision.
- Score: 45.08991279559151
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Achieving fully autonomous driving systems requires learning rational decisions in a wide span of scenarios, including safety-critical and out-of-distribution ones. However, such cases are underrepresented in real-world corpora collected by human experts. To compensate for the lack of data diversity, we introduce a novel and scalable simulation framework capable of synthesizing massive unseen states upon existing driving logs. Our pipeline utilizes advanced neural rendering with a reactive environment to generate high-fidelity multi-view observations controlled by the perturbed ego trajectory. Furthermore, we develop a pseudo-expert trajectory generation mechanism for these newly simulated states to provide action supervision. Upon the synthesized data, we find that a simple co-training strategy on both real-world and simulated samples can lead to significant improvements in both robustness and generalization for various planning methods on challenging real-world benchmarks, up to +6.8 EPDMS on navhard and +2.9 on navtest. More importantly, such policy improvement scales smoothly by increasing simulation data only, even without extra real-world data streaming in. We further reveal several crucial findings of such a sim-real learning system, which we term SimScale, including the design of pseudo-experts and the scaling properties for different policy architectures. Our simulation data and code will be released.
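The co-training idea in the abstract (training on both real-world and simulated samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mixed_batches`, the fixed `sim_fraction` ratio, and the string sample identifiers are all hypothetical choices made for illustration, since the abstract does not specify how the two sources are mixed.

```python
import random
from typing import List, Sequence, Tuple


def mixed_batches(
    real: Sequence[str],
    sim: Sequence[str],
    batch_size: int,
    sim_fraction: float,
    steps: int,
    seed: int = 0,
) -> List[List[Tuple[str, str]]]:
    """Draw training batches that mix real and simulated samples.

    Assumption (not from the paper): each batch holds a fixed share of
    simulated samples, drawn with replacement from each pool.
    """
    if not 0.0 <= sim_fraction <= 1.0:
        raise ValueError("sim_fraction must lie in [0, 1]")
    rng = random.Random(seed)
    n_sim = round(batch_size * sim_fraction)
    n_real = batch_size - n_sim
    batches = []
    for _ in range(steps):
        # Tag every sample with its source so a trainer could track it.
        batch = [("real", rng.choice(real)) for _ in range(n_real)]
        batch += [("sim", rng.choice(sim)) for _ in range(n_sim)]
        rng.shuffle(batch)
        batches.append(batch)
    return batches


if __name__ == "__main__":
    demo = mixed_batches(["r0", "r1"], ["s0", "s1", "s2"],
                         batch_size=8, sim_fraction=0.25, steps=3)
    for b in demo:
        print(b)
```

Under this assumed scheme, holding the real pool fixed while enlarging the simulated pool is one way to mimic the setting the abstract describes, where only simulation data increases.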
Related papers
- D-REX: Differentiable Real-to-Sim-to-Real Engine for Learning Dexterous Grasping [66.22412592525369]
We introduce a real-to-sim-to-real engine that leverages the Gaussian Splat representations to build a differentiable engine. We show that our engine achieves accurate and robust performance in mass identification across various object geometries and mass values. Those optimized mass values facilitate force-aware policy learning, achieving superior and high performance in object grasping.
arXiv Detail & Related papers (2026-03-01T15:32:04Z)
- HD-GEN: A High-Performance Software System for Human Mobility Data Generation Based on Patterns of Life [1.9739979974462676]
We introduce a comprehensive software pipeline for calibrating, generating, processing, and visualizing large-scale individual-level human mobility datasets. A data generation engine constructs geographically grounded simulations using OpenStreetMap data. A genetic algorithm-based calibration module fine-tunes simulation parameters to align with real-world mobility characteristics. A data processing suite transforms raw simulation logs into structured formats suitable for downstream applications.
arXiv Detail & Related papers (2026-01-03T16:01:00Z)
- Simulation Priors for Data-Efficient Deep Learning [56.525770511247934]
SimPEL is a method that efficiently combines first-principles models with data-driven learning. We evaluate SimPEL on diverse systems, including biological, agricultural, and robotic domains. For decision-making, we demonstrate that SimPEL bridges the sim-to-real gap in model-based reinforcement learning.
arXiv Detail & Related papers (2025-09-06T14:36:41Z)
- Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios [3.30184292168618]
We propose a dataset generation pipeline based on the CARLA simulator for 3D object detection on LiDAR point clouds. We are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset.
arXiv Detail & Related papers (2025-02-20T22:27:42Z)
- ASID: Active Exploration for System Identification in Robotic Manipulation [32.27299045059514]
We propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy.
We demonstrate the efficacy of this paradigm in identifying articulation, mass, and other physical parameters in several challenging robotic manipulation tasks.
arXiv Detail & Related papers (2024-04-18T16:35:38Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model.
We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving.
Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
- BITS: Bi-level Imitation for Traffic Simulation [38.28736985320897]
We take a data-driven approach and propose a method that can learn to generate traffic behaviors from real-world driving logs.
We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets.
As part of our core contributions, we develop and open source a software tool that unifies data formats across different driving datasets.
arXiv Detail & Related papers (2022-08-26T02:17:54Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.