SimAug: Learning Robust Representations from Simulation for Trajectory
Prediction
- URL: http://arxiv.org/abs/2004.02022v3
- Date: Fri, 17 Jul 2020 20:43:29 GMT
- Title: SimAug: Learning Robust Representations from Simulation for Trajectory
Prediction
- Authors: Junwei Liang, Lu Jiang, Alexander Hauptmann
- Abstract summary: We propose a novel approach to learn a robust representation through augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
- Score: 78.91518036949918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the problem of predicting future trajectories of people in
unseen cameras of novel scenarios and views. We approach this problem through
the real-data-free setting in which the model is trained only on 3D simulation
data and applied out-of-the-box to a wide variety of real cameras. We propose a
novel approach to learn a robust representation through augmenting the simulation
training data such that the representation can better generalize to unseen
real-world test data. The key idea is to mix the feature of the hardest camera
view with the adversarial feature of the original view. We refer to our method
as SimAug. We show that SimAug achieves promising results on three real-world
benchmarks using zero real training data, and state-of-the-art performance on
the Stanford Drone and VIRAT/ActEV datasets when using in-domain training
data.
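As a rough illustration of the key idea stated in the abstract (mixing the feature of the hardest camera view with an adversarial feature of the original view), here is a minimal PyTorch-style sketch. The encoder, the feature-level trajectory loss, the FGSM-style perturbation, and the Beta-sampled mixing weight are illustrative assumptions, not the authors' released implementation.
```python
import torch

def simaug_feature(views, target, encoder, trajectory_loss, epsilon=0.1):
    """views: list of per-camera-view input tensors; views[0] is the original view."""
    # 1) Pick the hardest camera view: the one with the highest prediction loss.
    with torch.no_grad():
        losses = torch.stack([trajectory_loss(encoder(v), target) for v in views])
    hardest_feat = encoder(views[int(losses.argmax())])

    # 2) Perturb the original view's feature adversarially (an FGSM-style
    #    step is assumed here; the paper may use a different attack).
    feat = encoder(views[0]).detach().requires_grad_(True)
    trajectory_loss(feat, target).backward()
    adv_feat = (feat + epsilon * feat.grad.sign()).detach()

    # 3) Mix the hardest view's feature with the adversarial feature
    #    (mixup-style convex combination with an assumed Beta coefficient).
    lam = torch.distributions.Beta(0.2, 0.2).sample()
    return lam * hardest_feat + (1.0 - lam) * adv_feat
```
Training would then presumably feed this mixed feature, rather than the clean simulation feature, to the trajectory decoder.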
Related papers
- Transfer Learning from Simulated to Real Scenes for Monocular 3D Object Detection [9.708971995966476]
This paper introduces a two-stage training strategy to address these challenges.
Our approach initially trains a model on the large-scale synthetic dataset, RoadSense3D.
We fine-tune the model on a combination of real-world datasets to enhance its adaptability to practical conditions.
arXiv Detail & Related papers (2024-08-28T08:44:58Z)
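The two-stage schedule summarized in the entry above might look like the following minimal sketch; the loaders, optimizer, and learning rates are placeholder assumptions rather than details from the paper.
```python
import torch

def two_stage_training(model, synthetic_loader, real_loader, loss_fn,
                       pretrain_epochs=10, finetune_epochs=5):
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Stage 1: pretrain on the large-scale synthetic dataset (RoadSense3D).
    for _ in range(pretrain_epochs):
        for images, targets in synthetic_loader:
            opt.zero_grad()
            loss_fn(model(images), targets).backward()
            opt.step()

    # Stage 2: fine-tune on a mix of real-world datasets, with a lower
    # learning rate to avoid overwriting the pretrained features.
    for group in opt.param_groups:
        group["lr"] = 1e-5
    for _ in range(finetune_epochs):
        for images, targets in real_loader:
            opt.zero_grad()
            loss_fn(model(images), targets).backward()
            opt.step()
    return model
```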
- Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap [6.393953433174051]
We propose a novel perspective for addressing the real-to-simulated data gap.
We conduct the first large-scale investigation into the real-to-simulated data gap in an autonomous driving setting.
Our results show notable improvements in model robustness to simulated data, even improving real-world performance in some cases.
arXiv Detail & Related papers (2024-03-24T11:09:41Z)
- Reconstructing Objects in-the-wild for Realistic Sensor Simulation [41.55571880832957]
We present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data.
We model the object appearance with a robust physics-inspired reflectance representation effective for in-the-wild data.
Our experiments show that NeuSim has strong view synthesis performance on challenging scenarios with sparse training views.
arXiv Detail & Related papers (2023-11-09T18:58:22Z)
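The NeuSim entry above mentions a physics-inspired reflectance representation without specifying it; a Lambertian diffuse term plus a Blinn-Phong specular term is one generic example of such a model (not necessarily NeuSim's actual formulation):
```python
import torch
import torch.nn.functional as F

def reflectance(normals, light_dir, view_dir, albedo, k_s=0.3, shininess=32.0):
    # All direction tensors are (..., 3) unit vectors; albedo is (..., 3) RGB.
    n_dot_l = (normals * light_dir).sum(-1, keepdim=True).clamp(min=0.0)
    half = F.normalize(light_dir + view_dir, dim=-1)   # Blinn-Phong half vector
    spec = (normals * half).sum(-1, keepdim=True).clamp(min=0.0) ** shininess
    return albedo * n_dot_l + k_s * spec               # diffuse + specular terms
```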
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- Sim2real Transfer Learning for Point Cloud Segmentation: An Industrial Application Case on Autonomous Disassembly [55.41644538483948]
We present an industrial application case that uses sim2real transfer learning for point cloud data.
We provide insights on how to generate and process synthetic point cloud data.
Additionally, a novel patch-based attention network is proposed to tackle this problem, as sketched below.
arXiv Detail & Related papers (2023-01-12T14:00:37Z)
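A hedged sketch of what a patch-based attention module over point-cloud patches could look like; the summary does not describe the actual architecture, so every layer choice here (embedding, pooling, head) is an assumption:
```python
import torch
import torch.nn as nn

class PatchAttention(nn.Module):
    def __init__(self, in_dim=3, embed_dim=128, num_heads=4, num_classes=2):
        super().__init__()
        self.embed = nn.Linear(in_dim, embed_dim)   # per-point embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patches):
        # patches: (batch, num_patches, points_per_patch, in_dim)
        tokens = self.embed(patches).max(dim=2).values  # pool each patch to one token
        tokens, _ = self.attn(tokens, tokens, tokens)   # attend across patches
        return self.head(tokens)                        # per-patch class logits
```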
- RARA: Zero-shot Sim2Real Visual Navigation with Following Foreground Cues [42.998649025215045]
We tackle the specific case of camera-based navigation, formulating it as following a visual cue in the foreground with arbitrary backgrounds.
The goal is to train a visual agent on data captured in an empty simulated environment except for this foreground cue and test this model directly in a visually diverse real world.
arXiv Detail & Related papers (2022-01-08T09:53:21Z)
- VISTA 2.0: An Open, Data-driven Simulator for Multimodal Sensing and Policy Learning for Autonomous Vehicles [131.2240621036954]
We present VISTA, an open source, data-driven simulator that integrates multiple types of sensors for autonomous vehicles.
Using high fidelity, real-world datasets, VISTA represents and simulates RGB cameras, 3D LiDAR, and event-based cameras.
We demonstrate the ability to train and test perception-to-control policies across each of the sensor types and showcase the power of this approach via deployment on a full scale autonomous vehicle.
arXiv Detail & Related papers (2021-11-23T18:58:10Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
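One common way to realize the invariant representations mentioned in the entry above is a gradient-reversal domain classifier (DANN); this is an illustrative stand-in, since the summary does not name the paper's actual mechanism:
```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # Flip gradients so the encoder learns features that fool the
        # domain classifier, pushing sim and real features together.
        return -grad

def domain_invariance_loss(features, domain_labels, domain_classifier):
    # features: (batch, dim); domain_labels: 0 = simulated, 1 = real
    logits = domain_classifier(GradReverse.apply(features)).squeeze(-1)
    return nn.functional.binary_cross_entropy_with_logits(
        logits, domain_labels.float())
```
Note that this uses only unlabeled real frames for the domain classifier, which is consistent with the entry's focus on task labels from the synthetic domain alone.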
- Exploring the Capabilities and Limits of 3D Monocular Object Detection -- A Study on Simulation and Real World Data [0.0]
3D object detection based on monocular camera data is a key enabler for autonomous driving.
Recent deep learning methods show promising results in recovering depth information from single images.
In this paper, we evaluate the performance of a 3D object detection pipeline which is parameterizable with different depth estimation configurations.
arXiv Detail & Related papers (2020-05-15T09:05:17Z)
- Transferable Active Grasping and Real Embodied Dataset [48.887567134129306]
We show how to search for feasible grasping viewpoints using hand-mounted RGB-D cameras.
A practical 3-stage transferable active grasping pipeline that adapts to unseen clutter scenes is developed.
In our pipeline, we propose a novel mask-guided reward to overcome the sparse reward issue in grasping and ensure category-irrelevant behavior.
arXiv Detail & Related papers (2020-04-28T08:15:35Z)
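A hedged sketch of what the mask-guided dense reward described in the entry above could look like; the exact shaping terms and constants are assumptions, as the summary only states that the reward is mask-guided:
```python
import numpy as np

def mask_guided_reward(mask, grasp_succeeded, center_weight=0.5):
    # mask: (H, W) binary segmentation of the target object in the wrist camera.
    if grasp_succeeded:
        return 10.0                    # sparse terminal reward on success
    coverage = float(mask.mean())      # fraction of pixels on the target
    if coverage == 0.0:
        return -0.1                    # small penalty when the target is lost
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    offset = np.hypot(ys.mean() / h - 0.5, xs.mean() / w - 0.5)
    # Dense shaping: keep the target visible and centered in view,
    # independent of object category.
    return coverage + center_weight * (1.0 - offset)
```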