Sim2Real for Self-Supervised Monocular Depth and Segmentation
- URL: http://arxiv.org/abs/2012.00238v1
- Date: Tue, 1 Dec 2020 03:25:02 GMT
- Title: Sim2Real for Self-Supervised Monocular Depth and Segmentation
- Authors: Nithin Raghavan, Punarjay Chakravarty, Shubham Shrivastava
- Abstract summary: Image-based learning methods for autonomous vehicle perception tasks require large quantities of labelled, real data in order to properly train without overfitting.
Recent advances in domain adaptation have indicated that a shared latent space assumption can help to bridge the gap between the simulation and real domains.
We demonstrate that a twin VAE-based architecture with a shared latent space and auxiliary decoders is able to bridge the sim2real gap without requiring any paired, ground-truth data in the real domain.
- Score: 7.376636976924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-based learning methods for autonomous vehicle perception tasks require large quantities of labelled, real data in order to train properly without overfitting, which can often be incredibly costly.
While leveraging the power of simulated data can potentially aid in mitigating these costs, networks trained in the simulation domain usually fail to perform adequately when applied to images in the real domain.
Recent advances in domain adaptation have indicated that a shared latent space assumption can help to bridge the gap between the simulation and real domains, allowing the transference of the predictive capabilities of a network from the simulation domain to the real domain.
We demonstrate that a twin VAE-based architecture with a shared latent space and auxiliary decoders is able to bridge the sim2real gap without requiring any paired, ground-truth data in the real domain.
Using only paired, ground-truth data in the simulation domain, this architecture can generate perception outputs such as depth and segmentation maps.
We compare this method to networks trained in a supervised manner to indicate the merit of these results.
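To make the twin-VAE idea above concrete, the sketch below shows one way such an architecture could be wired up: two domain-specific encoder/decoder pairs share a single latent space, and auxiliary decoders predict depth and segmentation from that shared code, supervised only by simulation ground truth. This is a minimal, hypothetical PyTorch sketch; the module layout, layer sizes, names, input resolution, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a twin VAE with a shared latent space and auxiliary
# task decoders (depth, segmentation). Layer sizes, names, and loss weights
# are illustrative assumptions, not the paper's exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Domain-specific encoder mapping an RGB image to a latent mean/log-variance."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 128 -> 64
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
        )
        self.fc_mu = nn.Linear(128 * 16 * 16, latent_dim)
        self.fc_logvar = nn.Linear(128 * 16 * 16, latent_dim)

    def forward(self, x):                       # x: (B, 3, 128, 128), assumed size
        h = self.conv(x).flatten(1)
        return self.fc_mu(h), self.fc_logvar(h)


class Decoder(nn.Module):
    """Decoder from the shared latent code to a dense map with `out_ch` channels."""
    def __init__(self, out_ch, latent_dim=128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 16 * 16)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1),
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 128, 16, 16)
        return self.deconv(h)


class TwinVAE(nn.Module):
    """Two domain-specific encoder/decoder pairs share one latent space;
    auxiliary decoders predict depth and segmentation from the same code."""
    def __init__(self, latent_dim=128, num_classes=19):
        super().__init__()
        self.enc_sim, self.enc_real = Encoder(latent_dim), Encoder(latent_dim)
        self.dec_sim, self.dec_real = Decoder(3, latent_dim), Decoder(3, latent_dim)
        self.dec_depth = Decoder(1, latent_dim)           # auxiliary: depth
        self.dec_seg = Decoder(num_classes, latent_dim)   # auxiliary: segmentation

    @staticmethod
    def reparameterize(mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, x, domain):
        enc = self.enc_sim if domain == "sim" else self.enc_real
        mu, logvar = enc(x)
        z = self.reparameterize(mu, logvar)
        return {
            "recon": (self.dec_sim if domain == "sim" else self.dec_real)(z),
            "depth": self.dec_depth(z),
            "seg": self.dec_seg(z),
            "mu": mu, "logvar": logvar,
        }


def loss_step(model, x_sim, depth_gt, seg_gt, x_real, beta=1e-3):
    """Depth/segmentation are supervised with simulation labels only; the real
    branch is trained with reconstruction + KL alone (no real ground truth)."""
    out_s, out_r = model(x_sim, "sim"), model(x_real, "real")
    kl = lambda o: -0.5 * torch.mean(1 + o["logvar"] - o["mu"] ** 2 - o["logvar"].exp())
    loss = F.l1_loss(out_s["recon"], x_sim) + F.l1_loss(out_r["recon"], x_real)
    loss = loss + beta * (kl(out_s) + kl(out_r))
    loss = loss + F.l1_loss(out_s["depth"], depth_gt)        # sim-only supervision
    loss = loss + F.cross_entropy(out_s["seg"], seg_gt)      # sim-only supervision
    return loss
```

At inference time, a real image would be encoded with enc_real and decoded with dec_depth or dec_seg; under the shared-latent-space assumption, the task supervision applied on the simulation branch is what transfers to the real branch.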
Related papers
- Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation [65.78246406460305]
Compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation.
We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world).
arXiv Detail & Related papers (2023-08-28T14:43:36Z)
- Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation Using Object Detectors and Analyzing Point Clouds at Target-Level [1.1999555634662635]
LiDAR object detection algorithms based on neural networks for autonomous driving require large amounts of data for training, validation, and testing.
We show that using simulated data for the training of neural networks leads to a domain shift between training and testing data due to differences in scenes, scenarios, and distributions.
arXiv Detail & Related papers (2023-03-03T12:52:01Z)
- One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that abundant unlabeled real-world data samples are available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- ActiveZero: Mixed Domain Learning for Active Stereovision with Zero Annotation [21.33158815473845]
We present a new framework, ActiveZero, which is a mixed domain learning solution for active stereovision systems.
We show how the method can be trained end-to-end and that each module is important for attaining the end result.
arXiv Detail & Related papers (2021-12-06T04:03:47Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- Attention-based Adversarial Appearance Learning of Augmented Pedestrians [49.25430012369125]
We propose a method to synthesize realistic data for the pedestrian recognition task.
Our approach utilizes an attention mechanism driven by an adversarial loss to learn domain discrepancies.
Our experiments confirm that the proposed adaptation method is robust to such discrepancies and achieves both visual realism and semantic consistency.
arXiv Detail & Related papers (2021-07-06T15:27:00Z)
- DIRL: Domain-Invariant Representation Learning for Sim-to-Real Transfer [2.119586259941664]
We present a domain-invariant representation learning (DIRL) algorithm to adapt deep models to the physical environment with a small amount of real data.
Experiments on digit domains yield state-of-the-art performance on challenging benchmarks.
arXiv Detail & Related papers (2020-11-15T17:39:01Z)
- From Simulation to Real World Maneuver Execution using Deep Reinforcement Learning [69.23334811890919]
Deep Reinforcement Learning has proved able to solve many control tasks in different fields, but the behavior of these systems is not always as expected when deployed in real-world scenarios.
This is mainly due to the lack of domain adaptation between simulated and real-world data, together with the absence of a distinction between training and test datasets.
We present a system based on multiple environments in which agents are trained simultaneously, evaluating the behavior of the model in different scenarios.
arXiv Detail & Related papers (2020-05-13T14:22:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.