Localising In Complex Scenes Using Balanced Adversarial Adaptation
- URL: http://arxiv.org/abs/2011.04122v1
- Date: Mon, 9 Nov 2020 00:40:50 GMT
- Title: Localising In Complex Scenes Using Balanced Adversarial Adaptation
- Authors: Gil Avraham, Yan Zuo and Tom Drummond
- Abstract summary: Domain adaptation and generative modelling have collectively mitigated the expensive nature of data collection and labelling.
We study the performance gap that exists between representations optimised for localisation on simulation environments and the application of such representations in a real-world setting.
- Score: 19.160686658569507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain adaptation and generative modelling have collectively mitigated the
expensive nature of data collection and labelling by leveraging the rich
abundance of accurate, labelled data in simulation environments. In this work,
we study the performance gap that exists between representations optimised for
localisation on simulation environments and the application of such
representations in a real-world setting. Our method exploits the shared
geometric similarities between simulation and real-world environments whilst
maintaining invariance towards visual discrepancies. This is achieved by
optimising a representation extractor to project both simulated and real
representations into a shared representation space. Our method uses a
symmetrical adversarial approach which encourages the representation extractor
to conceal the domain that features are extracted from and simultaneously
preserves robust attributes between source and target domains that are
beneficial for localisation. We evaluate our method by adapting representations
optimised for indoor Habitat simulated environments (Matterport3D and Replica)
to a real-world indoor environment (Active Vision Dataset), showing that it
compares favourably against fully-supervised approaches.
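To make the adversarial objective above concrete: a common way to realise this kind of domain confusion is a gradient-reversal layer feeding a domain discriminator, so the discriminator learns to tell simulation from real while reversed gradients push the extractor to hide that distinction. The sketch below is a minimal illustration under that assumption; the network sizes, the `lam` weighting, and the gradient-reversal formulation itself are illustrative choices, not the authors' implementation of the balanced, symmetrical objective.

```python
# Minimal sketch of adversarial domain confusion (illustrative only; the
# paper's balanced, symmetrical formulation is not reproduced here).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the
    backward pass, so the extractor is trained to confuse the domain
    discriminator while the discriminator is trained to separate domains."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Hypothetical sizes: 512-d backbone features projected into a 128-d shared space.
extractor = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
discriminator = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

def domain_confusion_loss(sim_feats, real_feats, lam=1.0):
    """Binary cross-entropy on domain labels, with gradient reversal applied
    to the shared-space features before the discriminator."""
    shared = torch.cat([extractor(sim_feats), extractor(real_feats)])
    labels = torch.cat([torch.zeros(len(sim_feats), 1),   # 0 = simulation
                        torch.ones(len(real_feats), 1)])  # 1 = real world
    logits = discriminator(GradientReversal.apply(shared, lam))
    return F.binary_cross_entropy_with_logits(logits, labels)
```

In practice a loss like this would be combined with the localisation objective on the labelled simulated data, so the shared space also preserves the attributes that are beneficial for localisation.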
Related papers
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
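A minimal sketch of how such an overlap measurement could look, assuming PCA as the reduction and a nearest-neighbour distance as the overlap score (both are assumptions; the paper's actual technique and metric may differ):

```python
# Hedged sketch: scoring synthetic/real overlap in a reduced feature space.
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def overlap_score(synthetic, real, n_components=2):
    """Fit a PCA on the synthetic samples, project both sets, and report the
    mean distance from each real sample to its nearest synthetic neighbour
    (lower means more overlap). Inputs are (n_samples, n_features) arrays."""
    pca = PCA(n_components=n_components).fit(synthetic)
    syn_low, real_low = pca.transform(synthetic), pca.transform(real)
    nn_index = NearestNeighbors(n_neighbors=1).fit(syn_low)
    distances, _ = nn_index.kneighbors(real_low)
    return float(distances.mean())
```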
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Reconstructing Spatiotemporal Data with C-VAEs [49.1574468325115]
Conditional continuous representation of moving regions is commonly used.
In this work, we explore the capabilities of Conditional Variational Autoencoder (C-VAE) models to generate realistic representations of regions' evolution.
arXiv Detail & Related papers (2023-07-12T15:34:10Z)
- HaDR: Applying Domain Randomization for Generating Synthetic Multimodal Dataset for Hand Instance Segmentation in Cluttered Industrial Environments [0.0]
This study uses domain randomization to generate a synthetic RGB-D dataset for training multimodal instance segmentation models.
We show that our approach enables the models to outperform corresponding models trained on existing state-of-the-art datasets.
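Domain randomization of this kind usually amounts to sampling scene parameters independently for each rendered frame; the parameter names and ranges in this sketch are purely illustrative, not taken from the paper:

```python
# Hedged sketch: per-frame scene parameter sampling for domain randomization.
import random

def sample_scene_params(num_textures=500):
    """Draw one randomized scene configuration; a renderer would consume
    this dict to produce a synthetic RGB-D frame with varied appearance."""
    return {
        "light_intensity": random.uniform(0.2, 2.0),           # brightness scale
        "light_position": [random.uniform(-1.0, 1.0) for _ in range(3)],
        "texture_id": random.randrange(num_textures),          # surface texture
        "camera_jitter_deg": random.uniform(-15.0, 15.0),      # viewpoint noise
        "num_distractors": random.randint(0, 20),              # clutter objects
    }
```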
arXiv Detail & Related papers (2023-04-12T13:02:08Z)
- One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that abundant unlabeled real-world samples are available for adaptation during training.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
- Unsupervised Contrastive Domain Adaptation for Semantic Segmentation [75.37470873764855]
We introduce contrastive learning for feature alignment in cross-domain adaptation.
The proposed approach consistently outperforms state-of-the-art methods for domain adaptation.
It achieves 60.2% mIoU on the Cityscapes dataset.
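Contrastive feature alignment in this setting is often instantiated as a prototype-based, InfoNCE-style objective, pulling features towards the prototype of their (pseudo-)class and away from other classes. The sketch below shows that generic family; it is not the paper's exact loss, and the temperature value is an assumption:

```python
# Hedged sketch: prototype-based contrastive alignment (generic InfoNCE form).
import torch
import torch.nn.functional as F

def prototype_contrast_loss(feats, class_ids, prototypes, temperature=0.1):
    """feats: (N, D) features from either domain; class_ids: (N,) integer
    (pseudo-)labels; prototypes: (C, D) per-class prototype features.
    Each feature is attracted to its class prototype and repelled from
    the other class prototypes."""
    feats = F.normalize(feats, dim=1)
    protos = F.normalize(prototypes, dim=1)
    logits = feats @ protos.t() / temperature   # (N, C) cosine similarities
    return F.cross_entropy(logits, class_ids)
```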
arXiv Detail & Related papers (2022-04-18T16:50:46Z)
- Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts.
It hypothesizes that objects with similar appearance share similar representations.
Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z)
- DIRL: Domain-Invariant Representation Learning for Sim-to-Real Transfer [2.119586259941664]
We present a domain-invariant representation learning (DIRL) algorithm to adapt deep models to the physical environment with a small amount of real data.
Experiments on digit domains yield state-of-the-art performance on challenging benchmarks.
arXiv Detail & Related papers (2020-11-15T17:39:01Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
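The stated insight translates directly into a consistency penalty between absolute-pose estimates obtained through different reference images. A minimal sketch, assuming 4x4 homogeneous pose matrices and a simple Frobenius-norm penalty (the paper's actual parameterisation and loss may differ):

```python
# Hedged sketch: transform-consistency penalty between two pose estimates.
import torch

def transform_consistency_loss(T_map_a, T_map_b, T_query_to_a, T_query_to_b):
    """T_map_a / T_map_b: known 4x4 map poses of two reference images.
    T_query_to_a / T_query_to_b: predicted 4x4 relative poses from the
    query to each reference. Both compositions estimate the same absolute
    query pose, so their disagreement is the supervision signal."""
    pose_via_a = T_map_a @ T_query_to_a   # absolute query pose via reference A
    pose_via_b = T_map_b @ T_query_to_b   # absolute query pose via reference B
    return torch.linalg.norm(pose_via_a - pose_via_b)
```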
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
- Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization [30.203072945001136]
In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.
Then, a novel gradient-weighted similarity activation mapping loss (Grad-SAM) is incorporated for finer localization with high accuracy.
Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset.
Our performance is on par with or even outperforms the state-of-the-art image-based localization baselines at medium and high precision.
arXiv Detail & Related papers (2020-09-16T14:43:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.