Attention-based Adversarial Appearance Learning of Augmented Pedestrians
- URL: http://arxiv.org/abs/2107.02673v2
- Date: Wed, 22 Nov 2023 15:27:53 GMT
- Title: Attention-based Adversarial Appearance Learning of Augmented Pedestrians
- Authors: Kevin Strauss, Artem Savkin, Federico Tombari
- Abstract summary: We propose a method to synthesize realistic data for the pedestrian recognition task.
Our approach utilizes an attention mechanism driven by an adversarial loss to learn domain discrepancies.
Our experiments confirm that the proposed adaptation method is robust to such discrepancies and reveals both visual realism and semantic consistency.
- Score: 49.25430012369125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic data became already an essential component of machine
learning-based perception in the field of autonomous driving. Yet it still
cannot replace real data completely due to the sim2real domain shift. In this
work, we propose a method that leverages the advantages of the augmentation
process and adversarial training to synthesize realistic data for the
pedestrian recognition task. Our approach utilizes an attention mechanism
driven by an adversarial loss to learn domain discrepancies and improve
sim2real adaptation. Our experiments confirm that the proposed adaptation
method is robust to such discrepancies and reveals both visual realism and
semantic consistency. Furthermore, we evaluate our data generation pipeline on
the task of pedestrian recognition and demonstrate that generated data resemble
properties of the real domain.
Related papers
- Transferring disentangled representations: bridging the gap between synthetic and real images [1.0760018917783072]
We investigate the possibility of leveraging synthetic data to learn general-purpose disentangled representations applicable to real data.
We propose a new interpretable intervention-based metric, to measure the quality of factors encoding in the representation.
Our results indicate that some level of disentanglement, transferring a representation from synthetic to real data, is possible and effective.
arXiv Detail & Related papers (2024-09-26T16:25:48Z) - Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap [6.393953433174051]
We propose a novel perspective for addressing the real-to-simulated data gap.
We conduct the first large-scale investigation into the real-to-simulated data gap in an autonomous driving setting.
Our results show notable improvements in model robustness to simulated data, even improving real-world performance in some cases.
arXiv Detail & Related papers (2024-03-24T11:09:41Z) - From Synthetic to Real: Unveiling the Power of Synthetic Data for Video
Person Re-ID [15.81210364737776]
We study a new problem of cross-domain video based person re-identification (Re-ID)
We take the synthetic video dataset as the source domain for training and use the real-world videos for testing.
We are surprised to find that the synthetic data performs even better than the real data in the cross-domain setting.
arXiv Detail & Related papers (2024-02-03T10:19:21Z) - Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2)
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z) - One-Shot Domain Adaptive and Generalizable Semantic Segmentation with
Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the
Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - Towards Optimal Strategies for Training Self-Driving Perception Models
in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised
Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos, and inherent correlations in multi-modal towards recognizing gesture.
Results show that our approach recovers the performance with great improvement gains, up to 12.91% in ACC and 20.16% in F1score without using any annotations in real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.