Related papers: Adversarial Appearance Learning in Augmented Cityscapes for Pedestrian Recognition in Autonomous Driving

URL: http://arxiv.org/abs/2509.13507v1
Date: Tue, 16 Sep 2025 20:12:33 GMT
Title: Adversarial Appearance Learning in Augmented Cityscapes for Pedestrian Recognition in Autonomous Driving
Authors: Artem Savkin, Thomas Lapotre, Kevin Strauss, Uzair Akbar, Federico Tombari
Abstract summary: In autonomous driving, synthetic data is crucial for covering specific traffic scenarios that an autonomous vehicle must handle. In this paper we deploy data augmentation to generate custom traffic scenarios with VRUs in order to improve pedestrian recognition.
Score: 39.61652266573024
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In autonomous driving, synthetic data is crucial for covering specific traffic scenarios that an autonomous vehicle must handle. Such data commonly introduces a domain gap between the synthetic and real domains. In this paper we deploy data augmentation to generate custom traffic scenarios with vulnerable road users (VRUs) in order to improve pedestrian recognition. We provide a pipeline for augmenting the Cityscapes dataset with virtual pedestrians. To improve the realism of the augmentation, we present a novel generative network architecture for adversarial learning of the dataset's lighting conditions. We also evaluate our approach on the tasks of semantic and instance segmentation.

Related papers

Wireless Traffic Prediction with Large Language Model [54.07581399989292]
TIDES is a novel framework that captures spatial-temporal correlations for wireless traffic prediction. TIDES achieves efficient adaptation to domain-specific patterns without incurring excessive training overhead. Our results indicate that integrating spatial awareness into LLM-based predictors is the key to unlocking scalable and intelligent network management in future 6G systems.
arXiv Detail & Related papers (2025-12-19T04:47:40Z)
Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks [47.07188762367792]
We present ARSim, a framework designed to enhance real multi-view image data with 3D synthetic objects of interest. We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it. The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles.
arXiv Detail & Related papers (2024-03-22T17:49:11Z)
TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [149.5716746789134]
We show data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving. Experiments on the open motion dataset show TrafficBots can simulate realistic multi-agent behaviors.
arXiv Detail & Related papers (2023-03-07T18:28:41Z)
Foresee What You Will Learn: Data Augmentation for Domain Generalization in Non-Stationary Environments [14.344721944207599]
Existing domain generalization aims to learn a generalizable model that performs well even on unseen domains. We propose Directional Domain Augmentation (DDA), which simulates the unseen target features by mapping source data as augmentations through a domain transformer. We evaluate the proposed method on both synthetic and real-world datasets, and empirical results show that our approach can outperform other existing methods.
arXiv Detail & Related papers (2023-01-19T01:51:37Z)
CARNet: A Dynamic Autoencoder for Learning Latent Dynamics in Autonomous Driving Tasks [11.489187712465325]
An autonomous driving system should effectively use the information collected from the various sensors in order to form an abstract description of the world. Deep learning models, such as autoencoders, can be used for that purpose, as they can learn compact latent representations from a stream of incoming data. This work proposes CARNet, a Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder combined with a recurrent neural network to learn the current latent representation.
arXiv Detail & Related papers (2022-05-18T04:15:42Z)
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone. Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator. We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
Attention-based Adversarial Appearance Learning of Augmented Pedestrians [49.25430012369125]
We propose a method to synthesize realistic data for the pedestrian recognition task. Our approach utilizes an attention mechanism driven by an adversarial loss to learn domain discrepancies. Our experiments confirm that the proposed adaptation method is robust to such discrepancies and reveals both visual realism and semantic consistency.
arXiv Detail & Related papers (2021-07-06T15:27:00Z)
Improving Generalization of Transfer Learning Across Domains Using Spatio-Temporal Features in Autonomous Driving [45.655433907239804]
Vehicle simulation can be used to learn in the virtual world, and the acquired skills can be transferred to handle real-world scenarios. These visual elements are intuitively crucial for human decision making during driving. We propose a CNN+LSTM transfer learning framework to extract the spatio-temporal features representing vehicle dynamics from scenes.
arXiv Detail & Related papers (2021-03-15T03:26:06Z)
Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.