S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from
Simulation to Reality
- URL: http://arxiv.org/abs/2307.07935v4
- Date: Tue, 20 Feb 2024 20:50:55 GMT
- Title: S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from
Simulation to Reality
- Authors: Jinlong Li, Runsheng Xu, Xinyu Liu, Baolu Li, Qin Zou, Jiaqi Ma,
Hongkai Yu
- Abstract summary: We propose the first Simulation-to-Reality transfer learning framework for multi-agent cooperative perception using a novel Vision Transformer, named S2R-ViT.
Our experiments on the public multi-agent cooperative perception datasets OPV2V and V2V4Real demonstrate that the proposed S2R-ViT can effectively bridge the gap from simulation to reality.
- Score: 41.25312194294171
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the lack of sufficient real multi-agent data and the
time-consuming nature of labeling, existing multi-agent cooperative perception
algorithms usually rely on simulated sensor data for training and validation.
However, perception performance degrades when these simulation-trained models
are deployed to the real world, due to the significant domain gap between
simulated and real data. In this paper, we propose the first
Simulation-to-Reality transfer learning framework for multi-agent cooperative
perception using a novel Vision Transformer, named S2R-ViT, which considers
both the Deployment Gap and the Feature Gap between simulated and real data.
We investigate the effects of these two types of domain gap and propose a
novel uncertainty-aware vision transformer to effectively relieve the
Deployment Gap, and an agent-based feature adaptation module with inter-agent
and ego-agent discriminators to reduce the Feature Gap. Our extensive
experiments on the public multi-agent cooperative perception datasets OPV2V
and V2V4Real demonstrate that the proposed S2R-ViT can effectively bridge the
gap from simulation to reality and significantly outperforms other methods on
point cloud-based 3D object detection.
Related papers
- Assessing Quality Metrics for Neural Reality Gap Input Mitigation in Autonomous Driving Testing [2.194575078433007]
Simulation-based testing of automated driving systems (ADS) is the industry standard, being a controlled, safe, and cost-effective alternative to real-world testing.
Despite these advantages, virtual simulations often fail to accurately replicate real-world conditions like image fidelity, texture representation, and environmental accuracy.
This can lead to significant differences in ADS behavior between simulated and real-world domains, a phenomenon known as the sim2real gap.
Researchers have used Image-to-Image (I2I) neural translation to mitigate the sim2real gap, enhancing the realism of simulated environments by transforming synthetic data into more authentic
arXiv Detail & Related papers (2024-04-29T10:37:38Z) - Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap [6.393953433174051]
We propose a novel perspective for addressing the real-to-simulated data gap.
We conduct the first large-scale investigation into the real-to-simulated data gap in an autonomous driving setting.
Our results show notable improvements in model robustness to simulated data, even improving real-world performance in some cases.
arXiv Detail & Related papers (2024-03-24T11:09:41Z) - DUSA: Decoupled Unsupervised Sim2Real Adaptation for
Vehicle-to-Everything Collaborative Perception [17.595237664316148]
Vehicle-to-Everything (V2X) collaborative perception is crucial for autonomous driving.
Achieving high-precision V2X perception, however, requires a significant amount of annotated real-world data.
We present a new unsupervised sim2real domain adaptation method for V2X collaborative detection named Decoupled Unsupervised Sim2Real Adaptation (DUSA).
arXiv Detail & Related papers (2023-10-12T08:21:17Z) - INTAGS: Interactive Agent-Guided Simulation [4.04638613278729]
In many applications involving multi-agent systems (MAS), it is imperative to test an experimental (Exp) autonomous agent in a high-fidelity simulator prior to its deployment to production.
We propose a metric to distinguish between real and synthetic multi-agent systems, which is evaluated through the live interaction between the Exp agent and the background (BG) agents.
We show that using INTAGS to calibrate the simulator can generate more realistic market data compared to the state-of-the-art conditional Wasserstein Generative Adversarial Network approach.
arXiv Detail & Related papers (2023-09-04T19:56:18Z) - A Platform-Agnostic Deep Reinforcement Learning Framework for Effective Sim2Real Transfer towards Autonomous Driving [0.0]
Deep Reinforcement Learning (DRL) has shown remarkable success in solving complex tasks.
However, transferring DRL agents to the real world remains challenging due to the significant discrepancies between simulation and reality.
We propose a robust DRL framework that leverages platform-dependent perception modules to extract task-relevant information.
arXiv Detail & Related papers (2023-04-14T07:55:07Z) - One-Shot Domain Adaptive and Generalizable Semantic Segmentation with
Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
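The key idea behind a differentiable agent-based model is that simulation outcomes become smooth functions of epidemiological parameters, so calibration can use gradients instead of black-box search. The toy sketch below is not GradABM itself: it runs a vectorized SIR-style update and estimates the gradient of the epidemic peak with respect to the transmission rate by finite differences, standing in for the automatic differentiation a differentiable ABM would provide. All names and parameter values are illustrative.

```python
def sir_step(s, i, r, beta, gamma):
    """One SIR update on population fractions; smooth in beta and gamma."""
    new_inf = beta * s * i   # newly infected fraction this step
    new_rec = gamma * i      # newly recovered fraction this step
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def simulate_peak(beta, gamma=0.1, steps=50):
    """Run the epidemic forward and return the peak infected fraction."""
    s, i, r = 0.99, 0.01, 0.0
    peak = i
    for _ in range(steps):
        s, i, r = sir_step(s, i, r, beta, gamma)
        peak = max(peak, i)
    return peak

# Finite-difference gradient of the peak w.r.t. beta; a differentiable
# ABM would obtain this (and gradients for many more parameters) via
# automatic differentiation and use it for gradient-based calibration.
eps = 1e-5
grad = (simulate_peak(0.3 + eps) - simulate_peak(0.3 - eps)) / (2 * eps)
```

With gradients available, fitting the model to observed case counts reduces to ordinary gradient descent over the parameter vector.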
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the
Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised
Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations in multi-modal data for gesture recognition.
Results show that our approach recovers performance with large gains, up to 12.91% in accuracy (ACC) and 20.16% in F1-score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z) - TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.