DI-V2X: Learning Domain-Invariant Representation for
Vehicle-Infrastructure Collaborative 3D Object Detection
- URL: http://arxiv.org/abs/2312.15742v1
- Date: Mon, 25 Dec 2023 14:40:46 GMT
- Title: DI-V2X: Learning Domain-Invariant Representation for
Vehicle-Infrastructure Collaborative 3D Object Detection
- Authors: Li Xiang and Junbo Yin and Wei Li and Cheng-Zhong Xu and Ruigang Yang
and Jianbing Shen
- Abstract summary: DI-V2X aims to learn Domain-Invariant representations through a new distillation framework.
DI-V2X comprises three essential components: a domain-mixing instance augmentation (DMA) module, a progressive domain-invariant distillation (PDD) module, and a domain-adaptive fusion (DAF) module.
- Score: 78.09431523221458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vehicle-to-Everything (V2X) collaborative perception has recently gained
significant attention due to its capability to enhance scene understanding by
integrating information from various agents, e.g., vehicles, and
infrastructure. However, current works often treat the information from each
agent equally, ignoring the inherent domain gap caused by the utilization of
different LiDAR sensors of each agent, thus leading to suboptimal performance.
In this paper, we propose DI-V2X, that aims to learn Domain-Invariant
representations through a new distillation framework to mitigate the domain
discrepancy in the context of V2X 3D object detection. DI-V2X comprises three
essential components: a domain-mixing instance augmentation (DMA) module, a
progressive domain-invariant distillation (PDD) module, and a domain-adaptive
fusion (DAF) module. Specifically, DMA builds a domain-mixing 3D instance bank
for the teacher and student models during training, resulting in aligned data
representation. Next, PDD encourages the student models from different domains
to gradually learn a domain-invariant feature representation towards the
teacher, where the overlapping regions between agents are employed as guidance
to facilitate the distillation process. Furthermore, DAF closes the domain gap
between the students by incorporating calibration-aware domain-adaptive
attention. Extensive experiments on the challenging DAIR-V2X and V2XSet
benchmark datasets demonstrate DI-V2X achieves remarkable performance,
outperforming all the previous V2X models. Code is available at
https://github.com/Serenos/DI-V2X
Related papers
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z) - Multimodal 3D Object Detection on Unseen Domains [37.142470149311904]
Domain adaptation approaches assume access to unannotated samples from the test distribution to address this problem.
We propose CLIX$text3D$, a multimodal fusion and supervised contrastive learning framework for 3D object detection.
We show that CLIX$text3D$ yields state-of-the-art domain generalization performance under multiple dataset shifts.
arXiv Detail & Related papers (2024-04-17T21:47:45Z) - Compositional Semantic Mix for Domain Adaptation in Point Cloud
Segmentation [65.78246406460305]
compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation.
We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world)
arXiv Detail & Related papers (2023-08-28T14:43:36Z) - SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from
Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D)
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% labeled target data, our SSDA3D can surpass the fully-supervised oracle model with 100% target label.
arXiv Detail & Related papers (2022-12-06T09:32:44Z) - CL3D: Unsupervised Domain Adaptation for Cross-LiDAR 3D Detection [16.021932740447966]
Domain adaptation for Cross-LiDAR 3D detection is challenging due to the large gap on the raw data representation.
We present an unsupervised domain adaptation method that overcomes above difficulties.
arXiv Detail & Related papers (2022-12-01T03:22:55Z) - Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval [55.122020263319634]
Video moment retrieval (VMR) aims to localize the target moment from an untrimmed video according to a given language query.
In this paper, we focus on a novel task: cross-domain VMR, where fully-annotated datasets are available in one domain but the domain of interest only contains unannotated datasets.
We propose a novel Multi-Modal Cross-Domain Alignment network to transfer the annotation knowledge from the source domain to the target domain.
arXiv Detail & Related papers (2022-09-23T12:58:20Z) - An Unsupervised Domain Adaptive Approach for Multimodal 2D Object
Detection in Adverse Weather Conditions [5.217255784808035]
We propose an unsupervised domain adaptation framework to bridge the domain gap between source and target domains.
We use a data augmentation scheme that simulates weather distortions to add domain confusion and prevent overfitting on the source data.
Experiments performed on the DENSE dataset show that our method can substantially alleviate the domain gap.
arXiv Detail & Related papers (2022-03-07T18:10:40Z) - Student Become Decathlon Master in Retinal Vessel Segmentation via
Dual-teacher Multi-target Domain Adaptation [1.121358474059223]
We propose RVms, a novel unsupervised multi-target domain adaptation approach to segment retinal vessels (RVs) from multimodal and multicenter retinal images.
RVms is found to be very close to the target-trained Oracle in terms of segmenting the RVs, largely outperforming other state-of-the-art methods.
arXiv Detail & Related papers (2022-03-07T02:20:14Z) - Adaptive Hierarchical Dual Consistency for Semi-Supervised Left Atrium
Segmentation on Cross-Domain Data [8.645556125521246]
Generalising semi-supervised learning to cross-domain data is of high importance to improve model robustness.
The AHDC consists of a Bidirectional Adversarial Inference module (BAI) and a Hierarchical Dual Consistency learning module (HDC)
We demonstrate the performance of our proposed AHDC on four 3D late gadolinium enhancement cardiac MR (LGE-CMR) datasets from different centres and a 3D CT dataset.
arXiv Detail & Related papers (2021-09-17T02:15:10Z) - Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation [56.94873619509414]
Conventional unsupervised domain adaptation studies the knowledge transfer between a limited number of domains.
We propose a novel Domain2Vec model to provide vectorial representations of visual domains based on joint learning of feature disentanglement and Gram matrix.
We demonstrate that our embedding is capable of predicting domain similarities that match our intuition about visual relations between different domains.
arXiv Detail & Related papers (2020-07-17T22:05:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.