CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection
- URL: http://arxiv.org/abs/2403.03721v2
- Date: Thu, 7 Mar 2024 02:20:27 GMT
- Authors: Gyusam Chang, Wonseok Roh, Sujin Jang, Dongwook Lee, Daehyun Ji,
Gyeongrok Oh, Jinsun Park, Jinkyu Kim, Sangpil Kim
- Abstract summary: LiDAR-based 3D Object Detection methods often do not generalize well to target domains outside the source (or training) data distribution.
We introduce a novel unsupervised domain adaptation (UDA) method, called CMDA, which leverages visual semantic cues from an image modality.
We also introduce a self-training-based learning strategy, wherein a model is adversarially trained to generate domain-invariant features.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent LiDAR-based 3D Object Detection (3DOD) methods show promising results,
but they often do not generalize well to target domains outside the source (or
training) data distribution. To reduce such domain gaps and thus to make 3DOD
models more generalizable, we introduce a novel unsupervised domain adaptation
(UDA) method, called CMDA, which (i) leverages visual semantic cues from an
image modality (i.e., camera images) as an effective semantic bridge to close
the domain gap in the cross-modal Bird's Eye View (BEV) representations.
Further, (ii) we also introduce a self-training-based learning strategy,
wherein a model is adversarially trained to generate domain-invariant features,
which disrupt the discrimination of whether a feature instance comes from a
source or an unseen target domain. Overall, our CMDA framework guides the 3DOD
model to generate highly informative and domain-adaptive features for novel
data distributions. In extensive experiments on large-scale benchmarks such as
nuScenes, Waymo, and KITTI, the two components described above provide
significant performance gains for UDA tasks, achieving state-of-the-art
performance.
Related papers
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z)
- STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning [21.063779140059157]
Existing 3D object detection suffers from expensive annotation costs and poor transferability to unknown data due to the domain gap.
We propose a novel unsupervised domain adaptation framework for 3D object detection via collaborating ST and AL, dubbed as STAL3D, unleashing the complementary advantages of pseudo labels and feature distribution alignment.
arXiv Detail & Related papers (2024-06-27T17:43:35Z)
- Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation [17.875516787157018]
We study how to harness the knowledge priors learned by 2D visual foundation models to produce more accurate labels for unlabeled target domains.
Our method is evaluated on various autonomous driving datasets, and the results demonstrate a significant improvement on the 3D segmentation task.
arXiv Detail & Related papers (2024-03-15T03:58:17Z)
- Domain Generalization of 3D Object Detection by Density-Resampling [14.510085711178217]
Point-cloud-based 3D object detection suffers from performance degradation when encountering data with novel domain gaps.
We propose an SDG method to improve the generalizability of 3D object detection to unseen target domains.
Our work introduces a novel data augmentation method and contributes a new multi-task learning strategy in the methodology.
arXiv Detail & Related papers (2023-11-17T20:01:29Z)
- BEV-DG: Cross-Modal Learning under Bird's-Eye View for Domain Generalization of 3D Semantic Segmentation [59.99683295806698]
Cross-modal Unsupervised Domain Adaptation (UDA) aims to exploit the complementarity of 2D-3D data to overcome the lack of annotation in a new domain.
We propose cross-modal learning under bird's-eye view for Domain Generalization (DG) of 3D semantic segmentation, called BEV-DG.
arXiv Detail & Related papers (2023-08-12T11:09:17Z)
- Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data.
Open-set domain adaptation (ODA) has emerged as a potential solution to identify these classes during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z)
- SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D).
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% labeled target data, SSDA3D can surpass the fully-supervised oracle model trained with 100% of the target labels.
arXiv Detail & Related papers (2022-12-06T09:32:44Z)
- Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [57.25828870799331]
We propose STMono3D, a new self-teaching framework for unsupervised domain adaptation on Mono3D.
We develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain.
STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.
arXiv Detail & Related papers (2022-04-25T12:23:07Z)
- Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency [90.71745178767203]
Deep learning-based 3D object detection has achieved unprecedented success with the advent of large-scale autonomous driving datasets.
Existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
We study a more realistic setting, unsupervised 3D domain adaptive detection, which only utilizes source domain annotations.
arXiv Detail & Related papers (2021-07-23T17:19:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all of its contents) and is not responsible for any consequences arising from its use.