Multimodal 3D Object Detection on Unseen Domains
- URL: http://arxiv.org/abs/2404.11764v1
- Date: Wed, 17 Apr 2024 21:47:45 GMT
- Title: Multimodal 3D Object Detection on Unseen Domains
- Authors: Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel,
- Abstract summary: Domain adaptation approaches assume access to unannotated samples from the test distribution to address this problem.
We propose CLIX$text3D$, a multimodal fusion and supervised contrastive learning framework for 3D object detection.
We show that CLIX$text3D$ yields state-of-the-art domain generalization performance under multiple dataset shifts.
- Score: 37.142470149311904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR datasets for autonomous driving exhibit biases in properties such as point cloud density, range, and object dimensions. As a result, object detection networks trained and evaluated in different environments often experience performance degradation. Domain adaptation approaches assume access to unannotated samples from the test distribution to address this problem. However, in the real world, the exact conditions of deployment and access to samples representative of the test dataset may be unavailable while training. We argue that the more realistic and challenging formulation is to require robustness in performance to unseen target domains. We propose to address this problem in a two-pronged manner. First, we leverage paired LiDAR-image data present in most autonomous driving datasets to perform multimodal object detection. We suggest that working with multimodal features by leveraging both images and LiDAR point clouds for scene understanding tasks results in object detectors more robust to unseen domain shifts. Second, we train a 3D object detector to learn multimodal object features across different distributions and promote feature invariance across these source domains to improve generalizability to unseen target domains. To this end, we propose CLIX$^\text{3D}$, a multimodal fusion and supervised contrastive learning framework for 3D object detection that performs alignment of object features from same-class samples of different domains while pushing the features from different classes apart. We show that CLIX$^\text{3D}$ yields state-of-the-art domain generalization performance under multiple dataset shifts.
Related papers
- One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection [71.78795573911512]
We propose textbfOneDet3D, a universal one-for-all model that addresses 3D detection across different domains.
We propose the domain-aware in scatter and context, guided by a routing mechanism, to address the data interference issue.
The fully sparse structure and anchor-free head further accommodate point clouds with significant scale disparities.
arXiv Detail & Related papers (2024-11-03T14:21:56Z) - Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$2$Det is a framework for unified and universal multi-dataset training on 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z) - Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM- Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel Supervised Shape&Scale-perceptive Deformable Attention'' (S$3$-DA) module for monocular 3D object detection.
Benefiting from this, S$3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features.
Experiments on KITTI and Open datasets demonstrate that S$3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z) - 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
arXiv Detail & Related papers (2023-08-29T17:58:55Z) - Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection [19.703181080679176]
3D object detection from point clouds is crucial in safety-critical autonomous driving.
We propose a density-insensitive domain adaption framework to address the density-induced domain gap.
Experimental results on three widely adopted 3D object detection datasets demonstrate that our proposed domain adaption method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-04-19T06:33:07Z) - SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from
Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D)
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% labeled target data, our SSDA3D can surpass the fully-supervised oracle model with 100% target label.
arXiv Detail & Related papers (2022-12-06T09:32:44Z) - An Unsupervised Domain Adaptive Approach for Multimodal 2D Object
Detection in Adverse Weather Conditions [5.217255784808035]
We propose an unsupervised domain adaptation framework to bridge the domain gap between source and target domains.
We use a data augmentation scheme that simulates weather distortions to add domain confusion and prevent overfitting on the source data.
Experiments performed on the DENSE dataset show that our method can substantially alleviate the domain gap.
arXiv Detail & Related papers (2022-03-07T18:10:40Z) - Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency [90.71745178767203]
Deep learning-based 3D object detection has achieved unprecedented success with the advent of large-scale autonomous driving datasets.
Existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
We study a more realistic setting, unsupervised 3D domain adaptive detection, which only utilizes source domain annotations.
arXiv Detail & Related papers (2021-07-23T17:19:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.