Source-Free and Image-Only Unsupervised Domain Adaptation for Category
Level Object Pose Estimation
- URL: http://arxiv.org/abs/2401.10848v1
- Date: Fri, 19 Jan 2024 17:48:05 GMT
- Title: Source-Free and Image-Only Unsupervised Domain Adaptation for Category
Level Object Pose Estimation
- Authors: Prakhar Kaushik, Aayush Mishra, Adam Kortylewski, Alan Yuille
- Abstract summary: 3DUDA is a method capable of adapting to a nuisance-ridden target domain without 3D or depth data.
We represent object categories as simple cuboid meshes, and harness a generative model of neural feature activations.
We show that our method simulates fine-tuning on a global pseudo-labeled dataset under mild assumptions.
- Score: 18.011044932979143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of source-free unsupervised category-level pose
estimation from only RGB images to a target domain without any access to source
domain data or 3D annotations during adaptation. Collecting and annotating
real-world 3D data and corresponding images is a laborious, expensive, yet
unavoidable process, since even 3D pose domain adaptation methods require 3D
data in the target domain. We introduce 3DUDA, a method capable of adapting to
a nuisance-ridden target domain without 3D or depth data. Our key insight stems
from the observation that specific object subparts remain stable across
out-of-domain (OOD) scenarios, enabling strategic utilization of these
invariant subcomponents for effective model updates. We represent object
categories as simple cuboid meshes, and harness a generative model of neural
feature activations modeled at each mesh vertex learnt using differentiable
rendering. We focus on individual locally robust mesh vertex features and
iteratively update them based on their proximity to corresponding features in
the target domain even when the global pose is not correct. Our model is then
trained in an EM fashion, alternating between updating the vertex features and
the feature extractor. We show that our method simulates fine-tuning on a
global pseudo-labeled dataset under mild assumptions, which converges to the
target domain asymptotically. Through extensive empirical validation, including
a complex extreme UDA setup which combines real nuisances, synthetic noise, and
occlusion, we demonstrate the potency of our simple approach in addressing the
domain shift challenge and significantly improving pose estimation accuracy.
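The selective vertex-feature update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the cosine-similarity criterion, the `threshold` and `momentum` parameters, and the function names are all assumptions made for illustration, and the feature-extractor update and the rendering step are omitted entirely.

```python
import numpy as np


def normalize(x, axis=-1, eps=1e-8):
    """L2-normalize feature vectors along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def selective_vertex_update(vertex_feats, target_feats, threshold=0.8, momentum=0.9):
    """Update only the vertex features that already lie close to the target.

    vertex_feats: (V, D) array, one stored feature per mesh vertex.
    target_feats: (V, D) array, target-domain features assumed to be
        sampled at the image locations of the same vertices.
    threshold, momentum: hypothetical hyperparameters, not from the paper.
    Returns the updated (V, D) features and a boolean mask of updated vertices.
    """
    v = normalize(vertex_feats)
    t = normalize(target_feats)
    similarity = np.sum(v * t, axis=1)      # cosine similarity per vertex
    robust = similarity >= threshold        # vertices treated as locally robust
    updated = v.copy()
    # Moving-average update for robust vertices only; the rest stay unchanged.
    updated[robust] = normalize(momentum * v[robust] + (1.0 - momentum) * t[robust])
    return updated, robust


stored = np.array([[1.0, 0.0], [0.0, 1.0]])
observed = np.array([[1.0, 0.1], [1.0, 0.0]])
new_feats, mask = selective_vertex_update(stored, observed)
```

In this toy input the first vertex is close to its target feature and is nudged toward it, while the second is orthogonal to its target and is left untouched. In an EM-style loop, a step like this would alternate with an update of the feature extractor.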
Related papers
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning [21.063779140059157]
Existing 3D object detection suffers from expensive annotation costs and poor transferability to unknown data due to the domain gap.
We propose a novel unsupervised domain adaptation framework for 3D object detection that combines self-training (ST) and adversarial learning (AL), dubbed STAL3D, exploiting the complementary advantages of pseudo labels and feature distribution alignment.
arXiv Detail & Related papers (2024-06-27T17:43:35Z)
- CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection [14.063365469339812]
LiDAR-based 3D Object Detection methods often do not generalize well to target domains outside the source (or training) data distribution.
We introduce a novel unsupervised domain adaptation (UDA) method, called CMDA, which leverages visual semantic cues from an image modality.
We also introduce a self-training-based learning strategy, wherein a model is adversarially trained to generate domain-invariant features.
arXiv Detail & Related papers (2024-03-06T14:12:38Z)
- 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
arXiv Detail & Related papers (2023-08-29T17:58:55Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z)
- Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [57.25828870799331]
We propose STMono3D, a new self-teaching framework for unsupervised domain adaptation on Mono3D.
We develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain.
STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.
arXiv Detail & Related papers (2022-04-25T12:23:07Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.