Move to See Better: Self-Improving Embodied Object Detection
- URL: http://arxiv.org/abs/2012.00057v2
- Date: Mon, 29 Mar 2021 08:09:11 GMT
- Title: Move to See Better: Self-Improving Embodied Object Detection
- Authors: Zhaoyuan Fang, Ayush Jain, Gabriel Sarch, Adam W. Harley, Katerina
Fragkiadaki
- Abstract summary: We propose a method for improving object detection in testing environments.
Our agent collects multi-view data, generates 2D and 3D pseudo-labels, and fine-tunes its detector in a self-supervised manner.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Passive methods for object detection and segmentation treat images of the
same scene as individual samples and do not exploit object permanence across
multiple views. Generalization to novel or difficult viewpoints thus requires
additional training with lots of annotations. In contrast, humans often
recognize objects by simply moving around, to get more informative viewpoints.
In this paper, we propose a method for improving object detection in testing
environments, assuming nothing but an embodied agent with a pre-trained 2D
object detector. Our agent collects multi-view data, generates 2D and 3D
pseudo-labels, and fine-tunes its detector in a self-supervised manner.
Experiments on both indoor and outdoor datasets show that (1) our method
obtains high-quality 2D and 3D pseudo-labels from multi-view RGB-D data; (2)
fine-tuning with these pseudo-labels improves the 2D detector significantly in
the test environment; (3) training a 3D detector with our pseudo-labels
outperforms a prior self-supervised method by a large margin; (4) given weak
supervision, our method can generate better pseudo-labels for novel objects.
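The pipeline in the abstract (move around, unproject 2D detections with depth and pose, keep objects confirmed from several viewpoints as pseudo-labels) can be illustrated with a minimal sketch. This is not the paper's code: it assumes a simplified setting where each 2D detection has already been unprojected to a 3D centroid, and all names and thresholds are illustrative.

```python
def multiview_pseudo_labels(detections, dist_thresh=0.5, min_views=2):
    """Aggregate per-view detections into 3D pseudo-labels by keeping
    objects re-detected from several viewpoints.

    detections: list of (view_id, label, (x, y, z)) -- 3D centroids
    obtained by unprojecting 2D boxes with depth and camera pose.
    """
    clusters = []  # each: {"label", "points", "views", "centroid"}
    for view_id, label, point in detections:
        for c in clusters:
            # Merge into an existing cluster of the same class if close enough.
            if c["label"] == label and _dist(c["centroid"], point) < dist_thresh:
                c["points"].append(point)
                c["views"].add(view_id)
                c["centroid"] = _mean(c["points"])
                break
        else:
            clusters.append({"label": label, "points": [point],
                             "views": {view_id}, "centroid": point})
    # Keep only objects confirmed from at least `min_views` viewpoints;
    # single-view detections are more likely to be false positives.
    return [(c["label"], c["centroid"]) for c in clusters
            if len(c["views"]) >= min_views]

def _dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def _mean(pts):
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(3))
```

For example, a "chair" seen from two views survives the filter while a "table" seen only once is dropped; the surviving centroids (and their re-projections) would then serve as pseudo-labels for fine-tuning the detector.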
Related papers
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework.
We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data.
Our approach improves pseudo-label quality in two distinct ways.
arXiv Detail & Related papers (2024-09-17T05:35:00Z)
- ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only [5.699475977818167]
3D object detection plays a crucial role in various applications such as autonomous vehicles, robotics and augmented reality.
We propose a weakly supervised 3D annotator that relies solely on 2D bounding box annotations from images, along with size priors.
arXiv Detail & Related papers (2024-07-24T11:58:31Z)
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection [55.210991151015534]
We present a novel Dual-Perspective Knowledge Enrichment approach named DPKE for semi-supervised 3D object detection.
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: data-perspective and feature-perspective.
arXiv Detail & Related papers (2024-01-10T08:56:07Z)
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection.
We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency [78.76508318592552]
Monocular 3D object detection has become a mainstream approach in autonomous driving because it is easy to deploy.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method that can train the model with only 2D labels marked on images.
arXiv Detail & Related papers (2023-03-15T15:14:00Z)
- An Empirical Study of Pseudo-Labeling for Image-based 3D Object Detection [72.30883544352918]
We investigate whether pseudo-labels can provide effective supervision for the baseline models under varying settings.
We achieve 20.23 AP for moderate level on the KITTI-3D testing set without bells and whistles, improving the baseline model by 6.03 AP.
We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting.
arXiv Detail & Related papers (2022-08-15T12:17:46Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- Unsupervised Object Detection with LiDAR Clues [70.73881791310495]
We present the first practical method for unsupervised object detection with the aid of LiDAR clues.
In our approach, candidate object segments are first generated from 3D point clouds.
Then, an iterative segment labeling process is conducted to assign segment labels and train a segment labeling network.
The labeling process is carefully designed so as to mitigate the issue of long-tailed and open-ended distribution.
arXiv Detail & Related papers (2020-11-25T18:59:54Z)
- 3D for Free: Crossmodal Transfer Learning using HD Maps [36.70550754737353]
We leverage the large class-taxonomies of modern 2D datasets and the robustness of state-of-the-art 2D detection methods.
We mine a collection of 1151 unlabeled, multimodal driving logs from an autonomous vehicle.
We show that detector performance increases as we mine more unlabeled data.
arXiv Detail & Related papers (2020-08-24T17:54:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.