Supervision by Registration and Triangulation for Landmark Detection
- URL: http://arxiv.org/abs/2101.09866v1
- Date: Mon, 25 Jan 2021 02:48:21 GMT
- Title: Supervision by Registration and Triangulation for Landmark Detection
- Authors: Xuanyi Dong, Yi Yang, Shih-En Wei, Xinshuo Weng, Yaser Sheikh, Shoou-I Yu
- Abstract summary: We present Supervision by Registration and Triangulation (SRT), an unsupervised approach that utilizes unlabeled multi-view video to improve the accuracy and precision of landmark detectors.
Being able to utilize unlabeled data enables our detectors to learn from massive amounts of unlabeled data freely available.
- Score: 70.13440728689231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Supervision by Registration and Triangulation (SRT), an
unsupervised approach that utilizes unlabeled multi-view video to improve the
accuracy and precision of landmark detectors. Being able to utilize unlabeled
data enables our detectors to learn from massive amounts of unlabeled data
freely available and not be limited by the quality and quantity of manual human
annotations. To utilize unlabeled data, there are two key observations: (1) the
detections of the same landmark in adjacent frames should be coherent with
registration, i.e., optical flow; (2) the detections of the same landmark in
multiple synchronized and geometrically calibrated views should correspond to a
single 3D point, i.e., multi-view consistency. Registration and multi-view
consistency are sources of supervision that do not require manual labeling,
thus they can be leveraged to augment existing training data during detector
training. End-to-end training is made possible by differentiable registration
and 3D triangulation modules. Experiments with 11 datasets and a newly proposed
metric to measure precision demonstrate accuracy and precision improvements in
landmark detection on both images and video. Code is available at
https://github.com/D-X-Y/landmark-detection.
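The two supervision signals described above can be made concrete with a short sketch. The following is a minimal PyTorch illustration, not the authors' released implementation: the function names (`registration_loss`, `triangulation_loss`), the use of 3x4 projection matrices, and the DLT-based triangulation are illustrative assumptions about how registration and multi-view consistency can serve as differentiable, label-free losses.

```python
# Minimal sketch of the two unsupervised supervision signals (assumed forms,
# not the paper's exact losses). Both are differentiable, so they can be
# added to a detector's training objective without manual labels.
import torch


def registration_loss(pred_t, pred_t1, flow_t):
    """Temporal coherence: a landmark detected at frame t, displaced by the
    optical flow sampled at that location, should land on the detection at
    frame t+1.

    pred_t, pred_t1: (N, 2) landmark coordinates at frames t and t+1.
    flow_t:          (N, 2) optical-flow vectors sampled at pred_t.
    """
    return ((pred_t + flow_t) - pred_t1).pow(2).sum(dim=1).mean()


def triangulation_loss(preds_2d, proj_mats):
    """Multi-view consistency: detections of one landmark in calibrated views
    should triangulate to a single 3D point with low reprojection error.

    preds_2d:  (V, N, 2) detections of N landmarks in V calibrated views.
    proj_mats: (V, 3, 4) camera projection matrices.
    """
    V, N, _ = preds_2d.shape
    points_3d = []
    for n in range(N):
        # Build the DLT system A x = 0 for landmark n across all views.
        rows = []
        for v in range(V):
            x, y = preds_2d[v, n, 0], preds_2d[v, n, 1]
            P = proj_mats[v]
            rows.append(x * P[2] - P[0])
            rows.append(y * P[2] - P[1])
        A = torch.stack(rows)                      # (2V, 4)
        # Smallest right singular vector gives the homogeneous 3D point.
        _, _, Vh = torch.linalg.svd(A)
        X = Vh[-1]
        points_3d.append(X / X[3])
    points_3d = torch.stack(points_3d)             # (N, 4) homogeneous

    # Reproject the triangulated points and penalize deviation from detections.
    reproj = torch.einsum('vij,nj->vni', proj_mats, points_3d)   # (V, N, 3)
    reproj = reproj[..., :2] / reproj[..., 2:3]
    return (reproj - preds_2d).pow(2).sum(dim=-1).mean()
```

Because both losses are written with differentiable tensor operations (including the SVD used for triangulation), gradients flow back to the 2D landmark predictions, which is what enables the end-to-end training mentioned in the abstract.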
Related papers
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection [55.210991151015534]
We present a novel Dual-Perspective Knowledge Enrichment approach named DPKE for semi-supervised 3D object detection.
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: data-perspective and feature-perspective.
arXiv Detail & Related papers (2024-01-10T08:56:07Z)
- Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency [78.76508318592552]
Monocular 3D object detection has become a mainstream approach in autonomous driving because of its ease of application.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images.
arXiv Detail & Related papers (2023-03-15T15:14:00Z)
- GraffMatch: Global Matching of 3D Lines and Planes for Wide Baseline LiDAR Registration [41.00550745153015]
Using geometric landmarks like lines and planes can increase navigation accuracy and decrease map storage requirements.
However, landmark-based registration for applications like loop closure detection is challenging because a reliable initial guess is not available.
We adopt the affine Grassmannian manifold to represent 3D lines and planes and prove that the distance between two landmarks is invariant to rotation and translation.
arXiv Detail & Related papers (2022-12-24T15:02:15Z)
- GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation [70.75100533512021]
In this paper, we formulate the label uncertainty problem as the diversity of potentially plausible bounding boxes of objects.
We propose GLENet, a generative framework adapted from conditional variational autoencoders, to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes with latent variables.
The label uncertainty generated by GLENet is a plug-and-play module and can be conveniently integrated into existing deep 3D detectors.
arXiv Detail & Related papers (2022-07-06T06:26:17Z)
- Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera [83.31666463259849]
We propose a method to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors.
We show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained using manual annotations.
Our method is an effective way to improve person detectors during deployment without any additional labeling effort.
arXiv Detail & Related papers (2020-12-16T12:10:04Z)
- Move to See Better: Self-Improving Embodied Object Detection [35.461141354989714]
We propose a method for improving object detection in testing environments.
Our agent collects multi-view data, generates 2D and 3D pseudo-labels, and fine-tunes its detector in a self-supervised manner.
arXiv Detail & Related papers (2020-11-30T19:16:51Z)
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully- and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)