OSSID: Online Self-Supervised Instance Detection by (and for) Pose
Estimation
- URL: http://arxiv.org/abs/2201.07309v1
- Date: Tue, 18 Jan 2022 20:55:56 GMT
- Title: OSSID: Online Self-Supervised Instance Detection by (and for) Pose
Estimation
- Authors: Qiao Gu, Brian Okorn, David Held
- Abstract summary: Real-time object pose estimation is necessary for many robot manipulation algorithms.
We propose the OSSID framework, leveraging a slow zero-shot pose estimator to self-supervise the training of a fast detection algorithm.
We show that this self-supervised training exceeds the performance of existing zero-shot detection methods.
- Score: 17.78557307620686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time object pose estimation is necessary for many robot manipulation
algorithms. However, state-of-the-art methods for object pose estimation are
trained for a specific set of objects; these methods thus need to be retrained
to estimate the pose of each new object, often requiring tens of GPU-days of
training for optimal performance. In this paper, we propose the OSSID
framework, leveraging a slow zero-shot pose estimator to self-supervise the
training of a fast detection algorithm. This fast detector can then be used to
filter the input to the pose estimator, drastically improving its inference
speed. We show that this self-supervised training exceeds the performance of
existing zero-shot detection methods on two widely used object pose estimation
and detection datasets, without requiring any human annotations. Further, we
show that the resulting method for pose estimation has a significantly faster
inference speed, due to the ability to filter out large parts of the image.
Thus, our method for self-supervised online learning of a detector (trained
using pseudo-labels from a slow pose estimator) leads to accurate pose
estimation at real-time speeds, without requiring human annotations.
Supplementary materials and code can be found at
https://georgegu1997.github.io/OSSID/
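To make the pipeline described in the abstract concrete, here is a minimal, hypothetical sketch (in Python) of the control flow it outlines: a slow zero-shot pose estimator handles early frames, its estimates serve as pseudo-labels that train a fast detector online, and once the detector is reliable it filters the input so the pose estimator only has to score a small region. The function and argument names (`zero_shot_estimate`, `detect`, `update_detector`, `warmup`) are illustrative assumptions, not the authors' actual API.

```python
# Minimal sketch of an OSSID-style online self-supervision loop.
# All names here are illustrative assumptions, not the authors' code.

def ossid_style_loop(frames, zero_shot_estimate, detect, update_detector, warmup=50):
    """Yield one pose estimate per RGB-D frame.

    zero_shot_estimate(frame, region) -> (pose, bbox)  # slow, works on unseen objects
    detect(frame) -> bbox or None                      # fast detector trained online
    update_detector(frame, pseudo_box)                 # one training step on a pseudo-label
    """
    for t, frame in enumerate(frames):
        region = detect(frame) if t >= warmup else None

        if region is None:
            # Slow path: zero-shot pose estimation over the full image.
            pose, bbox = zero_shot_estimate(frame, region=None)
        else:
            # Fast path: the detector filters the input, so the slow
            # estimator only scores hypotheses inside the detected region.
            pose, bbox = zero_shot_estimate(frame, region=region)

        # Self-supervision: the estimated box becomes a pseudo-label for the
        # detector; no human annotation is used at any point.
        update_detector(frame, pseudo_box=bbox)

        yield pose
```

The key point this sketch tries to capture is that the expensive zero-shot estimator is amortized: it runs over the full image only until the detector becomes reliable, after which most frames take the fast, filtered path.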
Related papers
- Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification [0.0]
We present a novel method for self-supervised fine-tuning of pose estimation for bin-picking.
Our approach enables the robot to automatically obtain training data without manual labeling.
Our pipeline allows the system to fine-tune while the process is running, removing the need for a learning phase.
arXiv Detail & Related papers (2024-09-17T19:26:21Z)
- oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving [4.707950656037167]
Autonomous driving systems rely heavily on object detection to avoid collisions and drive safely.
Monocular 3D object detectors try to solve this problem by directly predicting 3D bounding boxes and object velocities given a camera image.
Recent research estimates time-to-contact in a per-pixel manner and suggests that it is a more effective measure than velocity and depth combined.
We propose per-object time-to-contact estimation by extending object detection models to additionally predict the time-to-contact attribute for each object.
arXiv Detail & Related papers (2024-05-13T12:34:18Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object-to-image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a point cloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- Label-Efficient Object Detection via Region Proposal Network Pre-Training [58.50615557874024]
We propose a simple pretext task that provides effective pre-training for the region proposal network (RPN).
In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- VideoPose: Estimating 6D object pose from videos [14.210010379733017]
We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos.
Our proposed network builds on a pre-trained 2D object detector and aggregates visual features through a recurrent neural network to make predictions at each frame.
Experimental evaluation on the YCB-Video dataset shows that our approach is on par with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-11-20T20:57:45Z)
- Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper.
Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances.
Our findings suggest that a considerable part of the computation of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
- ZePHyR: Zero-shot Pose Hypothesis Rating [36.52070583343388]
We introduce a novel method for zero-shot object pose estimation in clutter.
Our approach uses a hypothesis generation and scoring framework, with a focus on learning a scoring function that generalizes to objects not used for training.
We demonstrate how our system can be used by quickly scanning and building a model of a novel object, which can immediately be used by our method for pose estimation.
arXiv Detail & Related papers (2021-04-28T01:48:39Z)
- Fast Uncertainty Quantification for Deep Object Pose Estimation [91.09217713805337]
Deep learning-based object pose estimators are often unreliable and overconfident.
In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation.
arXiv Detail & Related papers (2020-11-16T06:51:55Z)
- A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained using only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks (a rough sketch of this pseudo-labeling idea appears after this list).
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
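Several entries above, and OSSID itself, rely on the same pseudo-labeling idea: a weaker or slower supervision signal is converted into labels that train a faster model. As a rough, hypothetical illustration in Python, the sketch below shows one way point-level annotations could be turned into box pseudo-labels for a round of self-training: predicted boxes are kept only if their centers fall near an annotated point. The matching rule and all names are assumptions for illustration, not the cited paper's actual procedure.

```python
# Hypothetical sketch: filtering detector predictions into box pseudo-labels
# using point-level annotations. The center-distance rule is an assumption
# made for illustration, not the cited paper's actual method.
from math import hypot

def pseudo_boxes_from_points(pred_boxes, points, max_center_dist=20.0):
    """Keep predicted boxes whose centers lie close to an annotated object point.

    pred_boxes: list of (x1, y1, x2, y2) boxes from the current detector.
    points: list of (px, py) human-annotated object points.
    Returns boxes usable as pseudo ground truth for the next training round.
    """
    kept = []
    for (x1, y1, x2, y2) in pred_boxes:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        if any(hypot(cx - px, cy - py) <= max_center_dist for (px, py) in points):
            kept.append((x1, y1, x2, y2))
    return kept

# Example: only the prediction near the annotated point survives.
print(pseudo_boxes_from_points(
    pred_boxes=[(10, 10, 50, 60), (200, 200, 260, 240)],
    points=[(30, 35)]))
# -> [(10, 10, 50, 60)]
```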