OSSID: Online Self-Supervised Instance Detection by (and for) Pose
Estimation
- URL: http://arxiv.org/abs/2201.07309v1
- Date: Tue, 18 Jan 2022 20:55:56 GMT
- Title: OSSID: Online Self-Supervised Instance Detection by (and for) Pose
Estimation
- Authors: Qiao Gu, Brian Okorn, David Held
- Abstract summary: Real-time object pose estimation is necessary for many robot manipulation algorithms.
We propose the OSSID framework, leveraging a slow zero-shot pose estimator to self-supervise the training of a fast detection algorithm.
We show that this self-supervised training exceeds the performance of existing zero-shot detection methods.
- Score: 17.78557307620686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time object pose estimation is necessary for many robot manipulation
algorithms. However, state-of-the-art methods for object pose estimation are
trained for a specific set of objects; these methods thus need to be retrained
to estimate the pose of each new object, often requiring tens of GPU-days of
training for optimal performance. In this paper, we propose the OSSID
framework, leveraging a slow zero-shot pose estimator to self-supervise the
training of a fast detection algorithm. This fast detector can then be used to
filter the input to the pose estimator, drastically improving its inference
speed. We show that this self-supervised training exceeds the performance of
existing zero-shot detection methods on two widely used object pose estimation
and detection datasets, without requiring any human annotations. Further, we
show that the resulting method for pose estimation has a significantly faster
inference speed, due to the ability to filter out large parts of the image.
Thus, our method for self-supervised online learning of a detector (trained
using pseudo-labels from a slow pose estimator) leads to accurate pose
estimation at real-time speeds, without requiring human annotations.
Supplementary materials and code can be found at
https://georgegu1997.github.io/OSSID/
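To make the pipeline described in the abstract concrete, here is a minimal, hypothetical sketch (in Python) of the control flow it outlines: a slow zero-shot pose estimator handles early frames, its estimates serve as pseudo-labels that train a fast detector online, and once the detector is reliable it filters the input so the pose estimator only has to score a small region. The function and argument names (`zero_shot_estimate`, `detect`, `update_detector`, `warmup`) are illustrative assumptions, not the authors' actual API.

```python
# Minimal sketch of an OSSID-style online self-supervision loop.
# All names here are illustrative assumptions, not the authors' code.

def ossid_style_loop(frames, zero_shot_estimate, detect, update_detector, warmup=50):
    """Yield one pose estimate per RGB-D frame.

    zero_shot_estimate(frame, region) -> (pose, bbox)  # slow, works on unseen objects
    detect(frame) -> bbox or None                      # fast detector trained online
    update_detector(frame, pseudo_box)                 # one training step on a pseudo-label
    """
    for t, frame in enumerate(frames):
        region = detect(frame) if t >= warmup else None

        if region is None:
            # Slow path: zero-shot pose estimation over the full image.
            pose, bbox = zero_shot_estimate(frame, region=None)
        else:
            # Fast path: the detector filters the input, so the slow
            # estimator only scores hypotheses inside the detected region.
            pose, bbox = zero_shot_estimate(frame, region=region)

        # Self-supervision: the estimated box becomes a pseudo-label for the
        # detector; no human annotation is used at any point.
        update_detector(frame, pseudo_box=bbox)

        yield pose
```

The key point this sketch tries to capture is that the expensive zero-shot estimator is amortized: it runs over the full image only until the detector becomes reliable, after which most frames take the fast, filtered path.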
Related papers
- Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification [0.0]
We present a novel method for self-supervised fine-tuning of pose estimation for bin-picking.
Our approach enables the robot to automatically obtain training data without manual labeling.
Our pipeline allows the system to fine-tune while the process is running, removing the need for a learning phase.
arXiv Detail & Related papers (2024-09-17T19:26:21Z)
- oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving [4.707950656037167]
Autonomous driving systems rely heavily on object detection to avoid collisions and drive safely.
Monocular 3D object detectors try to solve this problem by directly predicting 3D bounding boxes and object velocities given a camera image.
Recent research estimates time-to-contact in a per-pixel manner and suggests that it is a more effective measure than velocity and depth combined.
We propose per-object time-to-contact estimation by extending object detection models to additionally predict the time-to-contact attribute for each object.
arXiv Detail & Related papers (2024-05-13T12:34:18Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object-to-image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a point cloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- Label-Efficient Object Detection via Region Proposal Network Pre-Training [58.50615557874024]
We propose a simple pretext task that provides effective pre-training for the region proposal network (RPN).
In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- VideoPose: Estimating 6D object pose from videos [14.210010379733017]
We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos.
Our proposed network builds on a pre-trained 2D object detector and aggregates visual features through a recurrent neural network to make predictions at each frame.
Experimental evaluation on the YCB-Video dataset shows that our approach is on par with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-11-20T20:57:45Z)
- Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper.
Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances.
Our findings suggest that a considerable part of the computation of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
- ZePHyR: Zero-shot Pose Hypothesis Rating [36.52070583343388]
We introduce a novel method for zero-shot object pose estimation in clutter.
Our approach uses a hypothesis generation and scoring framework, with a focus on learning a scoring function that generalizes to objects not used for training.
We demonstrate how our system can be used by quickly scanning and building a model of a novel object, which can immediately be used by our method for pose estimation.
arXiv Detail & Related papers (2021-04-28T01:48:39Z)
- Fast Uncertainty Quantification for Deep Object Pose Estimation [91.09217713805337]
Deep learning-based object pose estimators are often unreliable and overconfident.
In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation.
arXiv Detail & Related papers (2020-11-16T06:51:55Z)
- A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained using only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks (a rough sketch of this pseudo-labeling idea appears after this list).
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
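Several entries above, and OSSID itself, rely on the same pseudo-labeling idea: a weaker or slower supervision signal is converted into labels that train a faster model. As a rough, hypothetical illustration in Python, the sketch below shows one way point-level annotations could be turned into box pseudo-labels for a round of self-training: predicted boxes are kept only if their centers fall near an annotated point. The matching rule and all names are assumptions for illustration, not the cited paper's actual procedure.

```python
# Hypothetical sketch: filtering detector predictions into box pseudo-labels
# using point-level annotations. The center-distance rule is an assumption
# made for illustration, not the cited paper's actual method.
from math import hypot

def pseudo_boxes_from_points(pred_boxes, points, max_center_dist=20.0):
    """Keep predicted boxes whose centers lie close to an annotated object point.

    pred_boxes: list of (x1, y1, x2, y2) boxes from the current detector.
    points: list of (px, py) human-annotated object points.
    Returns boxes usable as pseudo ground truth for the next training round.
    """
    kept = []
    for (x1, y1, x2, y2) in pred_boxes:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        if any(hypot(cx - px, cy - py) <= max_center_dist for (px, py) in points):
            kept.append((x1, y1, x2, y2))
    return kept

# Example: only the prediction near the annotated point survives.
print(pseudo_boxes_from_points(
    pred_boxes=[(10, 10, 50, 60), (200, 200, 260, 240)],
    points=[(30, 35)]))
# -> [(10, 10, 50, 60)]
```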