For A More Comprehensive Evaluation of 6DoF Object Pose Tracking
- URL: http://arxiv.org/abs/2309.07796v2
- Date: Fri, 15 Sep 2023 02:30:08 GMT
- Title: For A More Comprehensive Evaluation of 6DoF Object Pose Tracking
- Authors: Yang Li, Fan Zhong, Xin Wang, Shuangbing Song, Jiachen Li, Xueying Qin, and Changhe Tu
- Abstract summary: We contribute a unified benchmark to address the above problems.
For more accurate annotation of YCBV, we propose a multi-view multi-object global pose refinement method.
In experiments, we validate the precision and reliability of the proposed global pose refinement method with a realistic semi-synthesized dataset.
- Score: 22.696375341994035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous evaluations of 6DoF object pose tracking have shown obvious
limitations as the field has developed. In particular, the
evaluation protocols are not unified across methods, the widely used
YCBV dataset contains significant annotation errors, and the error metrics
may also be biased. As a result, it is hard to compare methods fairly, which has
become a major obstacle to developing new algorithms. In this paper we
contribute a unified benchmark to address the above problems. For more accurate
annotation of YCBV, we propose a multi-view multi-object global pose refinement
method, which can jointly refine the poses of all objects and view cameras,
resulting in sub-pixel sub-millimeter alignment errors. The limitations of
previous scoring methods and error metrics are analyzed, based on which we
introduce our improved evaluation methods. The unified benchmark takes both
YCBV and BCOT as base datasets, which are shown to be complementary in scene
categories. In experiments, we validate the precision and reliability of the
proposed global pose refinement method with a realistic semi-synthesized
dataset particularly for YCBV, and then present benchmark results that unify
learning and non-learning methods as well as RGB and RGBD methods, with some
findings not discovered in previous studies.
Related papers
- SEMPose: A Single End-to-end Network for Multi-object Pose Estimation [13.131534219937533]
SEMPose is an end-to-end multi-object pose estimation network.
It can perform inference at 32 FPS without requiring inputs other than the RGB image.
It can accurately estimate the poses of multiple objects in real time, with inference time unaffected by the number of target objects.
arXiv Detail & Related papers (2024-11-21T10:37:54Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
- RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z)
- CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation [67.12857074801731]
We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.
To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty.
We incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a feature ensemble.
arXiv Detail & Related papers (2022-11-24T03:27:00Z)
- On the Evaluation of RGB-D-based Categorical Pose and Shape Estimation [5.71097144710995]
In this work we take a critical look at this predominant evaluation protocol including metrics and datasets.
We propose a new set of metrics, contribute new annotations for the Redwood dataset and evaluate state-of-the-art methods in a fair comparison.
arXiv Detail & Related papers (2022-02-21T16:31:18Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In these experiments, we find that existing loss functions usually specialize in some metrics but yield inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- TISE: A Toolbox for Text-to-Image Synthesis Evaluation [9.092600296992925]
We conduct a study on state-of-the-art methods for single- and multi-object text-to-image synthesis.
We propose a common framework for evaluating these methods.
arXiv Detail & Related papers (2021-12-02T16:39:35Z)
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z)
- A Hybrid Approach for 6DoF Pose Estimation [4.200736775540874]
We propose a method for 6DoF pose estimation using a state-of-the-art deep learning based instance detector.
We additionally use an automatic method selection that chooses the instance detector and the training set as that with the highest performance on the validation set.
This hybrid approach leverages the best of learning and classic approaches, using CNNs to filter highly unstructured data and cut through the clutter, and a local geometric approach with proven convergence for robust pose estimation.
arXiv Detail & Related papers (2020-11-11T09:58:23Z)
- REDE: End-to-end Object 6D Pose Robust Estimation Using Differentiable Outliers Elimination [15.736699709454857]
We propose REDE, a novel end-to-end object pose estimator using RGB-D data.
We also propose a differentiable outliers elimination method that regresses the candidate result and the confidence simultaneously.
The experimental results on three benchmark datasets show that REDE slightly outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-24T06:45:39Z)