BOP Challenge 2020 on 6D Object Localization
- URL: http://arxiv.org/abs/2009.07378v2
- Date: Tue, 13 Oct 2020 12:09:44 GMT
- Title: BOP Challenge 2020 on 6D Object Localization
- Authors: Tomas Hodan, Martin Sundermeyer, Bertram Drost, Yann Labbe, Eric
Brachmann, Frank Michel, Carsten Rother, Jiri Matas
- Abstract summary: The BOP Challenge 2020 is the third in a series of public competitions organized with the goal to capture the status quo in the field of 6D object pose estimation from an RGB-D image.
The participants were provided 350K training images generated by BlenderProc4BOP, a new open-source and light-weight physically-based renderer (PBR) and procedural data generator.
The top-performing methods rely on RGB-D image channels, but strong results were achieved when only RGB channels were used at both training and test time.
- Score: 56.591561228575635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the evaluation methodology, datasets, and results of the
BOP Challenge 2020, the third in a series of public competitions organized with
the goal to capture the status quo in the field of 6D object pose estimation
from an RGB-D image. In 2020, to reduce the domain gap between synthetic
training and real test RGB images, the participants were provided 350K
photorealistic training images generated by BlenderProc4BOP, a new open-source
and light-weight physically-based renderer (PBR) and procedural data generator.
Methods based on deep neural networks have finally caught up with methods based
on point pair features, which were dominating previous editions of the
challenge. Although the top-performing methods rely on RGB-D image channels,
strong results were achieved when only RGB channels were used at both training
and test time - out of the 26 evaluated methods, the third method was trained
on RGB channels of PBR and real images, while the fifth on RGB channels of PBR
images only. Strong data augmentation was identified as a key component of the
top-performing CosyPose method, and the photorealism of PBR images was
shown to be effective despite the augmentation. The online evaluation system
stays open and is available on the project website: bop.felk.cvut.cz.
Related papers
- RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z)
- MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation [12.773040823634908]
We propose a self-supervised domain adaptation approach to acquire labeled 6D poses from real images.
We first pre-train the model with synthetic RGB images and then utilize real RGB(-D) images to fine-tune the pre-trained model.
We experimentally demonstrate that our method achieves comparable performance against its fully-supervised counterpart.
arXiv Detail & Related papers (2023-02-14T19:34:41Z)
- A Combined Approach Toward Consistent Reconstructions of Indoor Spaces Based on 6D RGB-D Odometry and KinectFusion [7.503338065129185]
We propose a 6D RGB-D odometry approach that finds the relative camera pose between consecutive RGB-D frames by keypoint extraction.
We feed the estimated pose to the highly accurate KinectFusion algorithm, which fine-tunes the frame-to-frame relative pose.
Our algorithm outputs a ready-to-use polygon mesh (highly suitable for creating 3D virtual worlds) without any postprocessing steps.
arXiv Detail & Related papers (2022-12-25T22:52:25Z)
- Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest.
We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny and the 6D pose estimation algorithm PVN3D.
The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
arXiv Detail & Related papers (2022-08-30T14:17:15Z)
- Towards Two-view 6D Object Pose Estimation: A Comparative Study on Fusion Strategy [16.65699606802237]
Current RGB-based 6D object pose estimation methods have achieved noticeable performance on datasets and in real-world applications.
This paper proposes a framework for 6D object pose estimation that learns implicit 3D information from 2 RGB images.
arXiv Detail & Related papers (2022-07-01T08:22:34Z)
- Blind Face Restoration: Benchmark Datasets and a Baseline Model [63.053331687284064]
Blind Face Restoration (BFR) aims to construct a high-quality (HQ) face image from its corresponding low-quality (LQ) input.
We first synthesize two blind face restoration benchmark datasets called EDFace-Celeb-1M (BFR128) and EDFace-Celeb-150K (BFR512).
State-of-the-art methods are benchmarked on them under five settings: blur, noise, low resolution, JPEG compression artifacts, and the combination of them (full degradation).
arXiv Detail & Related papers (2022-06-08T06:34:24Z)
- Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild [48.44194221801609]
We propose a new lightweight and end-to-end learning-based framework to tackle this challenge.
We progressively spread the differences between input RGB images and re-projected RGB images from recovered HS images via effective camera spectral response function estimation.
Our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings.
arXiv Detail & Related papers (2021-08-15T05:19:44Z)
- A New Mask R-CNN Based Method for Improved Landslide Detection [54.7905160534631]
This paper presents a novel method of landslide detection by exploiting the Mask R-CNN capability of identifying an object layout.
A data set of 160 elements is created containing landslide and non-landslide images.
The proposed algorithm can be potentially useful for land use planners and policy makers of hilly areas.
arXiv Detail & Related papers (2020-10-04T07:46:37Z)
- NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image [61.71186808848108]
This paper reviews the second challenge on spectral reconstruction from RGB images.
It recovers whole-scene hyperspectral (HS) information from a 3-channel RGB image.
A new, larger-than-ever, natural hyperspectral image data set is presented.
arXiv Detail & Related papers (2020-05-07T12:23:56Z)
- Two-Level Attention-based Fusion Learning for RGB-D Face Recognition [21.735238213921804]
A novel attention-aware method is proposed to fuse two image modalities, RGB and depth, for enhanced RGB-D facial recognition.
The proposed method first extracts features from both modalities using a convolutional feature extractor.
These features are then fused using a two-layer attention mechanism.
arXiv Detail & Related papers (2020-02-29T03:18:52Z)