MV6D: Multi-View 6D Pose Estimation on RGB-D Frames Using a Deep
Point-wise Voting Network
- URL: http://arxiv.org/abs/2208.01172v1
- Date: Mon, 1 Aug 2022 23:34:43 GMT
- Title: MV6D: Multi-View 6D Pose Estimation on RGB-D Frames Using a Deep
Point-wise Voting Network
- Authors: Fabian Duffhauss, Tobias Demmler, Gerhard Neumann
- Abstract summary: We present a novel multi-view 6D pose estimation method called MV6D.
We base our approach on the PVN3D network that uses a single RGB-D image to predict keypoints of the target objects.
In contrast to current multi-view pose detection networks such as CosyPose, our MV6D can learn the fusion of multiple perspectives in an end-to-end manner.
- Score: 14.754297065772676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating 6D poses of objects is an essential computer vision task. However,
most conventional approaches rely on camera data from a single perspective and
therefore suffer from occlusions. We overcome this issue with our novel
multi-view 6D pose estimation method called MV6D which accurately predicts the
6D poses of all objects in a cluttered scene based on RGB-D images from
multiple perspectives. We base our approach on the PVN3D network that uses a
single RGB-D image to predict keypoints of the target objects. We extend this
approach by using a combined point cloud from multiple views and fusing the
images from each view with a DenseFusion layer. In contrast to current
multi-view pose detection networks such as CosyPose, our MV6D can learn the
fusion of multiple perspectives in an end-to-end manner and does not require
multiple prediction stages or subsequent fine-tuning of the prediction.
Furthermore, we present three novel photorealistic datasets of cluttered scenes
with heavy occlusions. All of them contain RGB-D images from multiple
perspectives and the ground truth for instance semantic segmentation and 6D
pose estimation. MV6D significantly outperforms the state of the art in
multi-view 6D pose estimation even in cases where the camera poses are known
inaccurately. Furthermore, we show that our approach is robust towards dynamic
camera setups and that its accuracy increases incrementally with an increasing
number of perspectives.
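The abstract mentions two geometric steps that a short example can make concrete: building a combined point cloud from multiple views, and turning predicted keypoints into a 6D pose. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes each view comes with a known 4x4 camera-to-world matrix, and it uses a standard least-squares (Kabsch) fit to recover rotation and translation from keypoint correspondences. The function names `fuse_point_clouds` and `fit_rigid_transform` are hypothetical, and the learned parts of the network (feature fusion and keypoint voting) are not shown.

```python
import numpy as np


def fuse_point_clouds(clouds, cam_to_world):
    """Transform per-view point clouds into one shared frame and stack them.

    clouds: list of (N_i, 3) arrays, points expressed in each camera's frame.
    cam_to_world: list of (4, 4) homogeneous camera-to-world transforms
                  (assumed known, e.g. from calibration).
    """
    fused = []
    for pts, T in zip(clouds, cam_to_world):
        R, t = T[:3, :3], T[:3, 3]
        # Row-vector form of p_world = R @ p_cam + t
        fused.append(pts @ R.T + t)
    return np.concatenate(fused, axis=0)


def fit_rigid_transform(model_kps, scene_kps):
    """Least-squares R, t such that scene_kps ~= model_kps @ R.T + t (Kabsch).

    model_kps: (K, 3) keypoints in the object frame.
    scene_kps: (K, 3) corresponding keypoints in the scene frame.
    """
    mu_m = model_kps.mean(axis=0)
    mu_s = scene_kps.mean(axis=0)
    # Cross-covariance of the centered correspondences
    H = (model_kps - mu_m).T @ (scene_kps - mu_s)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_s - R @ mu_m
    return R, t
```

In this sketch, the fused cloud would be the network input and `scene_kps` would come from the network's keypoint predictions. Both are replaced by plain arrays here so the geometry can be checked in isolation.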
Related papers
- SyMFM6D: Symmetry-aware Multi-directional Fusion for Multi-View 6D
Object Pose Estimation [16.460390441848464]
We present a novel symmetry-aware multi-view 6D pose estimator called SyMFM6D.
Our approach efficiently fuses the RGB-D frames from multiple perspectives in a deep multi-directional fusion network.
We show that our approach is robust towards inaccurate camera calibration and dynamic camera setups.
arXiv Detail & Related papers (2023-07-01T11:28:53Z)
- Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot,
Generalizable Approach using RGB Images [60.0898989456276]
We present a new framework named Cas6D for few-shot 6DoF pose estimation that is generalizable and uses only RGB images.
To address the false positives of target object detection in the extreme few-shot setting, our framework utilizes a self-supervised pre-trained ViT to learn robust feature representations.
Experimental results on the LINEMOD and GenMOP datasets demonstrate that Cas6D outperforms state-of-the-art methods by 9.2% and 3.8% accuracy (Proj-5) under the 32-shot setting.
arXiv Detail & Related papers (2023-06-13T07:45:42Z)
- RelPose++: Recovering 6D Poses from Sparse-view Observations [66.6922660401558]
We address the task of estimating 6D camera poses from sparse-view image sets (2-8 images).
We build on the recent RelPose framework which learns a network that infers distributions over relative rotations over image pairs.
Our final system results in large improvements in 6D pose prediction over prior art on both seen and unseen object categories.
arXiv Detail & Related papers (2023-05-08T17:59:58Z)
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation [73.40404343241782]
We propose a weakly supervised 6D object pose estimation approach based on 2D keypoint detection.
Our approach achieves comparable performance with state-of-the-art fully supervised approaches.
arXiv Detail & Related papers (2022-03-07T16:23:47Z)
- CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and
Categorical 6D Pose and Size Estimation [19.284468553414918]
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Existing approaches mainly follow a complex multi-stage pipeline which first localizes and detects each object instance in the image and then regresses to either their 3D meshes or 6D poses.
We present a simple one-stage approach to predict both the 3D shape and estimate the 6D pose and size jointly in a bounding-box free manner.
arXiv Detail & Related papers (2022-03-03T18:59:04Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- CosyPose: Consistent multi-view multi-object 6D pose estimation [48.097599674329004]
First, we present a single-view single-object 6D pose estimation method, which we use to generate 6D object pose hypotheses.
Second, we develop a robust method for matching individual 6D object pose hypotheses across different input images.
Third, we develop a method for global scene refinement given multiple object hypotheses and their correspondences across views.
arXiv Detail & Related papers (2020-08-19T14:11:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.