Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted
Robot Manipulation
- URL: http://arxiv.org/abs/2211.00960v1
- Date: Wed, 2 Nov 2022 08:57:20 GMT
- Title: Ambiguity-Aware Multi-Object Pose Optimization for Visually-Assisted
Robot Manipulation
- Authors: Myung-Hwan Jeon, Jeongyun Kim, Jee-Hwan Ryu, and Ayoung Kim
- Abstract summary: We present an ambiguity-aware 6D object pose estimation network, PrimA6D++, as a generic uncertainty prediction method.
The proposed method shows a significant performance improvement on the T-LESS and YCB-Video datasets.
We further demonstrate real-time scene recognition capability for visually-assisted robot manipulation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose estimation aims to infer the relative pose between the object
and the camera using a single image or multiple images. Most works have focused
on predicting the object pose without associated uncertainty under occlusion
and structural ambiguity (symmetry). However, these works demand prior
information about shape attributes, a condition that is rarely satisfied in
practice; even asymmetric objects may appear symmetric under viewpoint changes.
In addition, acquiring and fusing diverse sensor data is challenging when
extending these methods to robotics applications. Tackling these limitations, we present
an ambiguity-aware 6D object pose estimation network, PrimA6D++, as a generic
uncertainty prediction method. The major challenges in pose estimation, such as
occlusion and symmetry, can be handled in a generic manner based on the
measured ambiguity of the prediction. Specifically, we devise a network to
reconstruct the three rotation axis primitive images of a target object and
predict the underlying uncertainty along each primitive axis. Leveraging the
estimated uncertainty, we then optimize multi-object poses using visual
measurements and camera poses by treating it as an object SLAM problem. The
proposed method shows a significant performance improvement on the T-LESS and
YCB-Video datasets. We further demonstrate real-time scene recognition
capability for visually-assisted robot manipulation. Our code and supplementary
materials are available at https://github.com/rpmsnu/PrimA6D.
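As a rough illustration of the uncertainty-driven optimization described in the
abstract, the Python sketch below refines world-frame object poses against
per-frame detections, whitening each rotation residual by per-axis standard
deviations of the kind PrimA6D++ predicts. This is a minimal sketch under
assumed conventions (world-from-camera camera poses, rotation-vector
parameterization), not the authors' implementation; names such as
refine_object_poses and sigma_axes are hypothetical.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation as R

def residuals(x, detections, cam_poses, sigma_axes):
    # x packs one 6-vector (rotation vector, translation) per object.
    res = []
    for obj_id, frame_id, R_meas, t_meas in detections:
        rvec = x[6 * obj_id : 6 * obj_id + 3]
        t_obj = x[6 * obj_id + 3 : 6 * obj_id + 6]
        R_obj = R.from_rotvec(rvec).as_matrix()
        R_wc, t_wc = cam_poses[frame_id]  # known world-from-camera pose
        # Predicted object pose expressed in this camera's frame.
        R_pred = R_wc.T @ R_obj
        t_pred = R_wc.T @ (t_obj - t_wc)
        # Whiten the rotation residual by the predicted per-axis std devs:
        # an axis flagged as ambiguous (large sigma) contributes little.
        e_rot = R.from_matrix(R_meas.T @ R_pred).as_rotvec()
        res.extend(e_rot / sigma_axes[(obj_id, frame_id)])
        res.extend((t_pred - t_meas) / 0.01)  # assumed 1 cm translation std
    return np.asarray(res)

def refine_object_poses(x0, detections, cam_poses, sigma_axes):
    # Joint least-squares refinement of all object poses at once.
    sol = least_squares(residuals, x0, args=(detections, cam_poses, sigma_axes))
    return sol.x

A full object-SLAM formulation would also optimize the camera poses jointly;
this sketch keeps them fixed for brevity.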
Related papers
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Extreme Two-View Geometry From Object Poses with Diffusion Models
We harness the power of object priors to accurately determine two-view geometry in the face of extreme viewpoint changes.
In experiments, our method has demonstrated extraordinary robustness and resilience to large viewpoint changes.
arXiv Detail & Related papers (2024-02-05T08:18:47Z)
- SyMFM6D: Symmetry-aware Multi-directional Fusion for Multi-View 6D Object Pose Estimation
We present a novel symmetry-aware multi-view 6D pose estimator called SyMFM6D.
Our approach efficiently fuses the RGB-D frames from multiple perspectives in a deep multi-directional fusion network.
We show that our approach is robust to inaccurate camera calibration and dynamic camera setups.
arXiv Detail & Related papers (2023-07-01T11:28:53Z)
- Context-aware 6D Pose Estimation of Known Objects using RGB-D data
6D object pose estimation has long been a research topic in computer vision and robotics.
We present an architecture that, unlike prior work, is context-aware.
Our experiments show an accuracy improvement of about 3.2% on the LineMOD dataset.
arXiv Detail & Related papers (2022-12-11T18:01:01Z)
- Monocular 3D Object Detection with Depth from Motion
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- VIPose: Real-time Visual-Inertial 6D Object Pose Tracking
We introduce a novel Deep Neural Network (DNN) called VIPose to address the object pose tracking problem in real-time.
The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose.
The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
arXiv Detail & Related papers (2021-07-27T06:10:23Z)
- Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation
Deep Bingham Networks (DBN) can handle pose-related uncertainties and ambiguities arising in almost all real-life applications concerning 3D data.
DBN extends state-of-the-art direct pose regression networks with a multi-hypothesis prediction head that can yield different distribution modes (see the sketch after this list).
We propose new training strategies to avoid mode or posterior collapse and to improve numerical stability.
arXiv Detail & Related papers (2020-12-20T19:20:26Z)
- Single View Metrology in the Wild
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z)
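To make the multi-hypothesis idea from the Deep Bingham Networks entry above
concrete, the following hypothetical PyTorch sketch maps a feature vector to K
unit quaternions plus per-mode weights; layer sizes and names are illustrative
assumptions, not the paper's code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHypothesisRotationHead(nn.Module):
    # Predicts K candidate rotations (unit quaternions) and a weight per mode.
    def __init__(self, feat_dim=256, num_modes=4):
        super().__init__()
        self.quat = nn.Linear(feat_dim, 4 * num_modes)
        self.weight = nn.Linear(feat_dim, num_modes)
        self.num_modes = num_modes

    def forward(self, feat):
        q = self.quat(feat).view(-1, self.num_modes, 4)
        q = F.normalize(q, dim=-1)             # project onto unit quaternions
        w = self.weight(feat).softmax(dim=-1)  # distribution over modes
        return q, w

head = MultiHypothesisRotationHead()
q, w = head(torch.randn(8, 256))  # q: (8, 4, 4) quaternions, w: (8, 4) weights

Trained with a plain regression loss, such a head tends to collapse all modes
onto one answer, which is why the paper proposes dedicated training strategies
against mode and posterior collapse.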