CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds
- URL: http://arxiv.org/abs/2104.03437v1
- Date: Thu, 8 Apr 2021 00:14:58 GMT
- Title: CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds
- Authors: Yijia Weng, He Wang, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan,
Baoquan Chen, Hao Su, Leonidas J. Guibas
- Abstract summary: We propose a unified framework that handles 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects. Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose (SAPIEN, BMVC) benchmarks, running at ~12 FPS, the fastest among compared methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we tackle the problem of category-level online pose tracking of
objects from point cloud sequences. For the first time, we propose a unified
framework that can handle 9DoF pose tracking for novel rigid object instances
as well as per-part pose tracking for articulated objects from known
categories. Here the 9DoF pose, comprising 6D pose and 3D size, is equivalent
to a 3D amodal bounding box representation with free 6D pose. Given the depth
point cloud at the current frame and the estimated pose from the last frame,
our novel end-to-end pipeline learns to accurately update the pose. Our
pipeline is composed of three modules: 1) a pose canonicalization module that
normalizes the pose of the input depth point cloud; 2) RotationNet, a module
that directly regresses small interframe delta rotations; and 3) CoordinateNet,
a module that predicts the normalized coordinates and segmentation, enabling
analytical computation of the 3D size and translation. Leveraging the small
pose regime in the pose-canonicalized point clouds, our method integrates the
best of both worlds by combining dense coordinate prediction and direct
rotation regression, thus yielding an end-to-end differentiable pipeline
optimized for 9DoF pose accuracy (without using non-differentiable RANSAC). Our
extensive experiments demonstrate that our method achieves new state-of-the-art
performance on category-level rigid object pose (NOCS-REAL275) and articulated
object pose (SAPIEN, BMVC) benchmarks, running at ~12 FPS, the fastest among
compared methods.
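The per-frame update described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `rotation_net` and `coordinate_net` are hypothetical stand-ins for the paper's RotationNet and CoordinateNet, and the analytic size/translation recovery is shown as a simple least-squares fit of scale and translation given the updated rotation.

```python
# Hypothetical sketch of CAPTRA's per-frame 9DoF update (names are
# illustrative, not the authors' API). Given the previous-frame pose
# estimate (R, t, s), the current depth point cloud is canonicalized,
# a small inter-frame delta rotation is regressed, and size/translation
# are recovered analytically from predicted normalized coordinates.
import numpy as np

def canonicalize(points, R, t, s):
    """Map world-frame points (rows) into the normalized object frame
    of the previous pose: q = R^T (x - t) / s."""
    return (points - t) @ R / s

def update_pose(points, R_prev, t_prev, s_prev, rotation_net, coordinate_net):
    pc = canonicalize(points, R_prev, t_prev, s_prev)
    # RotationNet stand-in: regress a small delta rotation in the
    # canonicalized frame, then compose it with the previous rotation.
    dR = rotation_net(pc)
    R_new = R_prev @ dR
    # CoordinateNet stand-in: predict normalized (NOCS-style)
    # coordinates per point; segmentation is omitted for brevity.
    nocs = coordinate_net(pc)
    # Analytic fit: points ~= s * nocs @ R_new.T + t, solved in closed
    # form for scale s (1-D least squares) and translation t (means).
    rotated = nocs @ R_new.T
    pts_c = points - points.mean(axis=0)
    rot_c = rotated - rotated.mean(axis=0)
    s_new = (pts_c * rot_c).sum() / (rot_c ** 2).sum()
    t_new = points.mean(axis=0) - s_new * rotated.mean(axis=0)
    return R_new, t_new, s_new
```

Because every step is either a network forward pass or a closed-form computation, the whole update is differentiable end to end, which is the property the paper highlights in contrast to RANSAC-based pipelines.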
Related papers
- Shape-Constraint Recurrent Flow for 6D Object Pose Estimation [15.238626453460666]
We propose a shape-constraint recurrent matching framework for 6D object pose estimation.
We first compute a pose-induced flow based on the displacement of 2D reprojection between the initial pose and the currently estimated pose.
We then use this pose-induced flow to construct the correlation map for the following matching iterations.
arXiv Detail & Related papers (2023-06-23T02:36:34Z) - RelPose++: Recovering 6D Poses from Sparse-view Observations [66.6922660401558]
We address the task of estimating 6D camera poses from sparse-view image sets (2-8 images).
We build on the recent RelPose framework which learns a network that infers distributions over relative rotations over image pairs.
Our final system results in large improvements in 6D pose prediction over prior art on both seen and unseen object categories.
arXiv Detail & Related papers (2023-05-08T17:59:58Z) - Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z) - GPV-Pose: Category-level Object Pose Estimation via Geometry-guided
Point-wise Voting [103.74918834553249]
GPV-Pose is a novel framework for robust category-level pose estimation.
It harnesses geometric insights to enhance the learning of category-level pose-sensitive features.
It produces superior results to state-of-the-art competitors on common public benchmarks.
arXiv Detail & Related papers (2022-03-15T13:58:50Z) - ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes [55.689763519293464]
ConDor is a self-supervised method that learns to canonicalize the 3D orientation and position for full and partial 3D point clouds.
During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose.
arXiv Detail & Related papers (2022-01-19T18:57:21Z) - Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object
Pose Estimation [30.04752448942084]
Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.
We propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.
arXiv Detail & Related papers (2021-10-30T06:46:44Z) - FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose
Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z) - DualPoseNet: Category-level 6D Object Pose and Size Estimation using
Dual Pose Network with Refined Learning of Pose Consistency [30.214100288708163]
Category-level 6D object pose and size estimation is to predict 9 degrees-of-freedom (9DoF) pose configurations of rotation, translation, and size for object instances.
We propose a new method of Dual Pose Network with refined learning of pose consistency for this task, shortened as DualPoseNet.
arXiv Detail & Related papers (2021-03-11T08:33:47Z) - L6DNet: Light 6 DoF Network for Robust and Precise Object Pose
Estimation with Small Datasets [0.0]
We propose a novel approach to perform 6 DoF object pose estimation from a single RGB-D image.
We adopt a hybrid pipeline in two stages: data-driven and geometric.
Our approach is more robust and accurate than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-03T17:41:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.