Optimal Pose and Shape Estimation for Category-level 3D Object
Perception
- URL: http://arxiv.org/abs/2104.08383v4
- Date: Sun, 17 Sep 2023 02:31:15 GMT
- Authors: Jingnan Shi, Heng Yang, Luca Carlone
- Abstract summary: We consider a category-level perception problem, where one is given 3D sensor data picturing an object of a given category.
We provide the first certifiably optimal solver for pose and shape estimation.
We also develop the first graph-theoretic formulation to prune outliers in category-level perception.
- Score: 24.232254155643574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a category-level perception problem, where one is given 3D sensor
data picturing an object of a given category (e.g. a car), and has to
reconstruct the pose and shape of the object despite intra-class variability
(i.e. different car models have different shapes). We consider an active shape
model, where -- for an object category -- we are given a library of potential
CAD models describing objects in that category, and we adopt a standard
formulation where pose and shape estimation are formulated as a non-convex
optimization. Our first contribution is to provide the first certifiably
optimal solver for pose and shape estimation. In particular, we show that
rotation estimation can be decoupled from the estimation of the object
translation and shape, and we demonstrate that (i) the optimal object rotation
can be computed via a tight (small-size) semidefinite relaxation, and (ii) the
translation and shape parameters can be computed in closed-form given the
rotation. Our second contribution is to add an outlier rejection layer to our
solver, hence making it robust to a large number of misdetections. Towards this
goal, we wrap our optimal solver in a robust estimation scheme based on
graduated non-convexity. To further enhance robustness to outliers, we also
develop the first graph-theoretic formulation to prune outliers in
category-level perception, which removes outliers via convex hull and maximum
clique computations; the resulting approach is robust to 70%-90% outliers. Our
third contribution is an extensive experimental evaluation. Besides providing
an ablation study on a simulated dataset and on the PASCAL3D+ dataset, we
combine our solver with a deep-learned keypoint detector, and show that the
resulting approach improves over the state of the art in vehicle pose
estimation in the ApolloScape datasets.
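The decoupling described in the abstract can be illustrated with a minimal sketch. It assumes the active shape model y_i = R (sum_k c_k b_k^i) + t, where b_k^i is keypoint i of CAD model k, and it assumes the rotation R is already known. Under those assumptions the residual is linear in the shape coefficients c and the translation t, so both follow from one linear least-squares solve. The function name and array layout are hypothetical, and the sketch omits any constraints or regularization on c as well as outlier handling, so it is not the authors' implementation.

```python
import numpy as np

def shape_and_translation_given_rotation(R, keypoints, library):
    """Least-squares shape coefficients and translation for a known rotation.

    R:         (3, 3) rotation matrix (assumed known here).
    keypoints: (N, 3) measured 3D keypoints y_i.
    library:   (K, N, 3) keypoints b_k^i of the K CAD models.
    Returns (c, t) with c of shape (K,) and t of shape (3,).
    """
    K, N, _ = library.shape
    # Rotate measurements into the model frame:
    #   R^T y_i = sum_k c_k b_k^i + R^T t
    y_model = keypoints @ R  # row i equals (R^T y_i)^T
    # Per-keypoint block [b_1^i ... b_K^i | I_3], stacked over all keypoints.
    B = library.transpose(1, 2, 0)                # (N, 3, K)
    eye = np.broadcast_to(np.eye(3), (N, 3, 3))   # (N, 3, 3)
    A = np.concatenate([B, eye], axis=2).reshape(3 * N, K + 3)
    x, *_ = np.linalg.lstsq(A, y_model.reshape(3 * N), rcond=None)
    c, t_model = x[:K], x[K:]
    return c, R @ t_model  # map the translation back to the sensor frame

# Synthetic, noise-free usage example (all values are made up).
rng = np.random.default_rng(0)
library = rng.normal(size=(3, 8, 3))              # K=3 models, N=8 keypoints
c_true = np.array([0.2, 0.5, 0.3])
t_true = np.array([1.0, -2.0, 0.5])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
shape = np.tensordot(c_true, library, axes=1)     # (N, 3) blended shape
keypoints = shape @ R.T + t_true
c_est, t_est = shape_and_translation_given_rotation(R, keypoints, library)
```

The solve is unique only when the stacked matrix has full column rank, which requires at least 3N >= K + 3 and a sufficiently diverse library.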
Related papers
- PMPNet: Pixel Movement Prediction Network for Monocular Depth Estimation in Dynamic Scenes [7.736445799116692]
We propose a novel method for monocular depth estimation in dynamic scenes.
We first explore the arbitrariness of object's movement trajectory in dynamic scenes theoretically.
To overcome the depth inconsistency problem around the edges, we propose a deformable support window module.
arXiv Detail & Related papers (2024-11-04T03:42:29Z)
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors [15.34487368683311]
We propose a framework that can reconstruct high-quality object-level maps for unknown objects.
Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses for detected objects.
We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions.
arXiv Detail & Related papers (2023-09-17T00:48:19Z)
- Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image.
To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space.
We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
- RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation [103.74918834553247]
Category-level object pose estimation aims to predict the 6D pose as well as the 3D metric size of arbitrary objects from a known set of categories.
Recent methods harness shape prior adaptation to map the observed point cloud into the canonical space and apply Umeyama algorithm to recover the pose and size.
We propose a novel geometry-guided Residual Object Bounding Box Projection network RBP-Pose that jointly predicts object pose and residual vectors.
arXiv Detail & Related papers (2022-07-30T14:45:20Z)
- Optimal and Robust Category-level Perception: Object Pose and Shape Estimation from 2D and 3D Semantic Keypoints [24.232254155643574]
We consider a problem where one is given 2D or 3D sensor data picturing an object of a given category (e.g., a car) and has to reconstruct the 3D pose and shape of the object.
Our first contribution is to develop PACE3D* and PACE2D*, the first certifiably optimal solvers for pose and shape estimation.
Our second contribution is to develop robust versions of both solvers, named PACE3D# and PACE2D#.
arXiv Detail & Related papers (2022-06-24T21:58:00Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, and outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
But, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)
- CAE-LO: LiDAR Odometry Leveraging Fully Unsupervised Convolutional Auto-Encoder for Interest Point Detection and Feature Description [10.73965992177754]
We propose a fully unsupervised Convolutional Auto-Encoder based LiDAR Odometry (CAE-LO) that detects interest points from spherical ring data using a 2D CAE and extracts features from a multi-resolution voxel model using a 3D CAE.
We make several key contributions: 1) experiments on the KITTI dataset show that our interest points capture more local details, improving the matching success rate in unstructured scenarios, and that our features outperform the state of the art by more than 50% in matching inlier ratio.
arXiv Detail & Related papers (2020-01-06T01:26:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.