Related papers: Optimal Pose and Shape Estimation for Category-level 3D Object Perception

Optimal Pose and Shape Estimation for Category-level 3D Object Perception

URL: http://arxiv.org/abs/2104.08383v4
Date: Sun, 17 Sep 2023 02:31:15 GMT
Title: Optimal Pose and Shape Estimation for Category-level 3D Object Perception
Authors: Jingnan Shi, Heng Yang, Luca Carlone
Abstract summary: category-level perception problem, where one is given 3D sensor data picturing an object of a given category. We provide the first certifiably optimal CAD solver for pose and shape estimation. We also develop the first graph-theoretic formulation to prune outliers in category-level perception.
Score: 24.232254155643574
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a category-level perception problem, where one is given 3D sensor data picturing an object of a given category (e.g. a car), and has to reconstruct the pose and shape of the object despite intra-class variability (i.e. different car models have different shapes). We consider an active shape model, where -- for an object category -- we are given a library of potential CAD models describing objects in that category, and we adopt a standard formulation where pose and shape estimation are formulated as a non-convex optimization. Our first contribution is to provide the first certifiably optimal solver for pose and shape estimation. In particular, we show that rotation estimation can be decoupled from the estimation of the object translation and shape, and we demonstrate that (i) the optimal object rotation can be computed via a tight (small-size) semidefinite relaxation, and (ii) the translation and shape parameters can be computed in closed-form given the rotation. Our second contribution is to add an outlier rejection layer to our solver, hence making it robust to a large number of misdetections. Towards this goal, we wrap our optimal solver in a robust estimation scheme based on graduated non-convexity. To further enhance robustness to outliers, we also develop the first graph-theoretic formulation to prune outliers in category-level perception, which removes outliers via convex hull and maximum clique computations; the resulting approach is robust to 70%-90% outliers. Our third contribution is an extensive experimental evaluation. Besides providing an ablation study on a simulated dataset and on the PASCAL3D+ dataset, we combine our solver with a deep-learned keypoint detector, and show that the resulting approach improves over the state of the art in vehicle pose estimation in the ApolloScape datasets.

Related papers

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation [21.51021467386653]
We consider the problem of estimating object pose and shape from an RGB-D image. We introduce CRISP, a category-agnostic object pose and shape estimation pipeline. We also propose an optimization-based pose and shape corrector that can correct estimation errors caused by a domain gap.
arXiv Detail & Related papers (2024-12-02T02:26:21Z)
PMPNet: Pixel Movement Prediction Network for Monocular Depth Estimation in Dynamic Scenes [7.736445799116692]
We propose a novel method for monocular depth estimation in dynamic scenes. We first explore the arbitrariness of object's movement trajectory in dynamic scenes theoretically. To overcome the depth inconsistency problem around the edges, we propose a deformable support window module.
arXiv Detail & Related papers (2024-11-04T03:42:29Z)
DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses. We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass. Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors [15.34487368683311]
We propose a framework that can reconstruct high-quality object-level maps for unknown objects. Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses for detected objects. We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions.
arXiv Detail & Related papers (2023-09-17T00:48:19Z)
Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image. To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space. We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation [103.74918834553247]
Category-level object pose estimation aims to predict the 6D pose as well as the 3D metric size of arbitrary objects from a known set of categories. Recent methods harness shape prior adaptation to map the observed point cloud into the canonical space and apply Umeyama algorithm to recover the pose and size. We propose a novel geometry-guided Residual Object Bounding Box Projection network RBP-Pose that jointly predicts object pose and residual vectors.
arXiv Detail & Related papers (2022-07-30T14:45:20Z)
Optimal and Robust Category-level Perception: Object Pose and Shape Estimation from 2D and 3D Semantic Keypoints [24.232254155643574]
We consider a problem where one is given 2D or 3D sensor data picturing an object of a given category (e.g., a car) and has to reconstruct the 3D pose and shape of the object. Our first contribution is to develop PACE3D* and PACE2D*, the first certifiably optimal solvers for pose and shape estimation. Our second contribution is to developrobust versions of both solvers, named PACE3D# and PACE2D#.
arXiv Detail & Related papers (2022-06-24T21:58:00Z)
Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths. The rendering module takes as input the RGB image and its corresponding sparse depth image, outputs a variety of photo-realistic synthetic images. Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics. Recent neural implicit modeling methods show promising results on synthetic or dense datasets. But, they perform poorly on real-world data that is sparse and noisy. This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image. We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)
CAE-LO: LiDAR Odometry Leveraging Fully Unsupervised Convolutional Auto-Encoder for Interest Point Detection and Feature Description [10.73965992177754]
We propose a fully unsupervised Conal Auto-Encoder based LiDAR Odometry (CAE-LO) that detects interest points from spherical ring data using 2D CAE and extracts features from multi-resolution voxel model using 3D CAE. We make several key contributions: 1) experiments based on KITTI dataset show that our interest points can capture more local details to improve the matching success rate on unstructured scenarios and our features outperform state-of-the-art by more than 50% in matching inlier ratio.
arXiv Detail & Related papers (2020-01-06T01:26:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.