Category-Level Object Shape and Pose Estimation in Less Than a Millisecond
- URL: http://arxiv.org/abs/2509.18979v1
- Date: Tue, 23 Sep 2025 13:29:32 GMT
- Title: Category-Level Object Shape and Pose Estimation in Less Than a Millisecond
- Authors: Lorenzo Shaikewitz, Tim Nguyen, Luca Carlone
- Abstract summary: We present a fast local solver for shape and pose estimation. We use a learned front-end to detect sparse, category-level semantic keypoints on the target object. One iteration of our solver runs in about 100 microseconds, enabling fast outlier rejection.
- Score: 13.78778327399253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object shape and pose estimation is a foundational robotics problem, supporting tasks from manipulation to scene understanding and navigation. We present a fast local solver for shape and pose estimation which requires only category-level object priors and admits an efficient certificate of global optimality. Given an RGB-D image of an object, we use a learned front-end to detect sparse, category-level semantic keypoints on the target object. We represent the target object's unknown shape using a linear active shape model and pose a maximum a posteriori optimization problem to solve for position, orientation, and shape simultaneously. Expressed in unit quaternions, this problem admits first-order optimality conditions in the form of an eigenvalue problem with eigenvector nonlinearities. Our primary contribution is to solve this problem efficiently with self-consistent field iteration, which only requires computing a 4-by-4 matrix and finding its minimum eigenvalue-vector pair at each iterate. Solving a linear system for the corresponding Lagrange multipliers gives a simple global optimality certificate. One iteration of our solver runs in about 100 microseconds, enabling fast outlier rejection. We test our method on synthetic data and a variety of real-world settings, including two public datasets and a drone tracking scenario. Code is released at https://github.com/MIT-SPARK/Fast-ShapeAndPose.
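The abstract describes self-consistent field (SCF) iteration: at each iterate, build a 4-by-4 matrix from the current unit quaternion and take its minimum eigenvalue-eigenvector pair as the next iterate. The following is a minimal sketch of that generic loop, not the authors' solver. The function names (`scf_min_eig`, `toy_matrix`) and the matrices `A` and `B` are hypothetical stand-ins chosen only so the example runs; the paper's actual matrix is built from keypoint measurements and the active shape model, which are not reproduced here. SCF is also not guaranteed to converge for arbitrary problems; the toy coupling below is deliberately weak.

```python
import numpy as np


def scf_min_eig(build_matrix, q0, tol=1e-10, max_iter=100):
    """Generic SCF loop for H(q) q = lambda q with ||q|| = 1.

    build_matrix: callable mapping a unit 4-vector q to a symmetric 4x4 matrix.
    Returns (q, lam, residual, iterations).
    """
    q = q0 / np.linalg.norm(q0)
    for it in range(1, max_iter + 1):
        H = build_matrix(q)
        # eigh returns eigenvalues in ascending order, so column 0 is the
        # eigenvector of the minimum eigenvalue.
        _, eigvecs = np.linalg.eigh(H)
        q_new = eigvecs[:, 0]
        # q and -q are equivalent; keep the sign consistent across iterates.
        if q_new @ q < 0:
            q_new = -q_new
        step = np.linalg.norm(q_new - q)
        q = q_new
        if step < tol:
            break
    H = build_matrix(q)
    lam = q @ H @ q
    residual = np.linalg.norm(H @ q - lam * q)
    return q, lam, residual, it


# Hypothetical toy problem (an assumption, not taken from the paper):
# H(q) = A + (q^T B q) B, with a weak nonlinear coupling through B.
A = np.array([
    [1.0, 0.1, 0.0, 0.2],
    [0.1, 2.0, 0.3, 0.0],
    [0.0, 0.3, 3.0, 0.1],
    [0.2, 0.0, 0.1, 4.0],
])
B = 0.1 * np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 1.0, 0.0],
    [0.0, 1.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])


def toy_matrix(q):
    return A + (q @ B @ q) * B


# Start from the identity quaternion (1, 0, 0, 0).
q, lam, residual, iters = scf_min_eig(toy_matrix, np.array([1.0, 0.0, 0.0, 0.0]))
print("iterations:", iters)
print("eigenvalue:", lam)
print("residual:", residual)
```

The returned residual measures how well the final iterate satisfies the nonlinear eigenvalue condition; a small value indicates a self-consistent point, which is a stationarity check only and is separate from the global optimality certificate mentioned in the abstract.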
Related papers
- Decomposed Global Optimization for Robust Point Matching with Low-Dimensional Branching [41.05165517541873]
We introduce a novel global optimization method for aligning partially overlapping point sets.
Our method exhibits superior robustness to non-rigid deformations, positional noise and outliers.
Experiments on 2D and 3D synthetic and real-world data demonstrate that our method, compared to state-of-the-art approaches, exhibits superior robustness to outliers.
arXiv Detail & Related papers (2024-05-14T13:28:57Z)
- DVMNet++: Rethinking Relative Pose Estimation for Unseen Objects [59.51874686414509]
Existing approaches typically predict 3D translation utilizing the ground-truth object bounding box and approximate 3D rotation with a large number of discrete hypotheses.
We present a Deep Voxel Matching Network (DVMNet++) that computes the relative object pose in a single pass.
Our approach delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation [21.900422840817726]
Accurately estimating the pose of an object is a crucial task in computer vision and robotics.
In this paper, we analyze these limitations and propose new strategies to overcome them.
Our experiments on Linemod, Linemod-Occlusion, and YCB-Video datasets demonstrate that our approach outperforms existing methods.
arXiv Detail & Related papers (2024-01-29T16:42:15Z)
- Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction [82.72686460985297]
We tackle the problem of estimating a Manhattan frame.
We derive two new 2-line solvers, one of which does not suffer from singularities affecting existing solvers.
We also design a new non-minimal method, running on an arbitrary number of lines, to boost the performance in local optimization.
arXiv Detail & Related papers (2023-08-21T13:03:25Z)
- Efficient first-order predictor-corrector multiple objective optimization for fair misinformation detection [5.139559672771439]
Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning.
We propose a Gauss-Newton approximation that scales only linearly and requires only first-order inner products per iteration.
The innovations make predictor-corrector possible for large networks.
arXiv Detail & Related papers (2022-09-15T12:32:15Z)
- Real Time Detection Free Tracking of Multiple Objects Via Equilibrium Optimizer [0.951828574518325]
Multiple-object tracking (MOT) is a difficult task, as it usually requires special hardware and more computation.
We present a new MOT framework that uses the equilibrium optimizer (EO) algorithm and reduces the resolution of the objects' bounding boxes.
Experimental results confirm that the EO multi-object tracker achieves satisfactory tracking results.
arXiv Detail & Related papers (2022-05-22T06:04:34Z)
- IFOR: Iterative Flow Minimization for Robotic Object Rearrangement [92.97142696891727]
IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, is an end-to-end method for the problem of object rearrangement for unknown objects.
We show that our method applies to cluttered scenes, and in the real world, while training only on synthetic data.
arXiv Detail & Related papers (2022-02-01T20:03:56Z)
- Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
arXiv Detail & Related papers (2021-03-24T23:11:32Z)
- Globally Optimal Relative Pose Estimation with Gravity Prior [63.74377065002315]
Smartphones, tablets and camera systems used, e.g., in cars and UAVs, are typically equipped with IMUs that can measure the gravity vector accurately.
We propose a novel globally optimal solver, minimizing the algebraic error in the least-squares sense, to estimate the relative pose in the over-determined case.
The proposed solvers are compared with the state-of-the-art ones on four real-world datasets with approx. 50000 image pairs in total.
arXiv Detail & Related papers (2020-12-01T13:09:59Z)
- Factor Graph based 3D Multi-Object Tracking in Point Clouds [8.411514688735183]
We propose a novel optimization-based approach that does not rely on explicit and fixed assignments.
We demonstrate its performance on the real world KITTI tracking dataset and achieve better results than many state-of-the-art algorithms.
arXiv Detail & Related papers (2020-08-12T13:34:46Z)
- Robust 6D Object Pose Estimation by Learning RGB-D Features [59.580366107770764]
We propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem.
We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction.
Experiments on two benchmarks, LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-02-29T06:24:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.