Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
- URL: http://arxiv.org/abs/2302.13926v1
- Date: Mon, 27 Feb 2023 16:23:19 GMT
- Title: Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
- Authors: David M. Klee and Ondrej Biza and Robert Platt and Robin Walters
- Abstract summary: Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty.
We propose a novel mapping of features from the image domain to the 3D rotation manifold.
We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset.
- Score: 3.823356975862006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the pose of objects from a single image is an important but
difficult computer vision problem. Methods that predict a single point estimate
do not predict the pose of objects with symmetries well and cannot represent
uncertainty. Alternatively, some works predict a distribution over orientations
in $\mathrm{SO}(3)$. However, training such models can be computation- and
sample-inefficient. Instead, we propose a novel mapping of features from the
image domain to the 3D rotation manifold. Our method then leverages
$\mathrm{SO}(3)$ equivariant layers, which are more sample efficient, and
outputs a distribution over rotations that can be sampled at arbitrary
resolution. We demonstrate the effectiveness of our method at object
orientation prediction, and achieve state-of-the-art performance on the popular
PASCAL3D+ dataset. Moreover, we show that our method can model complex object
symmetries, without any modifications to the parameters or loss function. Code
is available at https://dmklee.github.io/image2sphere.
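As a toy illustration of the output representation the abstract describes, the sketch below scores a discrete grid of candidate rotations and softmax-normalizes the scores into a probability distribution. This is not the paper's architecture (which uses SO(3)-equivariant layers and a full grid over SO(3)); the score function, the z-axis-only grid, and all names here are hypothetical stand-ins.

```python
import numpy as np

def rotz(theta):
    """Rotation matrix about the z-axis (toy 1-parameter subgroup of SO(3))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def so3_distribution(score_fn, grid):
    """Softmax-normalize per-rotation scores into a discrete distribution."""
    logits = np.array([score_fn(R) for R in grid])
    logits -= logits.max()            # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy score function: prefers rotations near the identity (trace is maximal there).
score = lambda R: np.trace(R)

# A denser grid would give the distribution at finer resolution.
grid = [rotz(t) for t in np.linspace(0.0, 2 * np.pi, 8, endpoint=False)]
p = so3_distribution(score, grid)
print(p.argmax())   # mode of the distribution -> index 0 (the identity rotation)
```

Because the distribution is defined by evaluating a score at any candidate rotation, the grid can be made as fine as desired at inference time, which is the "sampled at arbitrary resolution" property the abstract refers to.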
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- Diff-DOPE: Differentiable Deep Object Pose Estimation [29.703385848843414]
We introduce Diff-DOPE, a 6-DoF pose refiner that takes as input an image, a 3D textured model of an object, and an initial pose of the object.
The method uses differentiable rendering to update the object pose to minimize the visual error between the image and the projection of the model.
We show that this simple, yet effective, idea is able to achieve state-of-the-art results on pose estimation datasets.
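The render-and-compare idea behind this entry can be sketched in one dimension: project model points under a candidate pose, measure the error against the observation, and descend the gradient. The real method differentiates through a full renderer over a 6-DoF pose; the 2D rotation, the toy points, and the finite-difference gradient below are all simplifying assumptions.

```python
import numpy as np

def rotz(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])   # 2D rotation as a stand-in "projection"

model = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]])   # toy model points
true_theta = 0.6
observed = model @ rotz(true_theta).T                      # the "rendered" observation

def loss(theta):
    """Visual error between observation and the model projected at theta."""
    return np.sum((model @ rotz(theta).T - observed) ** 2)

theta, lr, eps = 0.0, 0.05, 1e-5
for _ in range(200):
    grad = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)  # finite difference
    theta -= lr * grad
print(round(theta, 3))   # converges toward true_theta = 0.6
```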
arXiv Detail & Related papers (2023-09-30T18:52:57Z)
- Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors [15.34487368683311]
We propose a framework that can reconstruct high-quality object-level maps for unknown objects.
Our approach takes multiple RGB-D images as input and outputs dense 3D shapes and 9-DoF poses for detected objects.
We derive a probabilistic formulation that propagates shape and pose uncertainty through two novel loss functions.
arXiv Detail & Related papers (2023-09-17T00:48:19Z)
- EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision [35.232051353760035]
We introduce Equivariant Neural Field Expectation Maximization (EFEM) to segment objects in 3D scenes without annotations or training on scenes.
First, we introduce equivariant shape representations to this problem to eliminate the complexity induced by the variation in object configuration.
Second, we propose a novel EM algorithm that can iteratively refine segmentation masks using the equivariant shape prior.
arXiv Detail & Related papers (2023-03-27T17:59:29Z)
- Learning Implicit Probability Distribution Functions for Symmetric Orientation Estimation from RGB Images Without Pose Labels [23.01797447932351]
We propose an automatic pose labeling scheme for RGB-D images.
We train an ImplicitPDF model to estimate the likelihood of an orientation hypothesis given an RGB image.
An efficient hierarchical sampling of the SO(3) manifold enables tractable generation of the complete set of symmetries.
arXiv Detail & Related papers (2022-11-21T12:07:40Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training.
We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z)
- Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [47.31074799708132]
We introduce a method to estimate arbitrary, non-parametric distributions on SO(3).
Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose.
We achieve state-of-the-art performance on Pascal3D+ and ModelNet10-SO(3) benchmarks.
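The implicit idea above, combined with the hierarchical SO(3) sampling mentioned two entries earlier, can be caricatured in one dimension: evaluate a pose-conditioned score on a coarse grid, keep the best cell, and subdivide it. The fixed bump function below stands in for the learned network, and "pose" is a single angle; both are assumptions for the sake of a runnable toy.

```python
import numpy as np

def f(theta):
    """Stand-in for the learned likelihood f(image, pose); peaked at 1.2 rad."""
    return np.exp(-8.0 * (theta - 1.2) ** 2)

lo, hi = 0.0, 2 * np.pi
for level in range(6):                 # coarse-to-fine refinement
    grid = np.linspace(lo, hi, 16)
    best = grid[np.argmax(f(grid))]    # mode of this level's candidates
    half = (hi - lo) / 16              # shrink the search window around the mode
    lo, hi = best - half, best + half
print(round(best, 2))   # close to the true peak at 1.2
```

Each level shrinks the window by 8x, so a handful of levels reaches fine angular resolution while only ever evaluating 16 candidates at a time; this is the flavor of efficiency the hierarchical-sampling claim refers to.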
arXiv Detail & Related papers (2021-06-10T17:57:23Z)
- Neural Articulated Radiance Field [90.91714894044253]
We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
arXiv Detail & Related papers (2021-04-07T13:23:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.