ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models
- URL: http://arxiv.org/abs/2306.17140v2
- Date: Thu, 30 Nov 2023 18:33:12 GMT
- Authors: Weihao Cheng, Yan-Pei Cao, Ying Shan
- Abstract summary: We present ID-Pose, which inverts the denoising diffusion process to estimate the relative pose given two input images.
We extend ID-Pose to handle more than two images and estimate each pose with multiple image pairs from triangular relations.
Results demonstrate that ID-Pose significantly outperforms state-of-the-art methods.
- Score: 43.86792681109704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given sparse views of a 3D object, estimating their camera poses is a
long-standing and intractable problem. Toward this goal, we consider harnessing
the pre-trained diffusion model of novel views conditioned on viewpoints
(Zero-1-to-3). We present ID-Pose, which inverts the denoising diffusion
process to estimate the relative pose given two input images. ID-Pose adds
noise to one image and predicts that noise conditioned on the other image and a
hypothesis of the relative pose. The prediction error is used as the
minimization objective to find the optimal pose with the gradient descent
method. We extend ID-Pose to handle more than two images and estimate each pose
with multiple image pairs from triangular relations. ID-Pose requires no
training and generalizes to open-world images. We conduct extensive experiments
using casually captured photos and rendered images with random viewpoints. The
results demonstrate that ID-Pose significantly outperforms state-of-the-art
methods.
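
The optimization loop described in the abstract (add noise to one view, predict it from the other view plus a pose hypothesis, then descend on the prediction error) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the "views" are small 2D point sets, the pose is a single rotation angle, and `toy_noise_predictor` is a hypothetical stand-in for the pre-trained Zero-1-to-3 denoiser, which is not used here. The gradient is taken by finite differences rather than backpropagation. It is not the authors' implementation.

```python
import math
import random


def rotate(points, theta):
    """Rotate 2D points by theta radians; stands in for a change of viewpoint."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def toy_noise_predictor(noisy, cond_view, pose, alpha):
    """Hypothetical stand-in for a pose-conditioned denoiser (not Zero-1-to-3).

    It "imagines" the target view by rotating the conditioning view with the
    hypothesized pose, then reports what is left of the noisy input as noise.
    """
    guess = rotate(cond_view, pose)
    signal, sigma = math.sqrt(alpha), math.sqrt(1.0 - alpha)
    return [((nx - signal * gx) / sigma, (ny - signal * gy) / sigma)
            for (nx, ny), (gx, gy) in zip(noisy, guess)]


def noise_prediction_error(pose, cond_view, target_view, noise, alpha):
    """Mean squared error between the injected noise and the predicted noise."""
    signal, sigma = math.sqrt(alpha), math.sqrt(1.0 - alpha)
    # Forward diffusion step: mix the target view with the sampled noise.
    noisy = [(signal * tx + sigma * ex, signal * ty + sigma * ey)
             for (tx, ty), (ex, ey) in zip(target_view, noise)]
    predicted = toy_noise_predictor(noisy, cond_view, pose, alpha)
    return sum((px - ex) ** 2 + (py - ey) ** 2
               for (px, py), (ex, ey) in zip(predicted, noise)) / len(noise)


def estimate_relative_pose(cond_view, target_view, init_pose=0.0, steps=200,
                           lr=0.05, alpha=0.5, seed=0, h=1e-4):
    """Gradient descent on the pose hypothesis via a central finite difference."""
    rng = random.Random(seed)
    pose = init_pose
    for _ in range(steps):
        # Sample fresh Gaussian noise for every optimization step.
        noise = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in target_view]
        up = noise_prediction_error(pose + h, cond_view, target_view, noise, alpha)
        down = noise_prediction_error(pose - h, cond_view, target_view, noise, alpha)
        pose -= lr * (up - down) / (2.0 * h)
    return pose


# Toy data: the second view is the first one rotated by an unknown angle.
cond_view = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5), (0.3, -0.8)]
true_pose = 0.7
target_view = rotate(cond_view, true_pose)
estimated_pose = estimate_relative_pose(cond_view, target_view)
```

In this toy setting the error surface has a single minimum at the true angle, so one starting point suffices. With a real diffusion model the objective is unlikely to be this well behaved, and the choice of initial pose hypothesis would matter much more.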
Related papers
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
- Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views.
We propose a distributed representation of camera pose that treats a camera as a bundle of rays.
Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models [5.908471365011943]
We propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image.
We show that DiffPose slightly improves upon the state of the art for multi-hypothesis pose estimation for simple poses and outperforms it by a large margin for highly ambiguous poses.
arXiv Detail & Related papers (2022-11-29T18:55:13Z)
- Stochastic Modeling for Learnable Human Pose Triangulation [0.7646713951724009]
We propose a modeling framework for 3D human pose triangulation and evaluate its performance across different datasets and spatial camera arrangements.
The proposed pose triangulation model successfully generalizes to different camera arrangements and between two public datasets.
arXiv Detail & Related papers (2021-10-01T09:26:25Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation from Human Images [42.27703025887059]
The main problem with the standard supervised approach is that it often yields anatomically implausible poses.
We propose a semi-supervised method that can make effective use of images with and without pose annotations.
The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible.
arXiv Detail & Related papers (2020-04-08T05:02:48Z)