A Solution for a Fundamental Problem of 3D Inference based on 2D
Representations
- URL: http://arxiv.org/abs/2211.04691v1
- Date: Wed, 9 Nov 2022 05:37:01 GMT
- Title: A Solution for a Fundamental Problem of 3D Inference based on 2D
Representations
- Authors: Thien An L. Nguyen
- Abstract summary: 3D inference from monocular vision using neural networks is an important research area of computer vision.
This paper provides an explainable and robust gradient-descent solution based on 2D representations for an important special case of the problem.
It opens up a new approach for using available information-based learning methods to solve problems related to 3D object pose estimation from 2D images.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D inference from monocular vision using neural networks is an important
research area of computer vision. The area has various applications, and many
proposed solutions have shown remarkable performance. Although
many efforts have been invested, there are still unanswered questions, some of
which are fundamental. In this paper, I discuss a problem that I hope will come
to be known as a generalization of the Blind Perspective-n-Point (Blind PnP)
problem for object-driven 3D inference based on 2D representations. The vital
difference between the fundamental problem and the Blind PnP problem is that, in
the fundamental problem, 3D inference parameters are attached directly to 3D
points, and the camera concept is represented through the sharing of the
parameters of these points. By providing an explainable and robust
gradient-descent solution based on 2D representations for an important special
case of the problem, the paper opens up a new approach for using available
information-based learning methods to solve problems related to 3D object pose
estimation from 2D images.
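To make the classical (non-blind) PnP setting concrete — the setting the paper generalizes — here is a minimal sketch of estimating a 6-DoF camera pose by gradient descent on reprojection error. It assumes a unit-focal-length pinhole camera and known 2D-3D correspondences; all function names, parameters, and the finite-difference gradient scheme are illustrative and not taken from the paper.

```python
import numpy as np

def rodrigues(rvec):
    """Axis-angle rotation vector -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(points3d, pose):
    """Pinhole projection with focal length 1 and principal point at the origin."""
    rvec, t = pose[:3], pose[3:]
    cam = points3d @ rodrigues(rvec).T + t       # world -> camera frame
    return cam[:, :2] / cam[:, 2:3]              # perspective division

def reprojection_error(pose, points3d, points2d):
    return np.sum((project(points3d, pose) - points2d) ** 2)

def solve_pnp_gd(points3d, points2d, iters=2000, lr=1e-2, eps=1e-6):
    """Toy PnP: gradient descent with finite-difference gradients."""
    pose = np.zeros(6)
    pose[5] = 5.0  # start with the camera well in front of the points
    for _ in range(iters):
        e0 = reprojection_error(pose, points3d, points2d)
        grad = np.zeros(6)
        for i in range(6):
            p = pose.copy()
            p[i] += eps
            grad[i] = (reprojection_error(p, points3d, points2d) - e0) / eps
        pose -= lr * grad
    return pose

# Synthetic check: recover a known pose from 8 random 2D-3D correspondences.
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1.0, 1.0, (8, 3))
true_pose = np.array([0.1, -0.05, 0.08, 0.2, -0.1, 4.0])
pts2d = project(pts3d, true_pose)
est = solve_pnp_gd(pts3d, pts2d)
```

In the Blind PnP variant, the 2D-3D correspondences above are unknown and must be recovered jointly with the pose; the paper's generalization further moves the inference parameters onto the 3D points themselves rather than a single camera.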
Related papers
- On the Efficacy of 3D Point Cloud Reinforcement Learning [20.4424883945357]
We focus on 3D point clouds, one of the most common forms of 3D representations.
We systematically investigate design choices for 3D point cloud RL, leading to the development of a robust algorithm for various robotic manipulation and control tasks.
We find that 3D point cloud RL can significantly outperform the 2D counterpart when agent-object / object-object relationship encoding is a key factor.
arXiv Detail & Related papers (2023-06-11T22:52:08Z)
- DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model [25.223801390996435]
This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection.
We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector.
We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets.
arXiv Detail & Related papers (2022-12-06T07:22:20Z)
- Uncertainty Guided Policy for Active Robotic 3D Reconstruction using Neural Radiance Fields [82.21033337949757]
This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation.
We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator.
We present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations.
arXiv Detail & Related papers (2022-09-17T21:28:57Z)
- Perspective-1-Ellipsoid: Formulation, Analysis and Solutions of the Camera Pose Estimation Problem from One Ellipse-Ellipsoid Correspondence [1.7188280334580193]
We introduce an ellipsoid-specific theoretical framework and demonstrate its beneficial properties in the context of pose estimation.
We show that the proposed formalism makes it possible to reduce the pose estimation problem to a position-only or orientation-only estimation problem.
arXiv Detail & Related papers (2022-08-26T09:15:20Z)
- Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation [73.40404343241782]
We propose a weakly supervised 6D object pose estimation approach based on 2D keypoint detection.
Our approach achieves comparable performance with state-of-the-art fully supervised approaches.
arXiv Detail & Related papers (2022-03-07T16:23:47Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods and is robust across different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method by 2.80% on the moderate test setting, without extra data.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization [44.85008070868851]
Blind Perspective-n-Point is the problem of estimating the position of a camera relative to a scene.
We propose the first fully end-to-end trainable network for solving the blind geometric problem efficiently and globally.
arXiv Detail & Related papers (2020-07-29T06:35:45Z)
- Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem [98.92148855291363]
This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.