A Pose-only Solution to Visual Reconstruction and Navigation
- URL: http://arxiv.org/abs/2103.01530v1
- Date: Tue, 2 Mar 2021 07:21:08 GMT
- Title: A Pose-only Solution to Visual Reconstruction and Navigation
- Authors: Qi Cai, Lilian Zhang, Yuanxin Wu, Wenxian Yu, Dewen Hu
- Abstract summary: Large-scale scenes and critical camera motions remain major challenges to visual reconstruction and navigation.
We propose a pose-only imaging geometry framework and algorithms that help address these challenges.
- Score: 23.86386627769292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual navigation and three-dimensional (3D) scene reconstruction are
essential for robotics to interact with the surrounding environment.
Large-scale scenes and critical camera motions remain major challenges to
achieving this goal. We propose a pose-only imaging geometry framework and
algorithms that help address these challenges. The
representation is a linear function of camera global translations, which allows
for efficient and robust camera motion estimation. As a result, the spatial
feature coordinates can be analytically reconstructed and do not require
nonlinear optimization. Experiments demonstrate that the computational
efficiency of recovering the scene and associated camera poses is significantly
improved by 2-4 orders of magnitude. This solution shows promise for unlocking
real-time 3D visual computing in many frontier applications.
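The abstract's key claim is that, once camera poses are known, spatial feature coordinates follow from a linear system with no nonlinear optimization. A minimal sketch of that idea is standard linear (DLT-style) triangulation with known poses; note this illustrates the general principle only, not the paper's specific pose-only formulation, and all function names here are illustrative.

```python
import numpy as np

def skew(v):
    """Return the 3x3 cross-product matrix [v]_x, so skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def triangulate_linear(obs, rotations, translations):
    """Reconstruct a 3D point X from normalized image observations x_i
    seen in views with known poses (R_i, t_i).

    Each view contributes the constraint [x_i]_x (R_i X + t_i) = 0,
    which is linear in X; stacking the constraints gives a small
    least-squares problem solved in closed form, with no nonlinear
    optimization required once the poses are fixed.
    """
    A_rows, b_rows = [], []
    for x, R, t in zip(obs, rotations, translations):
        S = skew(np.array([x[0], x[1], 1.0]))  # homogeneous observation
        A_rows.append(S @ R)
        b_rows.append(-S @ t)
    A = np.vstack(A_rows)
    b = np.concatenate(b_rows)
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

# Demo: a synthetic point observed by two cameras with a known baseline.
X_true = np.array([0.3, -0.2, 4.0])
R1, t1 = np.eye(3), np.zeros(3)
R2, t2 = np.eye(3), np.array([-1.0, 0.0, 0.0])  # unit baseline along x

def project(R, t, X):
    p = R @ X + t
    return p[:2] / p[2]  # normalized (calibrated) image coordinates

obs = [project(R1, t1, X_true), project(R2, t2, X_true)]
X_hat = triangulate_linear(obs, [R1, R2], [t1, t2])
```

With exact observations the stacked system is consistent, so the least-squares solution recovers the point exactly; with noisy observations it returns the algebraic least-squares estimate.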
Related papers
- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views [93.6881532277553]
We present FLARE, a feed-forward model designed to infer high-quality camera poses and 3D geometry from uncalibrated sparse-view images.
Our solution features a cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
arXiv Detail & Related papers (2025-02-17T18:54:05Z)
- VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment [62.6737516863285]
VideoLifter is a novel framework that incrementally optimizes a globally sparse-to-dense 3D representation directly from video sequences.
By tracking and propagating sparse point correspondences across frames and fragments, VideoLifter incrementally refines camera poses and 3D structure.
This approach significantly accelerates the reconstruction process, reducing training time by over 82% while surpassing current state-of-the-art methods in visual fidelity and computational efficiency.
arXiv Detail & Related papers (2025-01-03T18:52:36Z)
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes.
By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes.
We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
arXiv Detail & Related papers (2024-10-04T18:00:07Z)
- MultiViPerFrOG: A Globally Optimized Multi-Viewpoint Perception Framework for Camera Motion and Tissue Deformation [18.261678529996104]
We propose a framework that can flexibly integrate the output of low-level perception modules with kinematic and scene-modeling priors.
Overall, our method shows robustness to combined noisy input measures and can process hundreds of points in a few milliseconds.
arXiv Detail & Related papers (2024-08-08T10:55:55Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Its object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- DUSt3R: Geometric 3D Vision Made Easy [8.471330244002564]
We introduce DUSt3R, a novel paradigm for Dense and Unconstrained Stereo 3D Reconstruction of arbitrary image collections.
We show that this formulation smoothly unifies the monocular and binocular reconstruction cases.
Our formulation directly provides a 3D model of the scene as well as depth information; interestingly, from it we can seamlessly recover pixel matches and relative and absolute camera poses.
arXiv Detail & Related papers (2023-12-21T18:52:14Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatial-temporal information from multiple cameras, and monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- Lazy Visual Localization via Motion Averaging [89.8709956317671]
We show that it is possible to achieve high localization accuracy without reconstructing the scene from the database.
Experiments show that our visual localization proposal, LazyLoc, achieves comparable performance against state-of-the-art structure-based methods.
arXiv Detail & Related papers (2023-07-19T13:40:45Z)
- Towards Scalable Multi-View Reconstruction of Geometry and Materials [27.660389147094715]
We propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes.
The inputs are high-resolution RGBD images captured by a mobile, hand-held capture system with point lights for active illumination.
arXiv Detail & Related papers (2023-06-06T15:07:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.