Pose Refinement with Joint Optimization of Visual Points and Lines
- URL: http://arxiv.org/abs/2110.03940v1
- Date: Fri, 8 Oct 2021 07:22:51 GMT
- Title: Pose Refinement with Joint Optimization of Visual Points and Lines
- Authors: Shuang Gao, Jixiang Wan, Yishan Ping, Xudong Zhang, Shuzhou Dong,
Jijunnan Li, Yandong Guo
- Abstract summary: We propose a point-line joint optimization method for pose refinement with the help of the innovatively designed line extracting CNN named VLSE.
In this paper, we adopt a novel line representation and customize a hybrid convolutional block based on the Stacked Hourglass network.
A following point-line joint cost function is constructed to optimize the camera pose with the initial coarse pose.
- Score: 6.780018205514503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-precision camera re-localization technology in a pre-established 3D
environment map is the basis for many tasks, such as Augmented Reality,
Robotics and Autonomous Driving. The point-based visual re-localization
approaches are well-developed in recent decades, but are insufficient in some
feature-less cases. In this paper, we propose a point-line joint optimization
method for pose refinement with the help of the innovatively designed line
extracting CNN named VLSE, and the line matching and pose optimization
approach. We adopt a novel line representation and customize a hybrid
convolutional block based on the Stacked Hourglass network, to detect accurate
and stable line features on images. Then we apply a coarse-to-fine strategy to
obtain precise 2D-3D line correspondences based on the geometric constraint. A
following point-line joint cost function is constructed to optimize the camera
pose with the initial coarse pose. Sufficient experiments are conducted on open
datasets, i.e., line extractor on Wireframe and YorkUrban, localization
performance on Aachen Day-Night v1.1 and InLoc, to confirm the effectiveness of
our point-line joint pose optimization method.
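
The abstract describes a point-line joint cost function but does not state its form. As a rough illustration only, the sketch below assumes a common generic formulation: squared point reprojection errors plus a weighted sum of squared distances from projected 3D segment endpoints to their matched 2D lines. The function names, the `line_weight` parameter, the camera intrinsics, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def rodrigues(rvec):
    """Convert an axis-angle vector to a 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(K_cam, rvec, tvec, X):
    """Project Nx3 world points to Nx2 pixel coordinates."""
    Xc = X @ rodrigues(rvec).T + tvec
    uv = Xc @ K_cam.T
    return uv[:, :2] / uv[:, 2:3]

def line_from_endpoints(p, q):
    """Homogeneous 2D line (a, b, c) through p and q, scaled so a^2 + b^2 = 1."""
    l = np.cross(np.append(p, 1.0), np.append(q, 1.0))
    return l / np.linalg.norm(l[:2])

def joint_cost(K_cam, rvec, tvec, pts3d, pts2d, segs3d, lines2d, line_weight=1.0):
    """Assumed joint cost: point reprojection error + weighted endpoint-to-line error."""
    # Point term: pixel difference between projected 3D points and 2D matches.
    r_pt = project(K_cam, rvec, tvec, pts3d) - pts2d
    # Line term: signed distance of each projected segment endpoint to its 2D line.
    ends = project(K_cam, rvec, tvec, segs3d.reshape(-1, 3)).reshape(-1, 2, 2)
    ends_h = np.concatenate([ends, np.ones((*ends.shape[:2], 1))], axis=2)
    r_ln = np.einsum('mk,mek->me', lines2d, ends_h)
    return np.sum(r_pt ** 2) + line_weight * np.sum(r_ln ** 2)

# Synthetic scene (hypothetical values) observed from a known "true" pose.
rng = np.random.default_rng(0)
K_cam = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.1, 0.0, 4.0])
pts3d = rng.uniform(-1.0, 1.0, size=(20, 3))
segs3d = rng.uniform(-1.0, 1.0, size=(8, 2, 3))  # 8 segments, 2 endpoints each

pts2d = project(K_cam, rvec_true, tvec_true, pts3d)
seg_ends2d = project(K_cam, rvec_true, tvec_true, segs3d.reshape(-1, 3)).reshape(-1, 2, 2)
lines2d = np.array([line_from_endpoints(e[0], e[1]) for e in seg_ends2d])

# Evaluate the cost at the true pose and at a coarse (perturbed) pose.
cost_true = joint_cost(K_cam, rvec_true, tvec_true, pts3d, pts2d, segs3d, lines2d)
cost_coarse = joint_cost(K_cam, rvec_true + 0.02, tvec_true + 0.05,
                         pts3d, pts2d, segs3d, lines2d)
```

In this sketch the cost is essentially zero at the true pose and larger at the perturbed one; a refinement step would start from the coarse pose and minimize this cost with a nonlinear least-squares solver. Measuring line error as endpoint-to-line distance avoids needing 2D endpoint correspondences, which tend to be unstable for detected segments.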
Related papers
- Preference Score Distillation: Leveraging 2D Rewards to Align Text-to-3D Generation with Human Preference [69.34278282513593]
Preference Score Distillation (PSD) is an optimization-based framework for human-aligned text-to-3D synthesis without 3D training data. Our key insight stems from the incompatibility of pixel-level gradients. We introduce an adaptive strategy to co-optimize preference scores and negative text embeddings.
arXiv Detail & Related papers (2026-03-02T08:23:36Z) - JOGS: Joint Optimization of Pose Estimation and 3D Gaussian Splatting [10.35563602148445]
We propose a unified framework that jointly optimizes 3D Gaussian points and camera poses without requiring pre-calibrated inputs. Our approach iteratively refines 3D Gaussian parameters and updates camera poses through a novel co-optimization strategy. Our approach significantly outperforms existing COLMAP-free techniques in reconstruction quality, and also surpasses the standard COLMAP-based baseline in general.
arXiv Detail & Related papers (2025-10-30T04:00:07Z) - Adaptive Point-Prompt Tuning: Fine-Tuning Heterogeneous Foundation Models for 3D Point Cloud Analysis [51.37795317716487]
We propose the Adaptive Point-Prompt Tuning (APPT) method, which fine-tunes pre-trained models with a modest number of parameters. We convert raw point clouds into point embeddings by aggregating local geometry to capture spatial features followed by linear layers. To calibrate self-attention across source domains of any modality to 3D, we introduce a prompt generator that shares weights with the point embedding module.
arXiv Detail & Related papers (2025-08-30T06:02:21Z) - RiemanLine: Riemannian Manifold Representation of 3D Lines for Factor Graph Optimization [49.83974390433746]
This paper introduces RiemanLine, a unified minimal representation for 3D lines. Our key idea is to decouple each line landmark into global and local components. Experiments on ICL-NUIM, TartanAir, and synthetic benchmarks demonstrate that our method achieves significantly more accurate pose estimation and line reconstruction.
arXiv Detail & Related papers (2025-08-06T11:27:38Z) - Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting [51.08718483081347]
We propose a framework that couples multi-view joint distribution priors to ensure geometrically consistent 3D generation.<n>We derive an effective optimization rule that effectively couples multi-view priors to guide optimization across different viewpoints.<n>We employ a deformable tetrahedral grid, from 3D-GS and refined through CSD, to produce high-quality, refined meshes.
arXiv Detail & Related papers (2025-05-07T09:12:45Z) - A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds [37.043012716944496]
We introduce a constrained optimization method for simultaneous camera pose estimation and 3D reconstruction.
Experiments demonstrate that the proposed method significantly outperforms the existing (multi-modal) 3DGS baseline.
arXiv Detail & Related papers (2025-04-12T08:34:43Z) - Multiview Image-Based Localization [2.594420805049218]
This paper presents a hybrid approach that stores only image features in the database like some IR methods.
It relies on a latent 3D reconstruction, like 3D methods but without retaining a 3D scene reconstruction.
Our approach shows improved performance on the 7-Scenes and Cambridge Landmarks datasets while also improving on timing and memory footprint as compared to state-of-the-art.
arXiv Detail & Related papers (2025-03-30T20:00:31Z) - HGSLoc: 3DGS-based Heuristic Camera Pose Refinement [13.393035855468428]
Visual localization refers to the process of determining camera poses and orientation within a known scene representation.
In this paper, we propose HGSLoc, which integrates 3D reconstruction with a refinement strategy to achieve higher pose estimation accuracy.
Our method demonstrates a faster rendering speed and higher localization accuracy compared to NeRF-based neural rendering approaches.
arXiv Detail & Related papers (2024-09-17T06:48:48Z) - SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization [16.460851701725392]
We present a novel approach that optimizes radiance fields with scene graphs to mitigate the influence of outlier poses.
Our method incorporates an adaptive inlier-outlier confidence estimation scheme based on scene graphs.
We also introduce an effective intersection-over-union (IoU) loss to optimize the camera pose and surface geometry.
arXiv Detail & Related papers (2024-07-17T15:50:17Z) - OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera orientation conditioned framework for efficient and multi-view consistent 3D generation from textual prompts.
Our strategy emphasizes the implementation of an explicit camera orientation conditioned feature in the pre-training of a 2D text-to-image diffusion module.
Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also achieves an optimization speed significantly greater than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z) - InstantSplat: Sparse-view Gaussian Splatting in Seconds [91.77050739918037]
We introduce InstantSplat, a novel approach for addressing sparse-view 3D scene reconstruction at lightning-fast speed.
InstantSplat employs a self-supervised framework that optimizes 3D scene representation and camera poses.
It achieves an acceleration of over 30x in reconstruction and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D-GS.
arXiv Detail & Related papers (2024-03-29T17:29:58Z) - Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction [82.72686460985297]
We tackle the problem of estimating a Manhattan frame.
We derive two new 2-line solvers, one of which does not suffer from singularities affecting existing solvers.
We also design a new non-minimal method, running on an arbitrary number of lines, to boost the performance in local optimization.
arXiv Detail & Related papers (2023-08-21T13:03:25Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - Towards Scalable Multi-View Reconstruction of Geometry and Materials [27.660389147094715]
We propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes.
The inputs are high-resolution RGBD images captured by a mobile, hand-held capture system with point lights for active illumination.
arXiv Detail & Related papers (2023-06-06T15:07:39Z) - Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z) - Adaptive Joint Optimization for 3D Reconstruction with Differentiable Rendering [22.2095090385119]
Given an imperfect reconstructed 3D model, most previous methods have focused on the refinement of either geometry, texture, or camera pose.
We propose a novel optimization approach based on differentiable rendering, which integrates the optimization of camera pose, geometry, and texture into a unified framework.
Using differentiable rendering, an image-level adversarial loss is applied to further improve the 3D model, making it more photorealistic.
arXiv Detail & Related papers (2022-08-15T04:32:41Z) - Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION).
In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of the input sample.
In the second stage, we design a mesh refinement transformer (MRT) to refine each coarse reconstruction result via a self-attention mechanism.
Finally, a Consistency Estimation Network (CEN) is proposed to find the best result from multiple candidates by evaluating whether the visual evidence in the RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z) - Riggable 3D Face Reconstruction via In-Network Optimization [58.016067611038046]
This paper presents a method for riggable 3D face reconstruction from monocular images.
It jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations.
Experiments demonstrate that our method achieves SOTA reconstruction accuracy, reasonable robustness and generalization ability.
arXiv Detail & Related papers (2021-04-08T03:53:20Z) - Deep-3DAligner: Unsupervised 3D Point Set Registration Network With Optimizable Latent Vector [15.900382629390297]
We propose to develop a novel model that integrates optimization into learning, aiming to address the technical challenges in 3D registration.
In addition to the deep transformation decoding network, our framework introduces an optimizable deep Spatial Correlation Representation.
arXiv Detail & Related papers (2020-09-29T22:44:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.