Related papers: MeshVPR: Citywide Visual Place Recognition Using 3D Meshes

URL: http://arxiv.org/abs/2406.02776v2
Date: Wed, 24 Jul 2024 11:48:28 GMT
Title: MeshVPR: Citywide Visual Place Recognition Using 3D Meshes
Authors: Gabriele Berton, Lorenz Junglas, Riccardo Zaccone, Thomas Pollok, Barbara Caputo, Carlo Masone
Abstract summary: Mesh-based scene representation offers a promising direction for simplifying large-scale hierarchical visual localization pipelines. While existing work demonstrates the viability of meshes for visual localization, the impact of using synthetic databases rendered from them in visual place recognition remains largely unexplored. We propose MeshVPR, a novel VPR pipeline that utilizes a lightweight features alignment framework to bridge the gap between real-world and synthetic domains.
Score: 18.168206222895282
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mesh-based scene representation offers a promising direction for simplifying large-scale hierarchical visual localization pipelines, combining a visual place recognition step based on global features (retrieval) and a visual localization step based on local features. While existing work demonstrates the viability of meshes for visual localization, the impact of using synthetic databases rendered from them in visual place recognition remains largely unexplored. In this work we investigate using dense 3D textured meshes for large-scale Visual Place Recognition (VPR). We identify a significant performance drop when using synthetic mesh-based image databases compared to real-world images for retrieval. To address this, we propose MeshVPR, a novel VPR pipeline that utilizes a lightweight features alignment framework to bridge the gap between real-world and synthetic domains. MeshVPR leverages pre-trained VPR models and is efficient and scalable for city-wide deployments. We introduce novel datasets with freely available 3D meshes and manually collected queries from Berlin, Paris, and Melbourne. Extensive evaluations demonstrate that MeshVPR achieves competitive performance with standard VPR pipelines, paving the way for mesh-based localization systems. Data, code, and interactive visualizations are available at https://meshvpr.github.io/
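The retrieval step mentioned in the abstract (matching a query image to database images through global features) can be illustrated with a minimal sketch. This is not MeshVPR's code and does not include its feature alignment: the function names, the toy 3-dimensional descriptors, and the choice of cosine similarity are all assumptions made for illustration. In practice the descriptors would come from a pre-trained VPR model, and the database descriptors would be extracted from images rendered from the mesh.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve_top_k(query, database, k=1):
    """Return indices of the k database descriptors most similar to the query.

    Hypothetical helper: a brute-force cosine-similarity search over
    global descriptors, with ties broken by the lower index.
    """
    q = l2_normalize(query)
    scored = []
    for idx, desc in enumerate(database):
        d = l2_normalize(desc)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, idx))
    # Highest similarity first; lower index wins on ties.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored[:k]]


# Toy descriptors standing in for features of rendered database images (assumed values).
database = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
# Toy descriptor standing in for a real-world query image (assumed values).
query = [0.9, 0.1, 0.0]

top2 = retrieve_top_k(query, database, k=2)
print(top2)
```

With real data, the returned indices would point to database images whose known camera poses serve as the place estimate, or as the starting point for a later local-feature localization step.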

Related papers

vS-Graphs: Integrating Visual SLAM and Situational Graphs through Multi-level Scene Understanding [0.0]
This paper introduces visual S-Graphs (vS-Graphs), a novel real-time VSLAM framework. It integrates vision-based scene understanding with map reconstruction and comprehensible graph-based representation. Experiments on standard benchmarks and real-world datasets demonstrate that vS-Graphs outperforms state-of-the-art VSLAM methods.
arXiv Detail & Related papers (2025-03-03T18:15:11Z)
LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation [5.739362282280063]
LiteVLoc is a visual localization framework that uses a lightweight topo-metric map to represent the environment. It reduces storage overhead by leveraging learning-based feature matching and geometric solvers for metric pose estimation.
arXiv Detail & Related papers (2024-10-06T09:26:07Z)
SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality [50.179377002092416]
We propose an efficient visual localization method capable of high-quality rendering with fewer parameters. Our method achieves superior or comparable rendering and localization performance to state-of-the-art implicit-based visual localization approaches.
arXiv Detail & Related papers (2024-09-21T08:46:16Z)
Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations [8.522160106746478]
We present a global visual localization system capable of localizing a single camera image across various 3D map representations. Our system generates a database by synthesizing novel views of the scene, creating RGB and depth image pairs. NeRF synthesized images show superior performance, localizing query images at an average success rate of 72%.
arXiv Detail & Related papers (2024-08-21T19:37:17Z)
Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner. Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping. Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology. Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
MeshLoc: Mesh-Based Visual Localization [54.731309449883284]
We explore a more flexible alternative based on dense 3D meshes that does not require features matching between database images to build the scene representation. Surprisingly competitive results can be obtained when extracting features on renderings of these meshes, without any neural rendering stage. Our results show that dense 3D model-based representations are a promising alternative to existing representations and point to interesting and challenging directions for future research.
arXiv Detail & Related papers (2022-07-21T21:21:10Z)
TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework. For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes. TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways. It depicts the village of Hessigheim (Germany), henceforth referred to as H3D. It is designed both to promote research in the field of 3D data analysis and to evaluate and rank emerging approaches.
arXiv Detail & Related papers (2021-02-10T09:33:48Z)
Robust Image Retrieval-based Visual Localization using Kapture [10.249293519246478]
We present a versatile pipeline for visual localization that facilitates the use of different local and global features. We evaluate our methods on eight public datasets where they rank top on all and first on many of them. To foster future research, we release code, models, and all datasets used in this paper in the kapture format open source under a permissive BSD license.
arXiv Detail & Related papers (2020-07-27T21:10:35Z)
DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points. For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner. Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.