Hierarchical localization with panoramic views and triplet loss functions
- URL: http://arxiv.org/abs/2404.14117v2
- Date: Fri, 22 Nov 2024 15:51:52 GMT
- Title: Hierarchical localization with panoramic views and triplet loss functions
- Authors: Marcos Alfaro, Juan José Cabrera, María Flores, Óscar Reinoso, Luis Payá
- Abstract summary: The main objective of this paper is to tackle visual localization, which is essential for the safe navigation of mobile robots.
The solution we propose employs panoramic images and triplet convolutional neural networks.
To explore the limits of our approach, triplet networks have been tested in different indoor environments simultaneously.
- Score: 2.663377882489275
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The main objective of this paper is to tackle visual localization, which is essential for the safe navigation of mobile robots. The solution we propose employs panoramic images and triplet convolutional neural networks. We seek to exploit the properties of such architectures to address both hierarchical and global localization in indoor environments, which are prone to visual aliasing and other phenomena. Considering their importance in these architectures, a complete comparative evaluation of different triplet loss functions is performed. The experimental section proves that triplet networks can be trained with a relatively low number of images captured under a specific lighting condition and even so, the resulting networks are a robust tool to perform visual localization under dynamic conditions. Our approach has been evaluated against some of these effects, such as changes in the lighting conditions, occlusions, noise and motion blurring. Furthermore, to explore the limits of our approach, triplet networks have been tested in different indoor environments simultaneously. In all the cases, these architectures have demonstrated a great capability to generalize to diverse and challenging scenarios. The code used in the experiments is available at https://github.com/MarcosAlfaro/TripletNetworksIndoorLocalization.git.
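The paper's architectures are trained with triplet loss functions, which pull an anchor embedding toward a "positive" (same place) and push it away from a "negative" (different place). As a minimal NumPy sketch of the standard triplet margin loss (the exact variants compared in the paper may differ):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on embedding vectors.

    Encourages the anchor-positive distance to be smaller than the
    anchor-negative distance by at least `margin`.
    """
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy embeddings: the positive is close to the anchor, the negative far away.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([2.0, 0.0])
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```

For localization, the query image acts as the anchor, and retrieval reduces to finding the database embedding with the smallest distance to it.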
Related papers
- Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations [8.522160106746478]
We present a global visual localization system capable of localizing a single camera image across various 3D map representations.
Our system generates a database by synthesizing novel views of the scene, creating RGB and depth image pairs.
NeRF synthesized images show superior performance, localizing query images at an average success rate of 72%.
arXiv Detail & Related papers (2024-08-21T19:37:17Z) - An experimental evaluation of Siamese Neural Networks for robot localization using omnidirectional imaging in indoor environments [1.0485739694839669]
This paper addresses the localization problem using omnidirectional images captured by a catadioptric vision system mounted on the robot.
We explore the potential of Siamese Neural Networks for modeling indoor environments using panoramic images as the unique source of information.
arXiv Detail & Related papers (2024-07-15T08:44:37Z) - Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis [0.38366697175402226]
Pooling layers overlook important information encoded in the spatial arrangement of pixel intensity and/or feature values.
We propose a novel lacunarity pooling layer that aims to capture the spatial heterogeneity of the feature maps by evaluating the variability within local windows.
The lacunarity pooling layer can be seamlessly integrated into any artificial neural network architecture.
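Lacunarity is a gliding-box statistic of spatial heterogeneity; one common definition for a window is Λ = var/mean² + 1. A hypothetical sketch of such a pooling operation (not the authors' implementation; window size and the ε term are illustrative choices):

```python
import numpy as np

def lacunarity_pool(feature_map, window=2):
    """Hypothetical lacunarity pooling over non-overlapping windows.

    Each output cell holds Lambda = var / mean^2 + 1 for its window,
    measuring spatial heterogeneity rather than magnitude alone.
    """
    h, w = feature_map.shape
    out = np.zeros((h // window, w // window))
    for i in range(0, h - window + 1, window):
        for j in range(0, w - window + 1, window):
            patch = feature_map[i:i + window, j:j + window]
            mu = patch.mean()
            out[i // window, j // window] = patch.var() / (mu**2 + 1e-8) + 1.0
    return out

fm = np.array([[1., 1., 0., 4.],
               [1., 1., 4., 0.]])
print(lacunarity_pool(fm))  # homogeneous window -> 1.0, heterogeneous -> ~2.0
```

Unlike average or max pooling, a uniform window and a high-contrast window with the same mean produce different outputs here, which is exactly the spatial information the abstract says standard pooling discards.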
arXiv Detail & Related papers (2024-04-25T00:34:52Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) methods attempt to segment objects that are visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and camouflaged objects, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z) - Self-Supervised Feature Learning for Long-Term Metric Visual Localization [16.987148593917905]
We present a novel self-supervised feature learning framework for metric visual localization.
We use a sequence-based image matching algorithm to generate image correspondences without ground-truth labels.
We can then sample image pairs to train a deep neural network that learns sparse features with associated descriptors and scores without ground-truth pose supervision.
arXiv Detail & Related papers (2022-11-30T21:15:05Z) - Supervised Fine-tuning Evaluation for Long-term Visual Place Recognition [14.632777952261716]
We present a comprehensive study on the utility of deep convolutional neural networks with two state-of-the-art pooling layers.
We compare deep learned global features with three different loss functions, i.e. triplet, contrastive and ArcFace, for learning the parameters of the architectures.
Our investigation demonstrates that fine-tuning architectures with ArcFace loss in an end-to-end manner outperforms the other two losses by approximately 14% on outdoor and 12% on indoor datasets.
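ArcFace differs from triplet and contrastive losses by adding an angular margin m to the angle between the normalized embedding and the target-class weight before rescaling the logit. A minimal sketch of that target-class logit (margin and scale values are the common defaults, not necessarily the paper's):

```python
import numpy as np

def arcface_logit(embedding, class_weight, margin=0.5, scale=64.0):
    """ArcFace target-class logit: s * cos(theta + m), where theta is the
    angle between the L2-normalized embedding and class weight vector."""
    e = embedding / np.linalg.norm(embedding)
    w = class_weight / np.linalg.norm(class_weight)
    theta = np.arccos(np.clip(e @ w, -1.0, 1.0))
    return scale * np.cos(theta + margin)

# An embedding perfectly aligned with its class weight (theta = 0)
# still only reaches s * cos(m), so the margin enforces angular separation.
print(arcface_logit(np.array([1.0, 0.0]), np.array([2.0, 0.0])))
```

Because the penalty acts on the angle itself, ArcFace directly shapes the geodesic separation between classes on the hypersphere, which is one plausible reason for the gains the study reports.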
arXiv Detail & Related papers (2022-11-14T19:16:21Z) - Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z) - FuNNscope: Visual microscope for interactively exploring the loss landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks.
We generalize observations on small neural networks to more complex systems.
An interactive dashboard opens up a number of possible applications.
arXiv Detail & Related papers (2022-04-09T16:41:53Z) - Stereoscopic Universal Perturbations across Different Architectures and Datasets [60.021985610201156]
We study the effect of adversarial perturbations of images on deep stereo matching networks for the disparity estimation task.
We present a method to craft a single set of perturbations that, when added to any stereo image pair in a dataset, can fool a stereo network.
Our perturbations can increase D1-error (akin to fooling rate) of state-of-the-art stereo networks from 1% to as much as 87%.
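The D1-error cited above is the standard KITTI stereo metric: the fraction of pixels whose disparity error exceeds both 3 pixels and 5% of the ground-truth disparity. A minimal sketch:

```python
import numpy as np

def d1_error(pred, gt):
    """D1-error (KITTI stereo benchmark): fraction of pixels whose disparity
    error is both > 3 px and > 5% of the ground-truth disparity."""
    err = np.abs(pred - gt)
    bad = (err > 3.0) & (err > 0.05 * gt)
    return bad.mean()

gt = np.array([10.0, 50.0, 100.0, 200.0])
pred = np.array([10.5, 46.0, 90.0, 200.0])  # errors: 0.5, 4.0, 10.0, 0.0
print(d1_error(pred, gt))  # 0.5: two of four pixels exceed both thresholds
```

Under this metric, raising D1-error from 1% to 87% means the perturbation corrupts the disparity estimate at nearly every pixel.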
arXiv Detail & Related papers (2021-12-12T02:11:31Z) - Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
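The key insight above can be expressed as a consistency loss: composing any reference image's map pose with the estimated reference-to-query transform should yield the same absolute query pose, so disagreement between references is itself a training signal. A hypothetical translation-only 2D sketch (the actual method operates on full poses):

```python
import numpy as np

def transform_consistency_loss(ref_poses, rel_estimates):
    """Hypothetical translation-only consistency loss: each reference pose
    composed with its estimated relative transform gives a candidate absolute
    query pose; the loss penalizes disagreement among the candidates."""
    abs_estimates = [r + t for r, t in zip(ref_poses, rel_estimates)]
    mean_pose = np.mean(abs_estimates, axis=0)
    return sum(np.linalg.norm(p - mean_pose) ** 2 for p in abs_estimates)

refs = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
rels = [np.array([2.0, 1.0]), np.array([-2.0, 1.0])]  # both imply query at (2, 1)
print(transform_consistency_loss(refs, rels))  # 0.0: the estimates agree
```

No ground-truth query pose appears anywhere in the loss, which is what makes the approach self-supervised.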
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.