Hierarchical localization with panoramic views and triplet loss functions
- URL: http://arxiv.org/abs/2404.14117v1
- Date: Mon, 22 Apr 2024 12:07:10 GMT
- Title: Hierarchical localization with panoramic views and triplet loss functions
- Authors: Marcos Alfaro, Juan José Cabrera, Luis Miguel Jiménez, Óscar Reinoso, Luis Payá
- Abstract summary: The main objective of this paper is to address the mobile robot localization problem with Triplet Convolutional Neural Networks.
We have used omnidirectional images from real indoor environments captured in dynamic conditions that have been converted to panoramic format.
The experimental section proves that triplet neural networks are an efficient and robust tool to address the localization of mobile robots in indoor environments.
- Score: 1.8804426519412472
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The main objective of this paper is to address the mobile robot localization problem with Triplet Convolutional Neural Networks and test their robustness against changes in lighting conditions. We have used omnidirectional images from real indoor environments captured in dynamic conditions that have been converted to panoramic format. Two approaches are proposed to address localization by means of triplet neural networks. First, hierarchical localization, which estimates the robot position in two stages: a coarse localization, which involves a room retrieval task, followed by a fine localization, addressed by means of image retrieval within the previously selected room. Second, global localization, which estimates the position of the robot within the entire map in a single step. In addition, an exhaustive study of the influence of the loss function on the network learning process has been carried out. The experimental section proves that triplet neural networks are an efficient and robust tool to address the localization of mobile robots in indoor environments, considering real operation conditions.
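The abstract's core training signal can be illustrated with a minimal sketch of a triplet margin loss: the network is pushed to place the descriptor of a positive image (same place) closer to the anchor than that of a negative image (different place). This is a hypothetical toy example with made-up 3-D descriptors and margin value; the paper's actual architecture, descriptor dimensionality, and loss variants are not reproduced here.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the positive must be closer to the anchor than the
    negative by at least `margin`, measured in Euclidean distance."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 3-D descriptors: the positive lies near the anchor, the negative far away.
a = np.array([0.0, 0.0, 0.0])
p = np.array([0.1, 0.0, 0.0])
n = np.array([3.0, 0.0, 0.0])

print(triplet_loss(a, p, n))  # 0.1 - 3.0 + 1.0 = -1.9, clamped to 0.0
```

When the constraint is already satisfied by more than the margin, the loss is zero and the triplet contributes no gradient; hard triplets (negative closer than positive plus margin) drive the learning.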
Related papers
- Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z) - An experimental evaluation of Siamese Neural Networks for robot localization using omnidirectional imaging in indoor environments [1.0485739694839669]
This paper addresses the localization problem using omnidirectional images captured by a catadioptric vision system mounted on the robot.
We explore the potential of Siamese Neural Networks for modeling indoor environments using panoramic images as the unique source of information.
arXiv Detail & Related papers (2024-07-15T08:44:37Z) - ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z) - Active Semantic Localization with Graph Neural Embedding [1.3499500088995464]
In this work, we explore a lightweight, entirely CPU-based, domain-adaptive semantic localization framework, called graph neural localizer.
Our approach is inspired by two recently emerging technologies: (1) the scene graph, which combines the viewpoint and appearance invariance of local and global features; (2) the graph neural network, which enables direct learning/recognition of graph data.
Experiments on two scenarios, self-supervised learning and unsupervised domain adaptation, using a photo-realistic Habitat simulator validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-10T13:45:42Z) - PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation [67.41325356479229]
We propose to incorporate an auxiliary point-selective network into a meta-learning framework, called PointFix.
In a nutshell, our auxiliary network learns to fix local variants intensively by effectively back-propagating local information through the meta-gradient.
This network is model-agnostic, so it can be used with any kind of architecture in a plug-and-play manner.
arXiv Detail & Related papers (2022-07-27T07:48:29Z) - Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z) - Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z) - Sparse Image based Navigation Architecture to Mitigate the need of precise Localization in Mobile Robots [3.1556608426768324]
This paper focuses on mitigating the need for exact localization of a mobile robot to pursue autonomous navigation using a sparse set of images.
The proposed method consists of a model architecture, RoomNet, for unsupervised learning, which yields a coarse identification of the environment. This is followed by sparse image matching to characterise the similarity of the current frames with respect to the frames viewed by the robot during the mapping and training stage.
arXiv Detail & Related papers (2022-03-29T06:38:18Z) - Active Visual Localization in Partially Calibrated Environments [35.48595012305253]
Humans can robustly localize themselves without a map after getting lost, by following prominent visual cues or landmarks.
In this work, we aim at endowing autonomous agents with the same ability. Such an ability is important in robotics applications, yet very challenging when an agent is exposed to partially calibrated environments.
We propose an indoor scene dataset ACR-6, which consists of both synthetic and real data and simulates challenging scenarios for active visual localization.
arXiv Detail & Related papers (2020-12-08T08:00:55Z) - Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
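The key insight of the transform-consistency entry above can be sketched in a toy example: composing a reference image's absolute pose with the predicted query-to-reference relative transform should give the same query pose regardless of which reference is used. This is a hypothetical illustration with 2-D rigid transforms as homogeneous 3x3 matrices; the paper's actual pose representation and network are not reproduced here.

```python
import numpy as np

def se2(theta, tx, ty):
    """2-D rigid transform (rotation + translation) as a homogeneous 3x3 matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

# Absolute poses of two reference images and the query (ground truth, for illustration).
ref_a = se2(0.0, 1.0, 0.0)
ref_b = se2(np.pi / 2, 0.0, 2.0)
query = se2(np.pi / 4, 2.0, 1.0)

# Relative transforms a registration network would ideally predict.
rel_a = np.linalg.inv(ref_a) @ query
rel_b = np.linalg.inv(ref_b) @ query

# Consistency check: both reference paths must agree on the query's absolute pose.
pose_via_a = ref_a @ rel_a
pose_via_b = ref_b @ rel_b
consistency_loss = np.linalg.norm(pose_via_a - pose_via_b)
print(consistency_loss)  # ~0 for consistent predictions
```

Because the check needs no ground-truth query pose, only agreement between reference paths, it can serve as a self-supervised training signal, which is exactly what makes the approach usable when ground-truth correspondences are limited.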
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.