Real World Robotic Exploration using Deep Neural Networks Trained in Photorealistic Reconstructed Environments
- URL: http://arxiv.org/abs/2509.13342v1
- Date: Fri, 12 Sep 2025 00:03:04 GMT
- Title: Real World Robotic Exploration using Deep Neural Networks Trained in Photorealistic Reconstructed Environments
- Authors: Isaac Ronald Ward
- Abstract summary: An existing deep neural network approach for determining a robot's pose from visual information (RGB images) is modified. Photogrammetry data is used to produce a pose-labelled dataset which allows the above model to be trained on a local environment. This trained model forms the basis of a navigation algorithm, which is tested in real-time on a TurtleBot.
- Score: 1.3053649021965599
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this work, an existing deep neural network approach for determining a robot's pose from visual information (RGB images) is modified, improving its localization performance without impacting its ease of training. Explicitly, the network's loss function is extended in a manner which intuitively combines the positional and rotational error in order to increase robustness to perceptual aliasing. An improvement in localization accuracy for indoor scenes is observed, with decreases of up to 9.64% and 2.99% in the median positional and rotational error respectively, compared to the unmodified network. Additionally, photogrammetry data is used to produce a pose-labelled dataset which allows the above model to be trained on a local environment, resulting in localization accuracies of 0.11 m and 0.89 degrees. This trained model forms the basis of a navigation algorithm, which is tested in real-time on a TurtleBot (a wheeled robotic device). As such, this work introduces a full pipeline for creating a robust navigation algorithm for any given real-world indoor scene; the only requirement is a collection of images from the scene, which can be captured in as little as 330 seconds.
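As a concrete illustration of the kind of loss extension described in the abstract, the sketch below shows a PoseNet-style objective in which positional and rotational error are combined into a single term. The abstract does not give the exact formulation, so the weighted-sum form and the `beta` hyperparameter are assumptions, not the paper's method.

```python
import torch

def combined_pose_loss(p_pred, p_true, q_pred, q_true, beta=250.0):
    """PoseNet-style pose loss: a weighted combination of positional
    and rotational error. `beta` balances metres against quaternion
    distance and is an assumed value, not taken from the paper."""
    pos_err = torch.norm(p_pred - p_true, dim=-1)               # positional error in metres
    q_pred = q_pred / torch.norm(q_pred, dim=-1, keepdim=True)  # normalise predicted quaternion
    rot_err = torch.norm(q_pred - q_true, dim=-1)               # unit-quaternion distance
    return (pos_err + beta * rot_err).mean()
```

Tuning `beta` trades positional against rotational accuracy; coupling the two error terms in one objective is what the paper credits with increased robustness to perceptual aliasing.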
Related papers
- Self-localization on a 3D map by fusing global and local features from a monocular camera [0.0]
Self-localization based on a camera often uses a convolutional neural network (CNN) that extracts local features calculated from nearby pixels. This study proposes a new method combining a CNN with a Vision Transformer, which excels at extracting global features that capture the relationships between patches across the whole image.
arXiv Detail & Related papers (2025-10-30T06:14:22Z)
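A minimal sketch of the local/global fusion idea in the entry above, assuming a toy CNN branch for local features and a small Transformer encoder for global patch relations; all module names, dimensions, and the place-classification head are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LocalGlobalLocalizer(nn.Module):
    """Hypothetical CNN + Vision-Transformer fusion for self-localization."""
    def __init__(self, dim=256, n_places=100):
        super().__init__()
        self.cnn = nn.Sequential(                        # local features from nearby pixels
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.AdaptiveAvgPool2d(1))
        self.patch = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.vit = nn.TransformerEncoder(layer, num_layers=2)      # global patch relations
        self.head = nn.Linear(2 * dim, n_places)         # fused descriptor -> place label

    def forward(self, img):                               # img: (B, 3, H, W)
        local = self.cnn(img).flatten(1)                  # (B, dim)
        tokens = self.patch(img).flatten(2).transpose(1, 2)  # (B, N, dim)
        global_ = self.vit(tokens).mean(dim=1)            # (B, dim)
        return self.head(torch.cat([local, global_], dim=-1))
```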
- Hierarchical localization with panoramic views and triplet loss functions [2.663377882489275]
The main objective of this paper is to tackle visual localization, which is essential for the safe navigation of mobile robots.
The solution we propose employs panoramic images and triplet convolutional neural networks.
To explore the limits of our approach, triplet networks have been tested in different indoor environments simultaneously.
arXiv Detail & Related papers (2024-04-22T12:07:10Z)
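The triplet training signal named in the entry above can be written compactly; the embedding network is omitted here and the margin value is an assumption (PyTorch also provides this loss as `nn.TripletMarginLoss`).

```python
import torch.nn.functional as F

def triplet_place_loss(anchor, positive, negative, margin=0.5):
    """Triplet loss over panoramic-image embeddings: pull views of the
    same place together, push embeddings of other places apart."""
    d_pos = F.pairwise_distance(anchor, positive)   # same-place distance
    d_neg = F.pairwise_distance(anchor, negative)   # different-place distance
    return F.relu(d_pos - d_neg + margin).mean()    # hinge on the margin
```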
- PoseINN: Realtime Visual-based Pose Regression and Localization with Invertible Neural Networks [3.031375888004876]
Estimating ego-pose from cameras is an important problem in robotics with applications ranging from mobile robotics to augmented reality.
We propose to solve the problem by using invertible neural networks (INN) to find the mapping between the latent space of images and poses for a given scene.
Our model achieves similar performance to the SOTA while being faster to train and only requiring offline rendering of low-resolution synthetic data.
arXiv Detail & Related papers (2024-04-20T06:25:32Z)
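To make the invertible-network idea concrete, the sketch below shows one RealNVP-style affine coupling block, which can be evaluated in both directions and so could stand in for a latent-to-pose mapping. This is a generic INN building block under assumed dimensions, not PoseINN's actual architecture.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible coupling block: forward maps latents to outputs,
    inverse recovers latents exactly. `dim` must be even."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim))    # predicts scale and shift

    def forward(self, x, inverse=False):
        x1, x2 = x.chunk(2, dim=-1)                     # transform only half the dims
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                               # keep scales well-conditioned
        x2 = (x2 - t) * torch.exp(-s) if inverse else x2 * torch.exp(s) + t
        return torch.cat([x1, x2], dim=-1)
```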
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation [3.565151496245487]
We use Neural Radiance Fields as an implicit map of a given scene and propose a camera relocalization tailored for this representation.
The proposed method enables the precise position of a device to be computed in real time from a single RGB camera during navigation.
arXiv Detail & Related papers (2023-03-08T20:22:08Z)
- Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer [26.21320177775571]
We propose an end-to-end pose estimation framework that is capable of online camera-to-robot calibration and a self-supervised training method.
Our framework combines deep learning and geometric vision for solving the robot pose, and the pipeline is fully differentiable.
arXiv Detail & Related papers (2023-02-28T05:55:42Z)
- On the Application of Efficient Neural Mapping to Real-Time Indoor Localisation for Unmanned Ground Vehicles [5.137284292672375]
We show that a compact model capable of real-time inference on embedded platforms can be used to achieve localisation accuracy of several centimetres.
We deploy our trained model onboard a UGV platform, demonstrating its effectiveness in a waypoint navigation task.
arXiv Detail & Related papers (2022-11-09T07:23:28Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [64.80458128766254]
iSDF is a continual learning system for real-time signed distance field reconstruction.
It produces more accurate reconstructions and better approximations of collision costs and gradients.
arXiv Detail & Related papers (2022-04-05T15:48:39Z)
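To illustrate how a learned signed distance field yields collision costs and gradients, the sketch below queries a stand-in SDF MLP and backpropagates a safety-margin penalty; the network weights, the 0.2 m margin, and the sample points are all assumptions, not iSDF itself.

```python
import torch
import torch.nn as nn

# Stand-in for a trained iSDF-style model: maps a 3D point to a signed distance.
sdf = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))

points = torch.randn(8, 3, requires_grad=True)   # sample points on the robot body
dist = sdf(points)                               # signed distance to nearest surface
cost = torch.relu(0.2 - dist).sum()              # penalise points within a 0.2 m margin
cost.backward()                                  # descending points.grad moves points out of collision
```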
- Neural RF SLAM for unsupervised positioning and mapping with channel state information [51.484516640867525]
We present a neural network architecture for jointly learning user locations and environment mapping up to isometry.
The proposed model learns an interpretable latent, i.e., the user location, simply by enforcing a physics-based decoder.
arXiv Detail & Related papers (2022-03-15T21:32:44Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
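The key insight in the entry above can be written down directly: composing each reference image's absolute pose with the predicted query-to-reference relative pose should give the same absolute query pose, whichever reference is used. The sketch below penalises disagreement between the two compositions; representing poses as batched 4x4 homogeneous matrices is an assumption about the interface.

```python
import torch

def transform_consistency_loss(T_ref_a, T_ref_b, T_rel_a, T_rel_b):
    """All inputs are (B, 4, 4) homogeneous transforms. T_rel_* are the
    network's predicted query-to-reference relative poses (hypothetical)."""
    abs_a = T_ref_a @ T_rel_a             # query pose recovered via reference A
    abs_b = T_ref_b @ T_rel_b             # query pose recovered via reference B
    return (abs_a - abs_b).abs().mean()   # poses should agree regardless of reference
```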
- Object-based Illumination Estimation with Rendering-aware Neural Networks [56.01734918693844]
We present a scheme for fast environment light estimation from the RGBD appearance of individual objects and their local image areas.
With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent to the real scene.
arXiv Detail & Related papers (2020-08-06T08:23:19Z)