PoseINN: Realtime Visual-based Pose Regression and Localization with Invertible Neural Networks
- URL: http://arxiv.org/abs/2404.13288v3
- Date: Tue, 7 May 2024 14:56:00 GMT
- Title: PoseINN: Realtime Visual-based Pose Regression and Localization with Invertible Neural Networks
- Authors: Zirui Zang, Ahmad Amine, Rahul Mangharam
- Abstract summary: Estimating ego-pose from cameras is an important problem in robotics with applications ranging from mobile robotics to augmented reality.
We propose to solve the problem by using invertible neural networks (INN) to find the mapping between the latent space of images and poses for a given scene.
Our model achieves similar performance to the SOTA while being faster to train and only requiring offline rendering of low-resolution synthetic data.
- Score: 3.031375888004876
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Estimating ego-pose from cameras is an important problem in robotics with applications ranging from mobile robotics to augmented reality. While SOTA models are becoming increasingly accurate, they can still be unwieldy due to high computational costs. In this paper, we propose to solve the problem by using invertible neural networks (INN) to find the mapping between the latent space of images and poses for a given scene. Our model achieves similar performance to the SOTA while being faster to train and only requiring offline rendering of low-resolution synthetic data. By using normalizing flows, the proposed method also provides uncertainty estimation for the output. We also demonstrated the efficiency of this method by deploying the model on a mobile robot.
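The abstract's use of invertible neural networks and normalizing flows can be illustrated with a generic affine coupling layer, the standard building block of such models. The listing does not describe PoseINN's actual architecture, so the sketch below is not the authors' model: the function name `make_coupling`, the layer sizes, and the random weights are all hypothetical. It only shows the two properties the abstract relies on: the mapping can be inverted exactly, and the forward pass yields a log-determinant that a normalizing flow uses to evaluate densities, which is what makes uncertainty estimates possible.

```python
import math
import random

def make_coupling(dim, hidden=8, seed=0):
    """Build one affine coupling layer (hypothetical sizes, random weights)."""
    rng = random.Random(seed)
    d = dim // 2          # first half passes through unchanged
    out = dim - d         # second half is scaled and shifted
    w1 = [[rng.gauss(0, 0.5) for _ in range(hidden)] for _ in range(d)]
    w2 = [[rng.gauss(0, 0.5) for _ in range(2 * out)] for _ in range(hidden)]

    def scale_shift(x1):
        # Small MLP that predicts a log-scale and a shift from the first half.
        h = [math.tanh(sum(x1[i] * w1[i][j] for i in range(d)))
             for j in range(hidden)]
        o = [sum(h[j] * w2[j][k] for j in range(hidden))
             for k in range(2 * out)]
        log_s = [math.tanh(v) for v in o[:out]]  # bounded for stability
        t = o[out:]
        return log_s, t

    def forward(x):
        x1, x2 = x[:d], x[d:]
        log_s, t = scale_shift(x1)
        y2 = [a * math.exp(s) + b for a, s, b in zip(x2, log_s, t)]
        # log|det Jacobian| is the sum of the log-scales.
        return x1 + y2, sum(log_s)

    def inverse(y):
        y1, y2 = y[:d], y[d:]
        log_s, t = scale_shift(y1)   # y1 == x1, so the same scale and shift
        x2 = [(a - b) * math.exp(-s) for a, s, b in zip(y2, log_s, t)]
        return y1 + x2

    return forward, inverse

forward, inverse = make_coupling(6)
x = [0.3, -1.2, 0.7, 2.0, -0.5, 1.1]
y, log_det = forward(x)
x_rec = inverse(y)
```

Because the first half of the input is copied through untouched, the inverse can recompute the same scale and shift and undo the transformation exactly, without inverting the small network itself.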
Related papers
- Neural Potential Field for Obstacle-Aware Local Motion Planning [46.42871544295734]
We propose a neural network model that returns a differentiable collision cost based on robot pose, obstacle map, and robot footprint.
Our architecture includes neural image encoders, which transform obstacle maps and robot footprints into embeddings.
Experiments on a Husky UGV mobile robot showed that our approach allows real-time and safe local planning.
arXiv Detail & Related papers (2023-10-25T05:00:21Z)
- High-Degrees-of-Freedom Dynamic Neural Fields for Robot Self-Modeling and Motion Planning [6.229216953398305]
A robot self-model is a representation of the robot's physical morphology that can be used for motion planning tasks.
We propose a new encoder-based neural density field architecture for dynamic object-centric scenes conditioned on high numbers of degrees of freedom.
In a 7-DOF robot test setup, the learned self-model achieves a Chamfer-L2 distance of 2% of the robot's workspace dimension.
arXiv Detail & Related papers (2023-10-05T16:01:29Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer [26.21320177775571]
We propose an end-to-end pose estimation framework that is capable of online camera-to-robot calibration and a self-supervised training method.
Our framework combines deep learning and geometric vision for solving the robot pose, and the pipeline is fully differentiable.
arXiv Detail & Related papers (2023-02-28T05:55:42Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamic and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the algorithm estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- Making DensePose fast and light [78.49552144907513]
Existing neural network models capable of solving the DensePose estimation task are heavily parameterized.
To enable Dense Pose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection.
In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes more light-weight and fast.
arXiv Detail & Related papers (2020-06-26T19:42:20Z)
- Hyperparameters optimization for Deep Learning based emotion prediction for Human Robot Interaction [0.2549905572365809]
We have proposed an Inception-module-based Convolutional Neural Network architecture.
The model is implemented on a humanoid robot, NAO, in real time, and the robustness of the model is evaluated.
arXiv Detail & Related papers (2020-01-12T05:25:02Z)