Real-time Multi-view Omnidirectional Depth Estimation System for Robots and Autonomous Driving on Real Scenes
- URL: http://arxiv.org/abs/2409.07843v1
- Date: Thu, 12 Sep 2024 08:44:35 GMT
- Title: Real-time Multi-view Omnidirectional Depth Estimation System for Robots and Autonomous Driving on Real Scenes
- Authors: Ming Li, Xiong Yang, Chaofan Wu, Jiaheng Li, Pinzhi Wang, Xuejiao Hu, Sidan Du, Yang Li
- Abstract summary: We propose a robotic prototype system and corresponding algorithm designed to validate omnidirectional depth estimation for navigation and obstacle avoidance in real-world scenarios for both robots and vehicles.
We introduce a combined spherical sweeping method and optimize the model architecture for the proposed RtHexa-OmniMVS algorithm to achieve real-time omnidirectional depth estimation.
The proposed algorithm demonstrates high accuracy in various complex real-world scenarios, both indoors and outdoors, achieving an inference speed of 15 fps on edge computing platforms.
- Score: 9.073031720400401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Omnidirectional depth estimation has broad application prospects in fields such as robotic navigation and autonomous driving. In this paper, we propose a robotic prototype system and corresponding algorithm designed to validate omnidirectional depth estimation for navigation and obstacle avoidance in real-world scenarios for both robots and vehicles. The proposed HexaMODE system captures 360$^\circ$ depth maps using six fisheye cameras arranged in a surround layout. We introduce a combined spherical sweeping method and optimize the model architecture for the proposed RtHexa-OmniMVS algorithm to achieve real-time omnidirectional depth estimation. To ensure high accuracy, robustness, and generalization in real-world environments, we employ a teacher-student self-training strategy, utilizing large-scale unlabeled real-world data for model training. The proposed algorithm demonstrates high accuracy in various complex real-world scenarios, both indoors and outdoors, achieving an inference speed of 15 fps on edge computing platforms.
Related papers
- Enhancing Navigation Benchmarking and Perception Data Generation for Row-based Crops in Simulation [0.3518016233072556]
This paper presents a synthetic dataset to train semantic segmentation networks and a collection of virtual scenarios for a fast evaluation of navigation algorithms.
An automatic parametric approach is developed to explore different field geometries and features.
The simulation framework and the dataset have been evaluated by training a deep segmentation network on different crops and benchmarking the resulting navigation.
arXiv Detail & Related papers (2023-06-27T14:46:09Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- Deterministic and Stochastic Analysis of Deep Reinforcement Learning for Low Dimensional Sensing-based Navigation of Mobile Robots [0.41562334038629606]
This paper presents a comparative analysis of two Deep-RL techniques: Deep Deterministic Policy Gradients (DDPG) and Soft Actor-Critic (SAC).
We aim to contribute by showing how the neural network architecture influences the learning itself, presenting quantitative results based on the time and distance of aerial mobile robots for each approach.
arXiv Detail & Related papers (2022-09-13T22:28:26Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- High-Speed Robot Navigation using Predicted Occupancy Maps [0.0]
We study algorithmic approaches that allow the robot to predict spaces extending beyond the sensor horizon for robust planning at high speeds.
We accomplish this using a generative neural network trained from real-world data without requiring human annotated labels.
We extend our existing control algorithms to support leveraging the predicted spaces to improve collision-free planning and navigation at high speeds.
arXiv Detail & Related papers (2020-12-22T16:25:12Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)