Object Depth and Size Estimation using Stereo-vision and Integration with SLAM
- URL: http://arxiv.org/abs/2409.07623v1
- Date: Wed, 11 Sep 2024 21:12:48 GMT
- Title: Object Depth and Size Estimation using Stereo-vision and Integration with SLAM
- Authors: Layth Hamad, Muhammad Asif Khan, Amr Mohamed
- Abstract summary: We propose a highly accurate stereo-vision approach to complement LiDAR in autonomous robots.
The system employs advanced stereo vision-based detection to identify both tangible and non-tangible objects.
The depth and size information is then integrated into the SLAM process to enhance the robot's navigation capabilities in complex environments.
- Score: 2.122581579741322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous robots use simultaneous localization and mapping (SLAM) for efficient and safe navigation in various environments. LiDAR sensors are integral in these systems for object identification and localization. However, LiDAR systems, though effective in detecting solid objects (e.g., trash bins, bottles), encounter limitations in identifying semitransparent or non-tangible objects (e.g., fire, smoke, steam) due to poor reflective characteristics. LiDAR also fails to detect features such as navigation signs and often struggles to detect certain hazardous materials that lack a distinct surface for effective laser reflection. In this paper, we propose a highly accurate stereo-vision approach to complement LiDAR in autonomous robots. The system employs advanced stereo vision-based detection to identify both tangible and non-tangible objects, and then uses simple machine learning to precisely estimate the depth and size of each object. The depth and size information is then integrated into the SLAM process to enhance the robot's navigation capabilities in complex environments. Our evaluation, conducted on an autonomous robot equipped with LiDAR and stereo-vision systems, demonstrates high accuracy in the estimation of an object's depth and size. A video illustration of the proposed scheme is available at: https://www.youtube.com/watch?v=nusI6tA9eSk
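The paper's pipeline is not published as code here, but the disparity-to-depth geometry it builds on can be sketched. The snippet below is a minimal illustration assuming OpenCV's semi-global matcher and placeholder camera parameters (FOCAL_PX and BASELINE_M are invented values, and the learned refinement step the abstract mentions is omitted):

```python
import cv2
import numpy as np

FOCAL_PX = 700.0    # focal length in pixels -- illustrative placeholder
BASELINE_M = 0.12   # stereo baseline in metres -- illustrative placeholder

def disparity_map(left_gray, right_gray):
    """Dense disparity via semi-global block matching."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    # StereoSGBM returns fixed-point disparities scaled by 16.
    return matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0

def object_depth_and_width(disp, bbox):
    """Median depth inside a detection box, plus metric width from it."""
    x, y, w, h = bbox
    roi = disp[y:y + h, x:x + w]
    d = np.median(roi[roi > 0])          # ignore invalid (non-positive) disparities
    depth_m = FOCAL_PX * BASELINE_M / d  # pinhole relation: Z = f * B / d
    width_m = depth_m * w / FOCAL_PX     # back-project pixel width to metres
    return depth_m, width_m
```

The same pinhole relation that converts disparity to depth also scales the detector's pixel width into a metric size, which is the kind of per-object quantity that can then be handed to a SLAM back end.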
Related papers
- Efficient Real-time Smoke Filtration with 3D LiDAR for Search and Rescue with Autonomous Heterogeneous Robotic Systems [56.838297900091426]
Smoke and dust degrade the performance of mobile robotic platforms, which rely on onboard perception systems.
This paper proposes a novel modular filtration pipeline based on intensity and spatial information.
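As a rough illustration of what an intensity-and-spatial filter can look like, the sketch below drops low-intensity, spatially sparse returns; the thresholds and the SciPy-based neighbourhood test are assumptions, not the paper's pipeline:

```python
import numpy as np
from scipy.spatial import cKDTree

def filter_airborne_noise(points, intensity, min_intensity=0.15,
                          radius=0.3, min_neighbors=5):
    """Keep returns that are strong reflectors AND locally dense.

    Smoke/dust returns tend to be weak and scattered, while solid
    surfaces form dense clusters; thresholds here are invented.
    """
    strong = intensity >= min_intensity
    counts = cKDTree(points).query_ball_point(points, r=radius,
                                              return_length=True)
    dense = np.asarray(counts) >= min_neighbors
    return points[strong & dense]
```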
arXiv Detail & Related papers (2023-08-14T16:48:57Z)
- Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents [43.137917788594926]
We propose a tightly-coupled LiDAR-visual SLAM based on geometric features.
The complete line segments detected by the visual subsystem overcome the limitations of the LiDAR subsystem.
Our system achieves more accurate and robust pose estimation compared to current state-of-the-art multi-modal methods.
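A common way to couple such line features into pose optimization is a point-to-line distance residual; the generic form below is an illustration, not the paper's exact formulation:

```python
import numpy as np

def point_to_line_residual(p, a, b):
    """Distance from 3D point p to the infinite line through a and b."""
    d = b - a
    return np.linalg.norm(np.cross(p - a, d)) / np.linalg.norm(d)

# e.g. point_to_line_residual(np.array([1., 0., 0.]),
#                             np.zeros(3), np.array([0., 0., 1.]))  -> 1.0
```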
arXiv Detail & Related papers (2023-07-15T10:06:43Z)
- Performance Study of YOLOv5 and Faster R-CNN for Autonomous Navigation around Non-Cooperative Targets [0.0]
This paper discusses how the combination of cameras and machine learning algorithms can achieve the relative navigation task.
The performance of two deep-learning-based object detection algorithms, Faster Region-based Convolutional Neural Network (Faster R-CNN) and You Only Look Once (YOLOv5), is tested.
The paper discusses the path to implementing the feature recognition algorithms and to integrating them into the spacecraft's Guidance, Navigation, and Control system.
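A head-to-head comparison of the two detectors can be prototyped with off-the-shelf weights, as in the hedged sketch below (torchvision and the ultralytics/yolov5 hub model stand in for the paper's trained networks, and latency on random input is only a rough proxy):

```python
import time
import torch
import torchvision

# Off-the-shelf weights as stand-ins for the paper's trained models.
frcnn = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
yolo = torch.hub.load("ultralytics/yolov5", "yolov5s")  # downloads on first use

img = torch.rand(3, 640, 640)  # stand-in for a real camera frame

with torch.no_grad():
    t0 = time.perf_counter()
    frcnn([img])               # torchvision detectors take a list of CHW tensors
    t1 = time.perf_counter()
    yolo((img.permute(1, 2, 0).numpy() * 255).astype("uint8"))  # HWC uint8 array
    t2 = time.perf_counter()

print(f"Faster R-CNN: {t1 - t0:.2f}s  YOLOv5s: {t2 - t1:.2f}s per frame (CPU)")
```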
arXiv Detail & Related papers (2023-01-22T04:53:38Z)
- LiDAR-guided object search and detection in Subterranean Environments [12.265807098187297]
This work exploits the complementary nature of vision and depth sensors, leveraging multi-modal information to aid object detection at longer distances.
The proposed work has been thoroughly verified using an ANYmal quadruped robot in underground settings and on datasets collected during the DARPA Subterranean Challenge finals.
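One standard form of this camera-LiDAR association is to project LiDAR points into the image and attach a range to each 2D detection; the sketch below shows that generic step under assumed extrinsics T_cam_lidar and intrinsics K, not the paper's implementation:

```python
import numpy as np

def depth_for_detection(points_lidar, T_cam_lidar, K, bbox):
    """Median range of LiDAR points that project inside a 2D detection box.

    T_cam_lidar: 4x4 extrinsics, K: 3x3 intrinsics, bbox: (x0, y0, x1, y1).
    """
    pts_h = np.c_[points_lidar, np.ones(len(points_lidar))]
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0]   # keep points in front of the camera
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]            # perspective divide
    x0, y0, x1, y1 = bbox
    inside = (uv[:, 0] >= x0) & (uv[:, 0] < x1) & \
             (uv[:, 1] >= y0) & (uv[:, 1] < y1)
    return float(np.median(pts_cam[inside, 2])) if inside.any() else float("nan")
```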
arXiv Detail & Related papers (2022-10-26T19:38:19Z)
- Comparative study of 3D object detection frameworks based on LiDAR data and sensor fusion techniques [0.0]
The perception system plays a significant role in providing an accurate interpretation of a vehicle's environment in real-time.
Deep learning techniques transform the large volumes of sensor data into semantic information.
By utilizing additional pose data from sensors such as LiDARs and stereo cameras, 3D object detection methods provide information on the size and location of objects.
arXiv Detail & Related papers (2022-02-05T09:34:58Z)
- Self-Supervised Depth Completion for Active Stereo [55.79929735390945]
Active stereo systems are widely used in the robotics industry due to their low cost and high-quality depth maps.
However, these depth sensors suffer from stereo artefacts and do not provide dense depth estimates.
We present the first self-supervised depth completion method for active stereo systems that predicts accurate dense depth maps.
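Self-supervision for stereo depth typically comes from a photometric reconstruction loss: warp one view into the other with the predicted disparity and penalize the difference. A bare-bones PyTorch version is sketched below; the actual method's loss terms (smoothness, occlusion handling) will differ:

```python
import torch
import torch.nn.functional as F

def photometric_loss(left, right, disp):
    """L1 error between the left image and the right image warped into the
    left view by the predicted disparity (in pixels).

    left, right: (B, C, H, W); disp: (B, 1, H, W).
    """
    b, _, h, w = left.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing="ij")
    grid = torch.stack((xs, ys), -1).unsqueeze(0).expand(b, -1, -1, -1).to(left)
    grid = grid.clone()
    # A left-view pixel u matches right-view pixel u - d; shift normalised x.
    grid[..., 0] = grid[..., 0] - 2.0 * disp.squeeze(1) / (w - 1)
    warped = F.grid_sample(right, grid, align_corners=True)
    return (left - warped).abs().mean()
```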
arXiv Detail & Related papers (2021-10-07T07:33:52Z)
- Large-scale Autonomous Flight with Real-time Semantic SLAM under Dense Forest Canopy [48.51396198176273]
We propose an integrated system that can perform large-scale autonomous flights and real-time semantic mapping in challenging under-canopy environments.
We detect and model tree trunks and ground planes from LiDAR data, which are associated across scans and used to constrain robot poses as well as tree trunk models.
A drift-compensation mechanism is designed to minimize the odometry drift using semantic SLAM outputs in real time, while maintaining planner optimality and controller stability.
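If trunks are modeled as vertical cylinders, a pose-constraining residual can be as simple as the horizontal distance of a LiDAR point from the trunk axis minus the radius; this illustrative form is an assumption, not the paper's exact model:

```python
import numpy as np

def trunk_residual(point, axis_xy, radius):
    """Horizontal distance of a 3D LiDAR point from a vertical trunk axis,
    minus the trunk radius; zero when the point lies on the cylinder."""
    return float(np.linalg.norm(point[:2] - axis_xy) - radius)
```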
arXiv Detail & Related papers (2021-09-14T07:24:53Z)
- Domain and Modality Gaps for LiDAR-based Person Detection on Mobile Robots [91.01747068273666]
This paper studies existing LiDAR-based person detectors with a particular focus on mobile robot scenarios.
Experiments revolve around the domain gap between driving and mobile robot scenarios, as well as the modality gap between 3D and 2D LiDAR sensors.
Results provide practical insights into LiDAR-based person detection and facilitate informed decisions for relevant mobile robot designs and applications.
arXiv Detail & Related papers (2021-06-21T16:35:49Z)
- High-level camera-LiDAR fusion for 3D object detection with machine learning [0.0]
This paper tackles the 3D object detection problem, which is of vital importance for applications such as autonomous driving.
It uses a Machine Learning pipeline on a combination of monocular camera and LiDAR data to detect vehicles in the surrounding 3D space of a moving platform.
Our results demonstrate efficient and accurate inference on a validation set, achieving an overall accuracy of 87.1%.
arXiv Detail & Related papers (2021-05-24T01:57:34Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then, object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
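The pseudo-LiDAR intermediate that PLUME unifies away is itself a simple back-projection of the depth map through the camera intrinsics; the sketch below makes that baseline step concrete:

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project an (H, W) metric depth map into an (H*W, 3) point cloud
    using the 3x3 intrinsics K -- the classic pseudo-LiDAR construction."""
    h, w = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    x = (us - cx) * depth / fx
    y = (vs - cy) * depth / fy
    return np.stack((x, y, depth), axis=-1).reshape(-1, 3)
```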
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our method is validated on complex quadruped robot dynamics, and the approach can be generally applied to most robotic platforms.
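As a loose sketch of the covariance-inference idea, an RNN can map a history of state estimates to predicted variances; the architecture below is entirely illustrative (the sizes and the diagonal-covariance simplification are assumptions):

```python
import torch
import torch.nn as nn

class CovarianceRNN(nn.Module):
    """GRU that maps a history of state estimates to predicted per-state
    variances; sizes and the diagonal simplification are illustrative."""
    def __init__(self, state_dim=6, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(state_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, state_dim)

    def forward(self, states):            # states: (B, T, state_dim)
        out, _ = self.rnn(states)
        # softplus keeps the predicted variances strictly positive
        return nn.functional.softplus(self.head(out[:, -1]))
```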
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.