RGB-D based Stair Detection using Deep Learning for Autonomous Stair
Climbing
- URL: http://arxiv.org/abs/2212.01098v1
- Date: Fri, 2 Dec 2022 11:22:52 GMT
- Title: RGB-D based Stair Detection using Deep Learning for Autonomous Stair
Climbing
- Authors: Chen Wang, Zhongcai Pei, Shuang Qiu, Zhiyong Tang
- Abstract summary: We propose a neural network architecture with inputs of both RGB map and depth map.
Specifically, we design a selective module that enables the network to learn the complementary relationship between the RGB map and the depth map.
Experiments on our dataset show that our method can achieve better accuracy and recall compared with the previous state-of-the-art deep learning method.
- Score: 6.362951673024623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stairs are common building structures in urban environments, and stair
detection is an important part of environment perception for autonomous mobile
robots. Most existing algorithms have difficulty combining the visual
information from binocular sensors effectively and ensuring reliable detection
at night or when visual cues are extremely fuzzy. To solve these problems, we
propose a neural network architecture that takes both an RGB map and a depth map
as input. Specifically, we design a selective module that enables the network to
learn the complementary relationship between the RGB map and the depth map and
to combine information from both effectively across different scenes. In
addition, we design a line clustering algorithm for post-processing the
detection results, which makes full use of those results to obtain the
geometric parameters of stairs. Experiments on our dataset show that our method
improves accuracy and recall by 5.64% and 7.97%, respectively, compared with the
previous state-of-the-art deep learning method. Our method also has an extremely
fast detection speed: a lightweight version achieves 300+ frames per second at
the same resolution, which meets the needs of most real-time detection scenarios.
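The two ideas in the abstract, a selective module that weights RGB against depth features, and line clustering to recover stair geometry, can be sketched in simplified form. This is a minimal illustration under assumptions, not the authors' implementation: `selective_fuse`, its gating scores, and the `gap` threshold in `cluster_lines` are hypothetical stand-ins for the learned module and post-processing described in the paper.

```python
# Hypothetical sketch (not the paper's code): a softmax-gated blend of RGB and
# depth feature vectors, plus a 1-D clustering pass that groups detected
# stair-edge line heights into one cluster per step.
import math

def selective_fuse(rgb_feat, depth_feat, rgb_score, depth_score):
    """Blend two feature vectors with softmax weights over two scalar scores.

    The scores stand in for what a gating subnetwork would predict per scene
    (e.g. a low rgb_score at night, shifting weight toward the depth branch).
    """
    e_r, e_d = math.exp(rgb_score), math.exp(depth_score)
    w_r, w_d = e_r / (e_r + e_d), e_d / (e_r + e_d)
    return [w_r * r + w_d * d for r, d in zip(rgb_feat, depth_feat)]

def cluster_lines(heights, gap=0.05):
    """Group detected stair-edge line heights (e.g. in metres) into clusters.

    Heights closer than `gap` are merged into one cluster; the returned cluster
    means approximate one value per step, so the step height follows as the
    difference between adjacent cluster means.
    """
    clusters = []
    for y in sorted(heights):
        if clusters and y - clusters[-1][-1] <= gap:
            clusters[-1].append(y)  # same step as the previous line
        else:
            clusters.append([y])    # start a new step
    return [sum(c) / len(c) for c in clusters]

# Equal gating scores give a 50/50 blend of the two feature vectors.
fused = selective_fuse([1.0, 2.0], [3.0, 4.0], rgb_score=0.0, depth_score=0.0)
# Five noisy line detections collapse into three step heights.
steps = cluster_lines([0.00, 0.01, 0.18, 0.19, 0.36])
```

With equal scores the fusion reduces to an elementwise average; the real module would instead learn the scores from the input, which is what lets it trade off the two modalities scene by scene.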
Related papers
- Exploring Deep Learning Image Super-Resolution for Iris Recognition [50.43429968821899]
We propose the use of two deep learning single-image super-resolution approaches: Stacked Auto-Encoders (SAE) and Convolutional Neural Networks (CNN)
We validate the methods with a database of 1,872 near-infrared iris images, with quality assessment and recognition experiments showing the superiority of the deep learning approaches over the compared algorithms.
arXiv Detail & Related papers (2023-11-02T13:57:48Z)
- Multi-Modal Multi-Task (3MT) Road Segmentation [0.8287206589886879]
We focus on using raw sensor inputs rather than, as is typical in many SOTA works, architectures that require costly pre-processing.
This study presents a cost-effective and highly accurate solution for road segmentation by integrating data from multiple sensors within a multi-task learning architecture.
arXiv Detail & Related papers (2023-08-23T08:15:15Z)
- StairNetV3: Depth-aware Stair Modeling using Deep Learning [6.145334325463317]
Vision-based stair perception can help autonomous mobile robots deal with the challenge of climbing stairs.
Current monocular vision methods struggle to model stairs accurately without depth information.
This paper proposes a depth-aware stair modeling method for monocular vision.
arXiv Detail & Related papers (2023-08-13T08:11:40Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- BS3D: Building-scale 3D Reconstruction from RGB-D Images [25.604775584883413]
We propose an easy-to-use framework for acquiring building-scale 3D reconstruction using a consumer depth camera.
Unlike complex and expensive acquisition setups, our system enables crowd-sourcing, which can greatly benefit data-hungry algorithms.
arXiv Detail & Related papers (2023-01-03T11:46:14Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability in some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- Automatic extraction of road intersection points from USGS historical map series using deep convolutional neural networks [0.0]
Road intersection data have been used across various geospatial applications and analyses.
We employ the standard paradigm of using a deep convolutional neural network for object detection, namely the region-based CNN (R-CNN).
Compared with most traditional computer vision algorithms, R-CNN also provides more accurate extraction.
arXiv Detail & Related papers (2020-07-14T23:51:15Z) - UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional
Variational Autoencoders [81.5490760424213]
We propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network.
arXiv Detail & Related papers (2020-04-13T04:12:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.