StairNetV3: Depth-aware Stair Modeling using Deep Learning
- URL: http://arxiv.org/abs/2308.06715v1
- Date: Sun, 13 Aug 2023 08:11:40 GMT
- Title: StairNetV3: Depth-aware Stair Modeling using Deep Learning
- Authors: Chen Wang, Zhongcai Pei, Shuang Qiu, Yachun Wang, Zhiyong Tang
- Abstract summary: Vision-based stair perception can help autonomous mobile robots deal with the challenge of climbing stairs.
Current monocular vision methods struggle to model stairs accurately without depth information.
This paper proposes a depth-aware stair modeling method for monocular vision.
- Score: 6.145334325463317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-based stair perception can help autonomous mobile robots deal with the
challenge of climbing stairs, especially in unfamiliar environments. To address
the difficulty that current monocular vision methods face in modeling stairs
accurately without depth information, this paper proposes a depth-aware stair
modeling method for monocular vision. Specifically, we treat the extraction of
stair geometric features and the prediction of depth images as joint tasks in a
convolutional neural network (CNN); with the designed information propagation
architecture, depth information provides effective supervision for stair
geometric feature learning. In addition, to complete the stair modeling, we
take the convex lines, concave lines, tread surfaces, and riser surfaces as
stair geometric features and apply Gaussian kernels to enable the network to
predict contextual information within the stair lines. Combined with the depth
information obtained by depth sensors, we propose a stair point cloud
reconstruction method that can quickly obtain the point clouds belonging to the
stair step surfaces. Experiments on our dataset show that our method
significantly improves on the previous best monocular vision method, with an
intersection over union (IoU) increase of 3.4%, and the lightweight version
achieves a fast detection speed that can meet the requirements of most
real-time applications. Our dataset is available at
https://data.mendeley.com/datasets/6kffmjt7g2/1.
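Two of the geometric operations named in the abstract, Gaussian-smoothed supervision around stair lines and back-projection of masked depth pixels into a per-step point cloud, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, line endpoints, and pinhole intrinsics (fx, fy, cx, cy) are assumptions.

```python
import numpy as np

def line_heatmap(h, w, p0, p1, sigma=2.0):
    """Gaussian heatmap around the segment p0->p1 (pixel coords, x-y order).

    Pixels close to the stair line get values near 1, giving the network a
    soft, context-rich target instead of a one-pixel-wide label."""
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs, ys], axis=-1).astype(np.float64)
    p0 = np.asarray(p0, dtype=np.float64)
    d = np.asarray(p1, dtype=np.float64) - p0
    # Project every pixel onto the segment, clamping to its endpoints.
    t = np.clip(((pts - p0) @ d) / (d @ d), 0.0, 1.0)
    dist2 = ((pts - (p0 + t[..., None] * d)) ** 2).sum(axis=-1)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into 3D points (camera frame) with
    the standard pinhole model; applied to one predicted tread or riser
    mask, this yields that stair surface's point cloud."""
    v, u = np.nonzero(mask)          # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

Running `line_heatmap` per annotated convex or concave line and `backproject_mask` per predicted step-surface mask produces the two intermediate products the abstract names: soft line targets and per-step point clouds.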
Related papers
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [118.74385965694694]
We present Motion DUSt3R (MonST3R), a novel geometry-first approach that directly estimates per-timestep geometry from dynamic scenes.
By simply estimating a pointmap for each timestep, we can effectively adapt DUSt3R's representation, previously only used for static scenes, to dynamic scenes.
We show that by posing the problem as a fine-tuning task, identifying several suitable datasets, and strategically training the model on this limited data, we can surprisingly enable the model to handle dynamics.
arXiv Detail & Related papers (2024-10-04T18:00:07Z)
- StairNet: Visual Recognition of Stairs for Human-Robot Locomotion [2.3811618212533663]
StairNet is an initiative to support the development of new deep learning models for visual sensing and recognition of stairs.
We present an overview of the development of our large-scale dataset with over 515,000 manually labeled images.
We show that StairNet can be an effective platform to develop and study new visual perception systems for human-robot locomotion.
arXiv Detail & Related papers (2023-10-31T17:30:57Z)
- Semi-supervised Learning from Street-View Images and OpenStreetMap for Automatic Building Height Estimation [59.6553058160943]
We propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OpenStreetMap data.
The proposed method leads to a clear performance boosting in estimating building heights with a Mean Absolute Error (MAE) around 2.1 meters.
The preliminary result is promising and motivates our future work in scaling up the proposed method based on low-cost VGI data.
arXiv Detail & Related papers (2023-07-05T18:16:30Z)
- RGB-D based Stair Detection using Deep Learning for Autonomous Stair Climbing [6.362951673024623]
We propose a neural network architecture with inputs of both RGB map and depth map.
Specifically, we design the selective module which can make the network learn the complementary relationship between RGB map and depth map.
Experiments on our dataset show that our method can achieve better accuracy and recall compared with the previous state-of-the-art deep learning method.
arXiv Detail & Related papers (2022-12-02T11:22:52Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves the state-of-the-art performance, especially when compared in the case of using only a few propagation steps.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- Visual Attention-based Self-supervised Absolute Depth Estimation using Geometric Priors in Autonomous Driving [8.045833295463094]
We introduce a fully Visual Attention-based Depth (VADepth) network, where spatial attention and channel attention are applied to all stages.
By continuously extracting the dependencies of features along the spatial and channel dimensions over a long distance, VADepth network can effectively preserve important details.
Experimental results on the KITTI dataset show that this architecture achieves the state-of-the-art performance.
arXiv Detail & Related papers (2022-05-18T08:01:38Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- Deep Leaning-Based Ultra-Fast Stair Detection [6.362951673024623]
We propose an end-to-end method for stair line detection based on deep learning.
In experiments, our method can achieve high performance in terms of both speed and accuracy.
A lightweight version can even achieve 300+ frames per second with the same resolution.
arXiv Detail & Related papers (2022-01-14T02:05:01Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Height estimation from single aerial images using a deep ordinal regression network [12.991266182762597]
We deal with the ambiguous and unsolved problem of height estimation from a single aerial image.
Driven by the success of deep learning, especially deep convolutional neural networks (CNNs), some studies have proposed estimating height information from a single aerial image.
In this paper, we propose dividing height values into spacing-increasing intervals and transforming the regression problem into an ordinal regression problem.
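The spacing-increasing intervals can be made concrete with log-uniform bin edges, as in the standard ordinal depth-regression formulation (e.g. DORN). The helper names and the geometric spacing rule below are assumptions for illustration, not necessarily this paper's exact scheme.

```python
import numpy as np

def sid_thresholds(lo, hi, k):
    """Spacing-increasing discretization: k intervals between lo and hi
    whose edges are uniform in log-space, so bin widths grow with height."""
    i = np.arange(k + 1)
    return np.exp(np.log(lo) + i * (np.log(hi) - np.log(lo)) / k)

def height_to_ordinal(h, edges):
    """Map a continuous height to the index of its ordinal bin."""
    return int(np.clip(np.searchsorted(edges, h, side="right") - 1,
                       0, len(edges) - 2))
```

Larger heights fall into wider bins, so the ordinal loss tolerates proportionally larger errors far from the ground, which is the usual motivation for this discretization.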
arXiv Detail & Related papers (2020-06-04T12:03:51Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.