Indoor Navigation Assistance for Visually Impaired People via Dynamic
SLAM and Panoptic Segmentation with an RGB-D Sensor
- URL: http://arxiv.org/abs/2204.01154v1
- Date: Sun, 3 Apr 2022 20:19:15 GMT
- Title: Indoor Navigation Assistance for Visually Impaired People via Dynamic
SLAM and Panoptic Segmentation with an RGB-D Sensor
- Authors: Wenyan Ou, Jiaming Zhang, Kunyu Peng, Kailun Yang, Gerhard Jaworek,
Karin Müller, Rainer Stiefelhagen
- Abstract summary: We propose an assistive system with an RGB-D sensor to detect dynamic information of a scene.
With sparse feature points extracted from images, the user's pose can be estimated.
The poses and speeds of tracked dynamic objects can then be estimated and conveyed to the user through acoustic feedback.
- Score: 25.36354262588248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploring an unfamiliar indoor environment and avoiding obstacles is
challenging for visually impaired people. Currently, several approaches achieve
the avoidance of static obstacles based on the mapping of indoor scenes. To
solve the issue of distinguishing dynamic obstacles, we propose an assistive
system with an RGB-D sensor to detect dynamic information of a scene. Once the
system captures an image, panoptic segmentation is performed to obtain the
prior dynamic object information. With sparse feature points extracted from
the images and the depth information, the pose of the user can be estimated.
After ego-motion estimation, dynamic objects can be identified and tracked.
The poses and speeds of the tracked dynamic objects are then estimated and
conveyed to the user through acoustic feedback.
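The pipeline in the abstract (mask out prior-dynamic features, estimate ego-motion from the static remainder, then report speeds of tracked objects) can be sketched in a few lines. The function names and the fixed dynamic-class list below are illustrative assumptions, not the authors' implementation.

```python
import math

# Prior-dynamic classes assumed to come from panoptic segmentation; illustrative only.
DYNAMIC_CLASSES = {"person", "dog", "car"}

def filter_static_features(features, labels):
    """Drop feature points that land on prior-dynamic panoptic classes,
    so ego-motion is estimated from static scene structure only."""
    return [f for f, lbl in zip(features, labels) if lbl not in DYNAMIC_CLASSES]

def object_speed(centroid_prev, centroid_curr, dt):
    """Speed (m/s) of a tracked object from two 3D centroids (metres)
    observed dt seconds apart, assuming ego-motion is already compensated."""
    return math.dist(centroid_prev, centroid_curr) / dt

# A point on a wall is kept, a point on a person is discarded:
static = filter_static_features([(10, 20), (30, 40)], ["wall", "person"])
# A person advancing 0.6 m between frames captured 0.5 s apart moves at 1.2 m/s.
speed = object_speed((1.0, 0.0, 2.0), (1.0, 0.0, 2.6), 0.5)
```

The speed estimate is what would be rendered as acoustic feedback; the real system additionally needs the tracker that associates object detections across frames.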
Related papers
- DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments [28.23284296418962]
Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments.
Existing datasets for developing ZSON algorithms lack consideration of dynamic obstacles, object diversity, and scene texts.
We propose DOZE, a dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments.
DOZE comprises ten high-fidelity 3D scenes with over 18k tasks, aiming to mimic complex, dynamic real-world scenarios.
arXiv Detail & Related papers (2024-02-29T10:03:57Z) - Floor extraction and door detection for visually impaired guidance [78.94595951597344]
Finding obstacle-free paths in unknown environments is a major navigation challenge for visually impaired people and autonomous robots.
New devices based on computer vision can help visually impaired people navigate unknown environments safely.
This work proposes a combination of sensors and algorithms for building a navigation system for visually impaired people.
arXiv Detail & Related papers (2024-01-30T14:38:43Z) - LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry [52.131996528655094]
We present the Long-term Effective Any Point Tracking (LEAP) module.
LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation.
Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes.
arXiv Detail & Related papers (2024-01-03T18:57:27Z) - DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object
Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z) - HIDA: Towards Holistic Indoor Understanding for the Visually Impaired
via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor [25.206941504935685]
HIDA is a lightweight assistive system based on 3D point cloud instance segmentation with a solid-state LiDAR sensor.
Our entire system consists of three hardware components, two interactive functions (obstacle avoidance and object finding), and a voice user interface.
The proposed 3D instance segmentation model achieves state-of-the-art performance on the ScanNet v2 dataset.
arXiv Detail & Related papers (2021-07-07T12:23:53Z) - ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object
Removal for Static 3D Point Cloud Map Building [0.1474723404975345]
This paper presents a novel static map building method called ERASOR, Egocentric RAtio of pSeudo Occupancy-based dynamic object Removal.
Our approach exploits the fact that most dynamic objects in urban environments are inevitably in contact with the ground.
arXiv Detail & Related papers (2021-03-07T10:29:07Z) - Event-based Motion Segmentation with Spatio-Temporal Graph Cuts [51.17064599766138]
We have developed a method to identify independently moving objects captured with an event-based camera.
The method performs on par or better than the state of the art without having to predetermine the number of expected moving objects.
arXiv Detail & Related papers (2020-12-16T04:06:02Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatial and temporal information.
We show that the proposed method achieves superior performance to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - Attentional Separation-and-Aggregation Network for Self-supervised
Depth-Pose Learning in Dynamic Scenes [19.704284616226552]
Learning depth and ego-motion from unlabeled videos via self-supervision from epipolar projection can improve the robustness and accuracy of the 3D perception and localization of vision-based robots.
However, the rigid projection computed by ego-motion cannot represent all scene points, such as points on moving objects, leading to false guidance in these regions.
We propose an Attentional Separation-and-Aggregation Network (ASANet) which can learn to distinguish and extract the scene's static and dynamic characteristics via the attention mechanism.
arXiv Detail & Related papers (2020-11-18T16:07:30Z) - DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, with the estimated camera motion, tracks them by minimizing the photometric reprojection error.
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
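A toy version of DOT's photometric test can be written directly: warp the masked pixels by a candidate flow and measure the residual. The NumPy sketch below is an assumption-laden stand-in (grayscale images, integer pixel flow), not DOT's actual dense optimisation.

```python
import numpy as np

def photometric_error(img_ref, img_cur, mask, flow):
    """Mean absolute intensity residual of the masked pixels of img_ref
    warped by an integer flow (dy, dx) into img_cur. An object whose
    residual stays high under the camera-induced flow is likely moving."""
    dy, dx = flow
    ys, xs = np.nonzero(mask)
    warped = img_cur[ys + dy, xs + dx].astype(float)
    return float(np.abs(warped - img_ref[ys, xs].astype(float)).mean())

# A bright blob that shifted one pixel to the right: zero error under the
# true flow, large error under the identity (static-scene) flow.
ref = np.zeros((5, 5)); ref[2, 2] = 255.0
cur = np.zeros((5, 5)); cur[2, 3] = 255.0
mask = np.zeros((5, 5), dtype=bool); mask[2, 2] = True
```

Minimising this residual over candidate motions, rather than evaluating two fixed flows, is the actual tracking step.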
arXiv Detail & Related papers (2020-09-30T18:36:28Z) - Moving object detection for visual odometry in a dynamic environment
based on occlusion accumulation [31.143322364794894]
We propose a moving object detection algorithm that uses RGB-D images.
The proposed algorithm does not require estimating a background model.
We use dense visual odometry (DVO) as a VO method with a bi-square regression weight.
arXiv Detail & Related papers (2020-09-18T11:01:46Z)
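The bi-square (Tukey) weight mentioned in the last entry has a standard closed form. The sketch below uses the common tuning constant c = 4.685, which is an assumption; the paper may use a different value.

```python
def tukey_biweight(r, c=4.685):
    """Tukey bi-square weight for a residual r: near 1 for small
    residuals, falling smoothly to 0, and exactly 0 for |r| >= c,
    so outliers (e.g. pixels on moving objects) stop influencing
    the dense visual-odometry fit."""
    if abs(r) >= c:
        return 0.0
    u = (r / c) ** 2
    return (1.0 - u) ** 2

# An inlier residual keeps almost full weight; a gross outlier is rejected.
w_in, w_out = tukey_biweight(0.5), tukey_biweight(10.0)
```

The hard zero beyond c is what distinguishes the bi-square from softer robust weights such as Huber's, which only down-weights outliers.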
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.