NaVIP: An Image-Centric Indoor Navigation Solution for Visually Impaired People
- URL: http://arxiv.org/abs/2410.18109v1
- Date: Tue, 08 Oct 2024 21:16:50 GMT
- Title: NaVIP: An Image-Centric Indoor Navigation Solution for Visually Impaired People
- Authors: Jun Yu, Yifan Zhang, Badrinadh Aila, Vinod Namboodiri
- Abstract summary: NaVIP aims to create an image-centric indoor navigation and exploration solution for inclusiveness.
We start by curating large-scale phone camera data in a four-floor research building, with 300K images.
Every image is labelled with precise 6DoF camera poses, details of indoor PoIs, and descriptive captions.
- Abstract: Indoor navigation is challenging due to the absence of satellite positioning. This challenge is manifold greater for Visually Impaired People (VIPs), who cannot obtain information from wayfinding signage. Other sensor signals (e.g., Bluetooth and LiDAR) can be used to create turn-by-turn navigation solutions with position updates for users. Unfortunately, these solutions require tags to be installed throughout the environment or the use of fairly expensive hardware. Moreover, they require a high degree of manual involvement that raises costs and thus hampers scalability. We propose an image dataset and an associated image-centric solution, called NaVIP, towards visual intelligence that is infrastructure-free and task-scalable and can assist VIPs in understanding their surroundings. Specifically, we start by curating large-scale phone camera data in a four-floor research building, with 300K images, to lay the foundation for an image-centric indoor navigation and exploration solution for inclusiveness. Every image is labelled with a precise 6DoF camera pose, details of indoor PoIs, and a descriptive caption to assist VIPs. We benchmark two main aspects, 1) the positioning system and 2) exploration support, prioritizing training scalability and real-time inference, to validate the prospect of an image-based solution for indoor navigation. The dataset, code, and model checkpoints are made publicly available at https://github.com/junfish/VIP_Navi.
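To make the labelling concrete, the following is a minimal sketch of how one per-image record (6DoF pose, PoI references, caption) could be organized and used for simple guidance. The field names, coordinate conventions, and helper function are illustrative assumptions, not the dataset's actual schema; see the repository linked above for the real format.

```python
# Minimal sketch of a NaVIP-style per-image label record (illustrative only).
from dataclasses import dataclass
from typing import List, Tuple
import math

@dataclass
class ImageLabel:
    image_path: str                                     # phone camera frame
    position_xyz: Tuple[float, float, float]            # 6DoF pose: translation in a building frame (meters)
    rotation_wxyz: Tuple[float, float, float, float]    # 6DoF pose: orientation as a unit quaternion
    floor: int                                          # which of the four floors the frame was taken on
    poi_names: List[str]                                # nearby indoor points of interest
    caption: str                                        # descriptive caption to assist VIPs

def bearing_to_poi(label: ImageLabel, poi_xyz: Tuple[float, float, float]) -> float:
    """Bearing (degrees) from the camera position to a PoI in the floor plane of the
    building frame; a guidance layer could compare it with the camera yaw to produce
    turn instructions."""
    dx = poi_xyz[0] - label.position_xyz[0]
    dy = poi_xyz[1] - label.position_xyz[1]
    return math.degrees(math.atan2(dy, dx))

# Example usage with made-up values:
label = ImageLabel(
    image_path="floor2/frame_000123.jpg",
    position_xyz=(12.4, 3.1, 0.0),
    rotation_wxyz=(1.0, 0.0, 0.0, 0.0),
    floor=2,
    poi_names=["Elevator Lobby"],
    caption="Hallway with an elevator lobby ahead on the left.",
)
print(f"Bearing to PoI: {bearing_to_poi(label, (15.0, 8.0, 0.0)):.1f} degrees")
```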
Related papers
- Floor extraction and door detection for visually impaired guidance [78.94595951597344]
Finding obstacle-free paths in unknown environments is a major navigation challenge for visually impaired people and autonomous robots.
New devices based on computer vision systems can help visually impaired people overcome the difficulties of navigating safely in unknown environments.
This work proposes a combination of sensors and algorithms that can lead to a navigation system for visually impaired people.
arXiv Detail & Related papers (2024-01-30T14:38:43Z) - Long-term Visual Localization with Mobile Sensors [30.839849072256435]
We propose to leverage additional sensors on a mobile phone, mainly GPS, compass, and gravity sensors, to solve this challenging long-term localization problem (a sketch of assembling such a sensor-based pose prior appears after this list).
With the initial pose, we are also able to devise a direct 2D-3D matching network to efficiently establish 2D-3D correspondences.
We benchmark our method as well as several state-of-the-art baselines and demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2023-04-16T04:35:10Z) - DRISHTI: Visual Navigation Assistant for Visually Impaired [0.0]
Blind and visually impaired (BVI) people face challenges because they need manual support to obtain information about their environment.
In this work, we took our first step towards developing an affordable and high-performing eye wearable assistive device, DRISHTI.
arXiv Detail & Related papers (2023-03-13T20:10:44Z) - Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts,
Datasets and Metrics [77.34726150561087]
This work presents a study of the current state of camera- and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z) - Detect and Approach: Close-Range Navigation Support for People with
Blindness and Low Vision [13.478275180547925]
People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments.
We develop a novel wearable navigation solution to provide real-time guidance for a user to approach a target object of interest efficiently and effectively in unfamiliar environments.
arXiv Detail & Related papers (2022-08-17T18:38:20Z) - VPAIR -- Aerial Visual Place Recognition and Localization in Large-scale
Outdoor Environments [49.82314641876602]
We present a new dataset named VPAIR.
The dataset was recorded on board a light aircraft flying at an altitude of more than 300 meters above ground.
The dataset covers a trajectory more than one hundred kilometers long over various types of challenging landscapes.
arXiv Detail & Related papers (2022-05-23T18:50:08Z) - Augmented reality navigation system for visual prosthesis [67.09251544230744]
We propose an augmented reality navigation system for visual prostheses that incorporates software for reactive navigation and path planning.
It consists of four steps: locating the subject on a map, planning the subject's trajectory, showing it to the subject, and re-planning around obstacles.
Results show how our augmented navigation system improves navigation performance by reducing the time and distance needed to reach goals, and even significantly reduces the number of obstacle collisions.
arXiv Detail & Related papers (2021-09-30T09:41:40Z) - Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
"Embodied visual navigation" problem requires an agent to navigate in a 3D environment mainly rely on its first-person observation.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z) - Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-language navigation (VLN) is the task in which an agent carries out navigational instructions inside photo-realistic environments.
One of the key challenges in VLN is how to conduct robust navigation by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment.
This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
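Relating to the mobile-sensor localization entry above (Long-term Visual Localization with Mobile Sensors), the sketch below shows one way a coarse initial camera pose could be assembled from phone GPS, compass, and gravity readings to seed 2D-3D matching or image retrieval. The local-ENU approximation, device-frame conventions, and function names are illustrative assumptions, not that paper's exact method.

```python
# Minimal sketch of deriving a coarse pose prior from phone sensors (illustrative only).
import math

EARTH_RADIUS_M = 6_371_000.0

def gps_to_local_enu(lat, lon, ref_lat, ref_lon):
    """Approximate east/north offsets (meters) from a reference point, valid for the
    small extents typical of a single building or campus."""
    east = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    north = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return east, north

def attitude_from_sensors(compass_heading_deg, gravity_xyz):
    """Coarse attitude: yaw (clockwise from north) from the compass; camera pitch above
    the horizon and roll about the optical axis from the gravity vector, assuming the
    Android device frame (x right, y toward the top of the screen, z out of the screen)
    and a rear camera looking along -z."""
    gx, gy, gz = gravity_xyz
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    yaw = compass_heading_deg
    pitch = math.degrees(math.asin(-gz / norm))   # 0 deg when the camera faces the horizon
    roll = math.degrees(math.atan2(gx, gy))       # 0 deg when the phone is held upright
    return yaw, pitch, roll

# Example usage with made-up readings; a prior like this can limit the search space
# before visual localization refines the 6DoF pose.
east, north = gps_to_local_enu(40.4445, -79.9532, 40.4440, -79.9540)
yaw, pitch, roll = attitude_from_sensors(135.0, (0.3, 9.7, -1.2))
print(f"prior: east={east:.1f} m, north={north:.1f} m, "
      f"yaw={yaw:.0f} deg, pitch={pitch:.0f} deg, roll={roll:.0f} deg")
```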