Newvision: application for helping blind people using deep learning
- URL: http://arxiv.org/abs/2311.03395v1
- Date: Sun, 5 Nov 2023 06:23:10 GMT
- Title: Newvision: application for helping blind people using deep learning
- Authors: Kumar Srinivas Bobba, Kartheeban K, Vamsi Krishna Sai Boddu, Vijaya
Mani Surendra Bolla, Dinesh Bugga
- Abstract summary: We are developing proprietary headgear that will help visually impaired people navigate their surroundings.
The headgear will use a combination of computer vision, distance estimation with ultrasonic sensors, voice recognition, and voice assistants.
Users will be able to interact with the headgear through voice commands, such as "What is that?" to identify an object.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As able-bodied people, we often take our vision for granted. For people who
are visually impaired, however, their disability can have a significant impact
on their daily lives. We are developing proprietary headgear that will help
visually impaired people navigate their surroundings, identify objects and
people, read text, and avoid obstacles. The headgear will use a combination of
computer vision, distance estimation with ultrasonic sensors, voice
recognition, and voice assistants to provide users with real-time information
about their environment. Users will be able to interact with the headgear
through voice commands, such as "What is that?" to identify an object or
"Navigate to the front door" to find their way around. The headgear will then
provide the user with a verbal description of the object or spoken navigation
instructions. We believe that this headgear has the potential to make a
significant difference in the lives of visually impaired people, allowing them
to live more independently and participate more fully in society.
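To ground the distance-estimation step the abstract mentions, here is a minimal sketch of how an ultrasonic echo time converts into an obstacle warning; the sensor interface, the speed-of-sound constant, and the one-metre threshold are illustrative assumptions, not details from the paper.

```python
# Minimal sketch: convert an ultrasonic echo round trip into a spoken-style
# obstacle warning. Sensor wiring and thresholds are assumed, not from the paper.

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at 20 degrees C


def echo_to_distance_m(echo_pulse_s: float) -> float:
    """Convert an echo round-trip time (seconds) into a one-way distance (metres).

    The pulse travels to the obstacle and back, so the distance is half
    the round trip multiplied by the speed of sound.
    """
    return echo_pulse_s * SPEED_OF_SOUND_M_S / 2.0


def obstacle_warning(echo_pulse_s: float, threshold_m: float = 1.0) -> str | None:
    """Return a warning phrase when an obstacle is closer than threshold_m."""
    distance = echo_to_distance_m(echo_pulse_s)
    if distance < threshold_m:
        return f"Obstacle ahead, about {distance:.1f} metres away."
    return None


if __name__ == "__main__":
    # A 5 ms round trip corresponds to roughly 0.86 m.
    print(obstacle_warning(0.005))
```

A real headgear would feed the returned phrase to a text-to-speech engine rather than printing it.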
Related papers
- Real-Time Pill Identification for the Visually Impaired Using Deep Learning [31.747327310138314]
This paper explores the development and implementation of a deep learning-based mobile application designed to assist blind and visually impaired individuals in real-time pill identification.
The application aims to accurately recognize and differentiate between various pill types through real-time image processing on mobile devices.
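The summary describes the familiar pattern of classifying camera frames with a convolutional network. Below is a hedged sketch of that pattern; the MobileNetV3 backbone and the pill labels are placeholders, since the paper's actual architecture and classes are not given here.

```python
# Illustrative real-time pill classification step; the backbone and labels
# are stand-ins, not the paper's model, and the weights here are untrained.
import torch
from PIL import Image
from torchvision import models, transforms

PILL_CLASSES = ["ibuprofen_200mg", "aspirin_81mg", "unknown"]  # placeholder labels

model = models.mobilenet_v3_small(num_classes=len(PILL_CLASSES))
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])


def identify_pill(frame: Image.Image) -> str:
    """Classify one camera frame and return the predicted pill label."""
    batch = preprocess(frame).unsqueeze(0)  # shape (1, 3, 224, 224)
    with torch.no_grad():
        logits = model(batch)
    return PILL_CLASSES[logits.argmax(dim=1).item()]
```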
arXiv Detail & Related papers (2024-05-08T03:18:46Z)
- Improve accessibility for Low Vision and Blind people using Machine Learning and Computer Vision [0.0]
This project explores how machine learning and computer vision could be utilized to improve accessibility for people with visual impairments.
This project will concentrate on building a mobile application that helps blind people to orient in space by receiving audio and haptic feedback.
arXiv Detail & Related papers (2024-03-24T21:19:17Z)
- Floor extraction and door detection for visually impaired guidance [78.94595951597344]
Finding obstacle-free paths in unknown environments is a major navigation challenge for visually impaired people and autonomous robots.
New devices based on computer vision can help visually impaired people navigate unknown environments safely.
This work proposes a combination of sensors and algorithms that can lead to a navigation system for visually impaired people.
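As a toy illustration of the floor-extraction idea, one can flood-fill from the bottom of the frame, assuming the floor is the roughly uniform region nearest the camera; this simplification is my assumption for illustration, not the method proposed in the paper.

```python
# Toy floor extraction: flood-fill from the bottom-centre pixel, assuming the
# floor is a roughly uniform region there. Not the paper's actual algorithm.
import cv2
import numpy as np


def extract_floor_mask(frame_bgr: np.ndarray, tol: int = 12) -> np.ndarray:
    """Return a 0/255 mask of the region connected to the bottom-centre pixel."""
    h, w = frame_bgr.shape[:2]
    seed = (w // 2, h - 1)                     # (x, y) on the bottom row
    mask = np.zeros((h + 2, w + 2), np.uint8)  # floodFill needs a 2 px border
    flags = 4 | cv2.FLOODFILL_MASK_ONLY | (255 << 8)  # fill mask only, value 255
    cv2.floodFill(frame_bgr, mask, seed, (0, 0, 0),
                  (tol, tol, tol), (tol, tol, tol), flags)
    return mask[1:-1, 1:-1]                    # crop the border back off


if __name__ == "__main__":
    frame = cv2.imread("hallway.jpg")          # hypothetical input image
    if frame is not None:
        floor = extract_floor_mask(frame)
        print("floor pixels:", int(np.count_nonzero(floor)))
```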
arXiv Detail & Related papers (2024-01-30T14:38:43Z)
- MagicEye: An Intelligent Wearable Towards Independent Living of Visually Impaired [0.17499351967216337]
Vision impairment can severely limit a person's ability to work, navigate, and retain independence.
We present MagicEye, a state-of-the-art intelligent wearable device designed to assist visually impaired individuals.
With a total of 35 classes, the neural network employed by MagicEye has been specifically designed to achieve high levels of efficiency and precision in object detection.
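MagicEye's 35-class network itself is not given here, so the following hedged sketch uses a generic pretrained torchvision detector to show the detection step such a wearable performs.

```python
# Generic object-detection step standing in for MagicEye's 35-class network,
# which is not public here; a pretrained torchvision detector is used instead.
import torch
from PIL import Image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()


def detect_objects(frame: Image.Image, min_score: float = 0.6):
    """Return (label_id, score, box) triples for confident detections."""
    with torch.no_grad():
        (pred,) = model([to_tensor(frame)])
    keep = pred["scores"] >= min_score
    return list(zip(pred["labels"][keep].tolist(),
                    pred["scores"][keep].tolist(),
                    pred["boxes"][keep].tolist()))
```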
arXiv Detail & Related papers (2023-03-24T08:59:35Z)
- Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning [62.83590925557013]
We learn a set of challenging partially-observed manipulation tasks from visual and audio inputs.
Our proposed system learns these tasks by combining offline imitation learning from tele-operated demonstrations and online finetuning.
In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by 20%.
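To make the offline-imitation stage concrete, here is a minimal behaviour-cloning sketch; the network size, the fused audio-visual feature dimension, and the random tensors standing in for tele-operated demonstrations are all assumptions, not the paper's setup.

```python
# Minimal behaviour cloning: regress demonstration actions from observations.
# Dimensions and random data are placeholders, not the paper's setup.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 64, 7  # assumed fused audio-visual feature / action sizes

policy = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(), nn.Linear(128, ACT_DIM))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

demo_obs = torch.randn(256, OBS_DIM)  # stand-ins for tele-operated demonstrations
demo_act = torch.randn(256, ACT_DIM)

for step in range(100):  # offline imitation: fit the policy to the demos
    loss = nn.functional.mse_loss(policy(demo_obs), demo_act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Online finetuning would continue these updates on data collected with
# human interventions, which is where the reported 20% gain comes from.
```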
arXiv Detail & Related papers (2022-05-30T04:52:58Z)
- Can machines learn to see without visual databases? [93.73109506642112]
This paper focuses on developing machines that learn to see without needing to handle visual databases.
This might open the doors to a truly competitive track concerning deep learning technologies for vision.
arXiv Detail & Related papers (2021-10-12T13:03:54Z)
- VisBuddy -- A Smart Wearable Assistant for the Visually Challenged [0.0]
VisBuddy is a voice-based assistant to which the user can give voice commands to perform specific tasks.
It combines image captioning to describe the user's surroundings, optical character recognition (OCR) to read text in the user's view, object detection to search for and find objects in a room, and web scraping to give the user the latest news.
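The OCR piece of such an assistant can be sketched with Tesseract; routing the extracted text to a text-to-speech engine is my assumption about the pipeline, shown here as a plain print.

```python
# Sketch of the OCR step of a VisBuddy-style assistant using Tesseract.
# Sending the result to a TTS engine is assumed; a print stands in for it.
from PIL import Image

import pytesseract  # requires the Tesseract binary to be installed


def read_text_in_view(image_path: str) -> str:
    """Extract text from an image of the user's view."""
    text = pytesseract.image_to_string(Image.open(image_path))
    print(text)  # a real assistant would pass this to text-to-speech
    return text
```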
arXiv Detail & Related papers (2021-08-17T17:15:23Z)
- Assisted Perception: Optimizing Observations to Communicate State [112.40598205054994]
We aim to help users estimate the state of the world in tasks like robotic teleoperation and navigation with visual impairments.
We synthesize new observations that lead to more accurate internal state estimates when processed by the user.
arXiv Detail & Related papers (2020-08-06T19:08:05Z)
- Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition? [63.564385139097624]
This work investigates visual self-supervision via face reconstruction to guide the learning of audio representations.
We show that a multi-task combination of the proposed visual and audio self-supervision is beneficial for learning richer features.
We evaluate our learned audio representations for discrete emotion recognition, continuous affect recognition and automatic speech recognition.
arXiv Detail & Related papers (2020-05-04T11:33:40Z)
- Vision and Language: from Visual Perception to Content Creation [100.36776435627962]
"vision to language" is probably one of the most popular topics in the past five years.
This paper reviews the recent advances along these two dimensions: "vision to language" and "language to vision"
arXiv Detail & Related papers (2019-12-26T14:07:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.