A Deep Learning based Wearable Healthcare IoT Device for AI-enabled
Hearing Assistance Automation
- URL: http://arxiv.org/abs/2005.08076v1
- Date: Sat, 16 May 2020 19:42:16 GMT
- Title: A Deep Learning based Wearable Healthcare IoT Device for AI-enabled
Hearing Assistance Automation
- Authors: Fraser Young, L Zhang, Richard Jiang, Han Liu and Conor Wall
- Abstract summary: This research presents a novel AI-enabled Internet of Things (IoT) device capable of assisting those who suffer from impairment of hearing or deafness to communicate with others in conversations.
A server application is created that leverages Google's online speech recognition service to convert the received conversations into texts, then deployed to a micro-display attached to the glasses to display the conversation contents to deaf people.
- Score: 6.283190933140046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the recent booming of artificial intelligence (AI), particularly deep
learning techniques, digital healthcare is one of the prevalent areas that
could gain benefits from AI-enabled functionality. This research presents a
novel AI-enabled Internet of Things (IoT) device operating from the ESP-8266
platform capable of assisting those who suffer from impairment of hearing or
deafness to communicate with others in conversations. In the proposed solution,
a server application is created that leverages Google's online speech
recognition service to convert the received conversations into texts, then
deployed to a micro-display attached to the glasses to display the conversation
contents to deaf people, to enable and assist conversation as normal with the
general population. Furthermore, in order to raise alert of traffic or
dangerous scenarios, an 'urban-emergency' classifier is developed using a deep
learning model, Inception-v4, with transfer learning to detect/recognize
alerting/alarming sounds, such as a horn sound or a fire alarm, with texts
generated to alert the prospective user. The training of Inception-v4 was
carried out on a consumer desktop PC and then implemented into the AI based IoT
application. The empirical results indicate that the developed prototype system
achieves an accuracy rate of 92% for sound recognition and classification with
real-time performance.
Related papers
- Automatic Speech Recognition for Hindi [0.6292138336765964]
The research involved developing a web application and designing a web interface for speech recognition.
The web application manages large volumes of audio files and their transcriptions, facilitating human correction of ASR transcripts.
The web interface for speech recognition records 16 kHz mono audio from any device running the web app, performs voice activity detection (VAD), and sends the audio to the recognition engine.
arXiv Detail & Related papers (2024-06-26T07:39:20Z) - AIris: An AI-powered Wearable Assistive Device for the Visually Impaired [0.0]
We introduce AIris, an AI-powered wearable device that provides environmental awareness and interaction capabilities to visually impaired users.
We have created a functional prototype system that operates effectively in real-world conditions.
arXiv Detail & Related papers (2024-05-13T10:09:37Z) - Mediapipe and CNNs for Real-Time ASL Gesture Recognition [0.1529342790344802]
This research paper describes a realtime system for identifying American Sign Language (ASL) movements.
The suggested method makes use of the Mediapipe library for feature extraction and a Convolutional Neural Network (CNN) for ASL gesture classification.
arXiv Detail & Related papers (2023-05-09T09:35:45Z) - Contextual-Utterance Training for Automatic Speech Recognition [65.4571135368178]
We propose a contextual-utterance training technique which makes use of the previous and future contextual utterances.
Also, we propose a dual-mode contextual-utterance training technique for streaming automatic speech recognition (ASR) systems.
The proposed technique is able to reduce both the WER and the average last token emission latency by more than 6% and 40ms relative.
arXiv Detail & Related papers (2022-10-27T08:10:44Z) - Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual
Imitation Learning [62.83590925557013]
We learn a set of challenging partially-observed manipulation tasks from visual and audio inputs.
Our proposed system learns these tasks by combining offline imitation learning from tele-operated demonstrations and online finetuning.
In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by 20%.
arXiv Detail & Related papers (2022-05-30T04:52:58Z) - Disappeared Command: Spoofing Attack On Automatic Speech Recognition
Systems with Sound Masking [2.9308762189250746]
Voice interfaces are becoming more and more widely used as input for many applications and smart devices.
DNN is easily disturbed by slight disturbances and makes false recognition, which is extremely dangerous for intelligent voice applications controlled by voice.
arXiv Detail & Related papers (2022-04-19T16:26:34Z) - Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches
for Automatic Speech Recognition Systems [0.0]
This work aims to present the process of building a dataset of noisy audios, in a specific case of degenerated audios due to interference.
We also present initial results of a classifier that uses such data for evaluation, indicating the benefits of using this dataset in the recognizer's training process.
arXiv Detail & Related papers (2021-10-04T13:08:53Z) - Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource
End-to-End Speech Recognition [62.94773371761236]
We consider building an effective end-to-end ASR system in low-resource setups with a high OOV rate.
We propose a method of dynamic acoustic unit augmentation based on the BPE-dropout technique.
Our monolingual Turkish Conformer established a competitive result with 22.2% character error rate (CER) and 38.9% word error rate (WER)
arXiv Detail & Related papers (2021-03-12T10:10:13Z) - Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keywords spotting and in particular Wake-Up-Word (WUW) detection is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims at increasing the recognition rate and reducing the false alarms in the presence of these types of noises.
arXiv Detail & Related papers (2021-01-29T18:44:05Z) - Speaker De-identification System using Autoencoders and Adversarial
Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increase the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z) - TinySpeech: Attention Condensers for Deep Speech Recognition Neural
Networks on Edge Devices [71.68436132514542]
We introduce the concept of attention condensers for building low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge.
To illustrate its efficacy, we introduce TinySpeech, low-precision deep neural networks tailored for on-device speech recognition.
arXiv Detail & Related papers (2020-08-10T16:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.