A Deep Learning based Wearable Healthcare IoT Device for AI-enabled
Hearing Assistance Automation
- URL: http://arxiv.org/abs/2005.08076v1
- Date: Sat, 16 May 2020 19:42:16 GMT
- Title: A Deep Learning based Wearable Healthcare IoT Device for AI-enabled
Hearing Assistance Automation
- Authors: Fraser Young, L Zhang, Richard Jiang, Han Liu and Conor Wall
- Abstract summary: This research presents a novel AI-enabled Internet of Things (IoT) device capable of assisting those who suffer from impairment of hearing or deafness to communicate with others in conversations.
A server application is created that leverages Google's online speech recognition service to convert the received conversations into texts, then deployed to a micro-display attached to the glasses to display the conversation contents to deaf people.
- Score: 6.283190933140046
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the recent booming of artificial intelligence (AI), particularly deep
learning techniques, digital healthcare is one of the prevalent areas that
could gain benefits from AI-enabled functionality. This research presents a
novel AI-enabled Internet of Things (IoT) device operating from the ESP-8266
platform capable of assisting those who suffer from impairment of hearing or
deafness to communicate with others in conversations. In the proposed solution,
a server application is created that leverages Google's online speech
recognition service to convert the received conversations into texts, then
deployed to a micro-display attached to the glasses to display the conversation
contents to deaf people, to enable and assist conversation as normal with the
general population. Furthermore, in order to raise alert of traffic or
dangerous scenarios, an 'urban-emergency' classifier is developed using a deep
learning model, Inception-v4, with transfer learning to detect/recognize
alerting/alarming sounds, such as a horn sound or a fire alarm, with texts
generated to alert the prospective user. The training of Inception-v4 was
carried out on a consumer desktop PC and then implemented into the AI based IoT
application. The empirical results indicate that the developed prototype system
achieves an accuracy rate of 92% for sound recognition and classification with
real-time performance.
Related papers
- SONAR: A Synthetic AI-Audio Detection Framework and Benchmark [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark.
It aims to provide a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content.
It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based deepfake detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z) - Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance.
We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information.
Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z) - MindSpeech: Continuous Imagined Speech Decoding using High-Density fNIRS and Prompt Tuning for Advanced Human-AI Interaction [0.0]
This paper reports a novel method for human-AI interaction by developing a direct brain-AI interface.
We discuss a novel AI model, called MindSpeech, which enables open-vocabulary, continuous decoding for imagined speech.
We demonstrate significant improvements in key metrics, such as BLEU-1 and BERT P scores, for three out of four participants.
arXiv Detail & Related papers (2024-07-25T16:39:21Z) - AIris: An AI-powered Wearable Assistive Device for the Visually Impaired [0.0]
We introduce AIris, an AI-powered wearable device that provides environmental awareness and interaction capabilities to visually impaired users.
We have created a functional prototype system that operates effectively in real-world conditions.
arXiv Detail & Related papers (2024-05-13T10:09:37Z) - Mediapipe and CNNs for Real-Time ASL Gesture Recognition [0.1529342790344802]
This research paper describes a realtime system for identifying American Sign Language (ASL) movements.
The suggested method makes use of the Mediapipe library for feature extraction and a Convolutional Neural Network (CNN) for ASL gesture classification.
arXiv Detail & Related papers (2023-05-09T09:35:45Z) - Contextual-Utterance Training for Automatic Speech Recognition [65.4571135368178]
We propose a contextual-utterance training technique which makes use of the previous and future contextual utterances.
Also, we propose a dual-mode contextual-utterance training technique for streaming automatic speech recognition (ASR) systems.
The proposed technique is able to reduce both the WER and the average last token emission latency by more than 6% and 40ms relative.
arXiv Detail & Related papers (2022-10-27T08:10:44Z) - Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual
Imitation Learning [62.83590925557013]
We learn a set of challenging partially-observed manipulation tasks from visual and audio inputs.
Our proposed system learns these tasks by combining offline imitation learning from tele-operated demonstrations and online finetuning.
In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by 20%.
arXiv Detail & Related papers (2022-05-30T04:52:58Z) - Disappeared Command: Spoofing Attack On Automatic Speech Recognition
Systems with Sound Masking [2.9308762189250746]
Voice interfaces are becoming more and more widely used as input for many applications and smart devices.
DNN is easily disturbed by slight disturbances and makes false recognition, which is extremely dangerous for intelligent voice applications controlled by voice.
arXiv Detail & Related papers (2022-04-19T16:26:34Z) - Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches
for Automatic Speech Recognition Systems [0.0]
This work aims to present the process of building a dataset of noisy audios, in a specific case of degenerated audios due to interference.
We also present initial results of a classifier that uses such data for evaluation, indicating the benefits of using this dataset in the recognizer's training process.
arXiv Detail & Related papers (2021-10-04T13:08:53Z) - Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keywords spotting and in particular Wake-Up-Word (WUW) detection is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims at increasing the recognition rate and reducing the false alarms in the presence of these types of noises.
arXiv Detail & Related papers (2021-01-29T18:44:05Z) - Speaker De-identification System using Autoencoders and Adversarial
Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increase the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.