Related papers: A Deep Learning based Wearable Healthcare IoT Device for AI-enabled Hearing Assistance Automation

URL: http://arxiv.org/abs/2005.08076v1
Date: Sat, 16 May 2020 19:42:16 GMT
Title: A Deep Learning based Wearable Healthcare IoT Device for AI-enabled Hearing Assistance Automation
Authors: Fraser Young, L Zhang, Richard Jiang, Han Liu and Conor Wall
Abstract summary: This research presents a novel AI-enabled Internet of Things (IoT) device that helps people with hearing impairment or deafness communicate with others in conversation. A server application uses Google's online speech recognition service to convert received conversations into text, which is then sent to a micro-display attached to the glasses so that deaf users can read the conversation.
Score: 6.283190933140046
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the recent boom in artificial intelligence (AI), particularly deep learning techniques, digital healthcare is one of the areas that stands to benefit from AI-enabled functionality. This research presents a novel AI-enabled Internet of Things (IoT) device, built on the ESP-8266 platform, that helps people with hearing impairment or deafness communicate with others in conversation. In the proposed solution, a server application uses Google's online speech recognition service to convert received conversations into text, which is then sent to a micro-display attached to the glasses so that deaf users can read the conversation and converse normally with the general population. Furthermore, to alert the user to traffic or other dangerous scenarios, an 'urban-emergency' classifier is developed using a deep learning model, Inception-v4, with transfer learning to recognize alarm sounds such as a horn or a fire alarm, and text is generated to alert the user. Inception-v4 was trained on a consumer desktop PC and then integrated into the AI-based IoT application. The empirical results indicate that the prototype system achieves 92% accuracy for sound recognition and classification with real-time performance.
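The abstract describes two text sources for the micro-display: transcribed speech and alerts from the 'urban-emergency' classifier. A minimal sketch of how a server might choose between them for one audio chunk is given below. It is not the authors' implementation: the function name, the label names, the alert messages, the 0.8 threshold, and the rule that alerts take priority over transcripts are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical label-to-message map; the abstract names only horn and
# fire-alarm sounds as examples of alerting sounds.
ALERT_MESSAGES = {
    "horn": "Warning: vehicle horn nearby",
    "fire_alarm": "Warning: fire alarm",
}
ALERT_THRESHOLD = 0.8  # assumed value, not taken from the paper


def choose_display_text(class_probs: Dict[str, float],
                        transcript: Optional[str]) -> str:
    """Pick the text to show on the micro-display for one audio chunk.

    class_probs: classifier output as {label: probability} (assumed shape).
    transcript: text returned by the speech recognition service, if any.
    """
    if class_probs:
        # Take the most probable class reported by the classifier.
        label, prob = max(class_probs.items(), key=lambda kv: kv[1])
        # Assumption: a confident alert overrides the conversation text.
        if label in ALERT_MESSAGES and prob >= ALERT_THRESHOLD:
            return ALERT_MESSAGES[label]
    # Otherwise fall back to the transcript (empty string if none).
    return transcript or ""
```

For example, `choose_display_text({"horn": 0.93, "speech": 0.07}, "hello")` would return the horn warning, while a low-confidence horn score would let the transcript through.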

Related papers

AI-based Wearable Vision Assistance System for the Visually Impaired: Integrating Real-Time Object Recognition and Contextual Understanding Using Large Vision-Language Models [0.0]
This paper introduces a novel wearable vision assistance system with artificial intelligence (AI) technology to deliver real-time feedback to a user through a sound beep mechanism. The system provides detailed descriptions of objects in the user's environment using a large vision language model (LVLM).
arXiv Detail & Related papers (2024-12-28T07:26:39Z)
SONAR: A Synthetic AI-Audio Detection Framework and Benchmark [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark. It aims to provide a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content. It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based deepfake detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z)
Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance. We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information. Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
MindSpeech: Continuous Imagined Speech Decoding using High-Density fNIRS and Prompt Tuning for Advanced Human-AI Interaction [0.0]
This paper reports a novel method for human-AI interaction by developing a direct brain-AI interface. We discuss a novel AI model, called MindSpeech, which enables open-vocabulary, continuous decoding for imagined speech. We demonstrate significant improvements in key metrics, such as BLEU-1 and BERT P scores, for three out of four participants.
arXiv Detail & Related papers (2024-07-25T16:39:21Z)
AIris: An AI-powered Wearable Assistive Device for the Visually Impaired [0.0]
We introduce AIris, an AI-powered wearable device that provides environmental awareness and interaction capabilities to visually impaired users. We have created a functional prototype system that operates effectively in real-world conditions.
arXiv Detail & Related papers (2024-05-13T10:09:37Z)
Mediapipe and CNNs for Real-Time ASL Gesture Recognition [0.1529342790344802]
This research paper describes a real-time system for identifying American Sign Language (ASL) movements. The suggested method makes use of the Mediapipe library for feature extraction and a Convolutional Neural Network (CNN) for ASL gesture classification.
arXiv Detail & Related papers (2023-05-09T09:35:45Z)
Contextual-Utterance Training for Automatic Speech Recognition [65.4571135368178]
We propose a contextual-utterance training technique which makes use of the previous and future contextual utterances. Also, we propose a dual-mode contextual-utterance training technique for streaming automatic speech recognition (ASR) systems. The proposed technique is able to reduce both the WER and the average last token emission latency by more than 6% and 40ms relative.
arXiv Detail & Related papers (2022-10-27T08:10:44Z)
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning [62.83590925557013]
We learn a set of challenging partially-observed manipulation tasks from visual and audio inputs. Our proposed system learns these tasks by combining offline imitation learning from tele-operated demonstrations and online finetuning. In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by 20%.
arXiv Detail & Related papers (2022-05-30T04:52:58Z)
Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking [2.9308762189250746]
Voice interfaces are increasingly used as input for many applications and smart devices. DNNs are easily disturbed by slight perturbations and then make false recognitions, which is extremely dangerous for voice-controlled intelligent applications.
arXiv Detail & Related papers (2022-04-19T16:26:34Z)
Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems [0.0]
This work presents the process of building a dataset of noisy audio recordings, specifically audio degraded by interference. We also present initial results of a classifier that uses such data for evaluation, indicating the benefits of using this dataset in the recognizer's training process.
arXiv Detail & Related papers (2021-10-04T13:08:53Z)
Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants. This paper proposes a Speech Enhancement model adapted to the task of WUW detection. It aims at increasing the recognition rate and reducing the false alarms in the presence of these types of noises.
arXiv Detail & Related papers (2021-01-29T18:44:05Z)
Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders. Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices [71.68436132514542]
We introduce the concept of attention condensers for building low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge. To illustrate its efficacy, we introduce TinySpeech, low-precision deep neural networks tailored for on-device speech recognition.
arXiv Detail & Related papers (2020-08-10T16:34:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.