Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking
- URL: http://arxiv.org/abs/2204.08977v1
- Date: Tue, 19 Apr 2022 16:26:34 GMT
- Title: Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking
- Authors: Jinghui Xu, Jiangshan Zhang, Jifeng Zhu and Yong Yang
- Abstract summary: Voice interfaces are increasingly used as input for many applications and smart devices.
DNNs are easily disturbed by slight perturbations and produce false recognitions, which is extremely dangerous for voice-controlled intelligent applications.
- Score: 2.9308762189250746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The development of deep learning has greatly improved the performance of
automatic speech recognition (ASR), which has demonstrated ability comparable to
human hearing in many tasks. Voice interfaces are increasingly used as input for
many applications and smart devices. However, existing research has shown that
DNNs are easily disturbed by slight perturbations and produce false recognitions,
which is extremely dangerous for voice-controlled intelligent applications.
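The fragility described above can be illustrated with a minimal FGSM-style sign-step perturbation on a toy linear classifier standing in for an ASR acoustic model. The model, classes, and numbers below are illustrative assumptions, not the paper's sound-masking method:

```python
import numpy as np

# Toy stand-in for an ASR acoustic model: a linear classifier over a
# 4-sample "audio frame". Weights and inputs are illustrative only.
W = np.array([[ 1.0,  0.5, -0.2,  0.1],   # score row for class 0 ("yes")
              [ 0.8,  0.6,  0.1, -0.1]])  # score row for class 1 ("no")

def predict(x):
    return int(np.argmax(W @ x))

x = np.array([0.9, 0.2, 0.1, 0.4])        # clean frame, recognized as class 0
clean = predict(x)

# FGSM-style step: for a linear model, the gradient of the score margin
# (target class minus current class) is just the row difference, so a
# small step along its sign pushes the input toward the target class.
grad = W[1] - W[0]                        # margin gradient w.r.t. the input
eps = 0.3                                 # "slight" per-sample perturbation budget
x_adv = x + eps * np.sign(grad)           # each sample moves by at most 0.3

print(predict(x), predict(x_adv))         # prints "0 1": the label flips
```

Even though each sample changes by no more than `eps`, the prediction flips, which is the failure mode the abstract warns about for voice-controlled applications.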
Related papers
- Towards Unsupervised Speech Recognition Without Pronunciation Models [57.222729245842054]
Most languages lack sufficient paired speech and text data to effectively train automatic speech recognition systems.
We propose the removal of reliance on a phoneme lexicon to develop unsupervised ASR systems.
We experimentally demonstrate that an unsupervised speech recognizer can emerge from joint speech-to-speech and text-to-text masked token-infilling.
arXiv Detail & Related papers (2024-06-12T16:30:58Z)
- Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege [26.3587130339825]
We propose a novel phoneme-based noise with the idea of informational masking, which can distract both machines and humans.
Our system can reduce the recognition accuracy of recordings to below 50% under all tested speech recognition systems.
arXiv Detail & Related papers (2024-01-28T16:56:56Z)
- Efficient Multimodal Neural Networks for Trigger-less Voice Assistants [0.8209843760716959]
We propose a neural network based audio-gesture multimodal fusion system for smartwatches.
The system better understands temporal correlation between audio and gesture data, leading to precise invocations.
It is lightweight and deployable on low-power devices, such as smartwatches, with quick launch times.
arXiv Detail & Related papers (2023-05-20T02:52:02Z)
- SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks [18.968402215723]
A system that detects a user's unvoiced utterances is proposed.
It recognizes the utterance content without the user producing any audible voice.
We also observed that users can adjust their oral movements to learn and improve recognition accuracy.
arXiv Detail & Related papers (2023-03-03T07:46:35Z)
- Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment.
arXiv Detail & Related papers (2022-09-28T13:46:29Z)
- Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey [8.86498196260453]
Adversarial Artificial Intelligence (AI) is a growing threat in the AI and machine learning research community.
In this paper, we first review existing speech recognition techniques; we then investigate the effectiveness of adversarial attacks and defenses against these systems.
This paper is intended as a reference to help researchers and practitioners understand the challenges, position their work, and ultimately improve speech recognition models for mission-critical applications.
arXiv Detail & Related papers (2022-02-22T00:29:40Z)
- Recent Progress in the CUHK Dysarthric Speech Recognition System [66.69024814159447]
Disordered speech presents a wide spectrum of challenges to current data intensive deep neural networks (DNNs) based automatic speech recognition technologies.
This paper presents recent research efforts at the Chinese University of Hong Kong to improve the performance of disordered speech recognition systems.
arXiv Detail & Related papers (2022-01-15T13:02:40Z)
- Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims to increase the recognition rate and reduce false alarms in noisy conditions.
arXiv Detail & Related papers (2021-01-29T18:44:05Z)
- Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
- TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices [71.68436132514542]
We introduce the concept of attention condensers for building low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge.
To illustrate its efficacy, we introduce TinySpeech, low-precision deep neural networks tailored for on-device speech recognition.
arXiv Detail & Related papers (2020-08-10T16:34:52Z)
- A Deep Learning based Wearable Healthcare IoT Device for AI-enabled Hearing Assistance Automation [6.283190933140046]
This research presents a novel AI-enabled Internet of Things (IoT) device that helps people who are deaf or hard of hearing communicate with others in conversations.
A server application leverages Google's online speech recognition service to convert received conversations into text, which is then shown on a micro-display attached to glasses so the wearer can read the conversation contents.
arXiv Detail & Related papers (2020-05-16T19:42:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.