Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege
- URL: http://arxiv.org/abs/2401.15704v1
- Date: Sun, 28 Jan 2024 16:56:56 GMT
- Title: Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege
- Authors: Peng Huang, Yao Wei, Peng Cheng, Zhongjie Ba, Li Lu, Feng Lin, Yang Wang, Kui Ren
- Abstract summary: We propose a novel phoneme-based noise with the idea of informational masking, which can distract both machines and humans.
Our system can reduce the recognition accuracy of recordings to below 50% under all tested speech recognition systems.
- Score: 26.3587130339825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread use of smart devices raises concerns about eavesdropping. To enhance voice privacy, recent studies exploit the nonlinearity in microphones to jam audio recorders with inaudible ultrasound. However, existing solutions rely solely on energetic masking. Their simple-form noise leads to several problems, such as high energy requirements and being easily removed by speech enhancement techniques. Besides, most of these solutions do not support authorized recording, which restricts their usage scenarios. In this paper, we design an efficient yet robust system that can jam microphones while preserving authorized recording. Specifically, we propose a novel phoneme-based noise with the idea of informational masking, which can distract both machines and humans and is resistant to denoising techniques. Besides, we optimize the noise transmission strategy for broader coverage and implement a hardware prototype of our system. Experimental results show that our system can reduce the recognition accuracy of recordings to below 50% under all tested speech recognition systems, which is much better than existing solutions.
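The abstract refers to jamming recorders by exploiting microphone nonlinearity with inaudible ultrasound. The following is a minimal sketch of that general effect, not the authors' system: it amplitude-modulates an audible tone onto an ultrasonic carrier and passes it through a toy microphone model with a small quadratic term. The sample rate, carrier frequency, modulation depth, and the coefficients `a1` and `a2` are all illustrative assumptions, and the single tone stands in for whatever noise a real jammer would emit.

```python
import cmath
import math

FS = 192_000        # sample rate in Hz (assumed)
F_CARRIER = 40_000  # ultrasonic carrier in Hz (assumed)
F_TONE = 1_000      # audible tone standing in for the jamming noise (assumed)
N = 1_920           # 10 ms, so every frequency here spans whole cycles


def am_ultrasound(depth=0.8):
    """Amplitude-modulate the audible tone onto the ultrasonic carrier."""
    out = []
    for n in range(N):
        t = n / FS
        envelope = 1.0 + depth * math.cos(2 * math.pi * F_TONE * t)
        out.append(envelope * math.cos(2 * math.pi * F_CARRIER * t))
    return out


def microphone(signal, a1=1.0, a2=0.1):
    """Toy microphone model: linear gain plus a small quadratic term."""
    return [a1 * s + a2 * s * s for s in signal]


def tone_amplitude(signal, freq):
    """Amplitude of one frequency component via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)


emitted = am_ultrasound()
recorded = microphone(emitted)
print("1 kHz in emitted signal :", round(tone_amplitude(emitted, F_TONE), 4))
print("1 kHz in recorded signal:", round(tone_amplitude(recorded, F_TONE), 4))
```

In this toy model the emitted signal contains energy only around the carrier, while squaring the modulated signal produces a baseband component at the tone frequency with amplitude `a2 * depth`. That is why a recorder can capture audible noise that a listener in the room does not hear.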
Related papers
- Safeguarding Voice Privacy: Harnessing Near-Ultrasonic Interference To Protect Against Unauthorized Audio Recording [0.0]
This paper investigates the susceptibility of automatic speech recognition (ASR) algorithms to interference from near-ultrasonic noise.
We expose a critical vulnerability in the most common microphones used in modern voice-activated devices, which inadvertently demodulate near-ultrasonic frequencies into the audible spectrum.
Our findings highlight the need to develop robust countermeasures to protect voice-activated systems from malicious exploitation of this vulnerability.
arXiv Detail & Related papers (2024-04-07T00:49:19Z)
- Proactive Detection of Voice Cloning with Localized Watermarking [50.13539630769929]
We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech.
AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level.
AudioSeal achieves state-of-the-art performance in terms of robustness to real life audio manipulations and imperceptibility based on automatic and human evaluation metrics.
arXiv Detail & Related papers (2024-01-30T18:56:22Z)
- In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms [8.946335367620698]
This paper presents the design and implementation of a custom research platform for low-power wireless earbuds based on novel, commercial, MEMS bone-conduction microphones.
Such microphones can record the wearer's speech with much greater isolation, enabling personalized voice activity detection and further audio enhancement applications.
arXiv Detail & Related papers (2023-09-05T17:04:09Z)
- SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks [18.968402215723]
A system to detect a user's unvoiced utterance is proposed.
Our proposed system recognizes the content of an utterance without the user voicing it.
We also observed that a user can adjust their oral movement to learn and improve the accuracy of their voice recognition.
arXiv Detail & Related papers (2023-03-03T07:46:35Z)
- Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment.
arXiv Detail & Related papers (2022-09-28T13:46:29Z)
- SuperVoice: Text-Independent Speaker Verification Using Ultrasound Energy in Human Speech [10.354590276508283]
Voice-activated systems are integrated into a variety of desktop, mobile, and Internet-of-Things (IoT) devices.
Existing speaker verification techniques distinguish individual speakers via the spectrographic features extracted from an audible frequency range of voice commands.
We propose a speaker verification system, SUPERVOICE, that uses a two-stream architecture with a feature fusion mechanism to generate distinctive speaker models.
arXiv Detail & Related papers (2022-05-28T18:00:50Z)
- Disappeared Command: Spoofing Attack On Automatic Speech Recognition Systems with Sound Masking [2.9308762189250746]
Voice interfaces are becoming more and more widely used as input for many applications and smart devices.
DNNs are easily thrown off by slight perturbations and then misrecognize their input, which is extremely dangerous for voice-controlled intelligent applications.
arXiv Detail & Related papers (2022-04-19T16:26:34Z)
- Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve comparable performance to the best supervised approach reported with only 16% of labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z)
- Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims at increasing the recognition rate and reducing the false alarms in the presence of these types of noises.
arXiv Detail & Related papers (2021-01-29T18:44:05Z)
- VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition [60.462770498366524]
We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the device to preserve only the speech signals from a target user.
We show that such a model can be quantized as an 8-bit integer model and run in real time.
arXiv Detail & Related papers (2020-09-09T14:26:56Z)
- TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices [71.68436132514542]
We introduce the concept of attention condensers for building low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge.
To illustrate its efficacy, we introduce TinySpeech, low-precision deep neural networks tailored for on-device speech recognition.
arXiv Detail & Related papers (2020-08-10T16:34:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.