Can DeepFake Speech be Reliably Detected?
- URL: http://arxiv.org/abs/2410.06572v1
- Date: Wed, 9 Oct 2024 06:13:48 GMT
- Title: Can DeepFake Speech be Reliably Detected?
- Authors: Hongbin Liu, Youzheng Chen, Arun Narayanan, Athula Balachandran, Pedro J. Moreno, Lun Wang
- Abstract summary: This work presents the first systematic study of active malicious attacks against state-of-the-art open-source speech detectors.
The results highlight the urgent need for more robust detection methods in the face of evolving adversarial threats.
- Score: 17.10792531439146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in text-to-speech (TTS) systems, particularly those with voice cloning capabilities, have made voice impersonation readily accessible, raising ethical and legal concerns due to potential misuse for malicious activities like misinformation campaigns and fraud. While synthetic speech detectors (SSDs) exist to combat this, they are vulnerable to "test domain shift", exhibiting decreased performance when audio is altered through transcoding, playback, or background noise. This vulnerability is further exacerbated by deliberate manipulation of synthetic speech aimed at deceiving detectors. This work presents the first systematic study of such active malicious attacks against state-of-the-art open-source SSDs. White-box attacks, black-box attacks, and their transferability are studied in terms of both attack effectiveness and stealthiness, using both hardcoded metrics and human ratings. The results highlight the urgent need for more robust detection methods in the face of evolving adversarial threats.
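To make the white-box setting concrete, below is a minimal sketch of a projected-gradient-descent (PGD) style attack on a differentiable synthetic speech detector. The `detector` interface, perturbation budget, and step schedule are illustrative assumptions, not the paper's exact attack recipe.

```python
# Minimal PGD-style white-box attack sketch. Assumes a hypothetical PyTorch
# detector mapping a waveform of shape (1, T) in [-1, 1] to P(synthetic).
import torch

def pgd_attack(detector, waveform, eps=2e-3, alpha=5e-4, steps=40):
    """Perturb synthetic speech so the detector scores it as real,
    keeping the perturbation inside an L-infinity ball of radius eps."""
    adv = waveform.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        # Descend on the detector's synthetic-speech probability.
        loss = detector(adv).mean()
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv - alpha * grad.sign()
            # Project back into the allowed perturbation budget.
            adv = waveform + (adv - waveform).clamp(-eps, eps)
            adv = adv.clamp(-1.0, 1.0)  # keep a valid audio range
    return adv.detach()
```

A black-box attacker without gradient access would instead have to estimate gradients by querying the detector (e.g. via finite differences), which is less query-efficient but matches the transferability threat model the paper studies.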
Related papers
- D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack [1.7811840395202345]
Recent research has proposed a D-CAPTCHA system based on the challenge-response protocol to differentiate fake phone calls from real ones.
In this work, we study the resilience of this system and introduce a more robust version, D-CAPTCHA++, to defend against fake calls.
arXiv Detail & Related papers (2024-09-11T16:25:02Z)
- Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition [4.164975438207411]
In recent years, typical backdoor attacks on speech recognition systems have been studied.
The attacker embeds subtle changes into benign speech spectrograms or alters speech components such as pitch and timbre.
To improve the stealthiness of data poisoning, we propose a non-neural and fast algorithm called Random Spectrogram Rhythm Transformation.
arXiv Detail & Related papers (2024-06-16T13:29:21Z)
- The Art of Deception: Robust Backdoor Attack using Dynamic Stacking of Triggers [0.0]
Recent research has shown that audio backdoors can use certain signal modifications as their trigger.
DynamicTrigger is introduced as a methodology for carrying out dynamic backdoor attacks.
By utilizing fluctuating signal sampling rates and masking speaker identities through dynamic sound triggers, it is possible to deceive speech recognition systems.
arXiv Detail & Related papers (2024-01-03T04:31:59Z)
- Defense Against Adversarial Attacks on Audio DeepFake Detection [0.4511923587827302]
Audio DeepFakes (DF) are artificially generated utterances created using deep learning.
Multiple neural-network-based methods for detecting generated speech have been proposed to counter these threats.
arXiv Detail & Related papers (2022-12-30T08:41:06Z)
- Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment (a minimal sketch of this embedding-comparison idea appears after this list).
arXiv Detail & Related papers (2022-09-28T13:46:29Z)
- Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z)
- Partially Fake Audio Detection by Self-attention-based Fake Span Discovery [89.21979663248007]
We propose a novel framework that introduces a question-answering (fake span discovery) strategy with a self-attention mechanism to detect partially fake audio.
Our submission ranked second in the partially fake audio detection track of ADD 2022.
arXiv Detail & Related papers (2022-02-14T13:20:55Z)
- Practical Attacks on Voice Spoofing Countermeasures [3.388509725285237]
We show how a malicious actor may efficiently craft audio samples to bypass voice authentication in its strictest form.
Our results call into question the security of modern voice authentication systems in light of the real threat of attackers bypassing these measures.
arXiv Detail & Related papers (2021-07-30T14:07:49Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples (a sketch of this score-difference check appears after this list).
Our code will be made open source so that future work can make comparisons.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
- Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields ASV systems by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z)
- Silent Speech Interfaces for Speech Restoration: A Review [59.68902463890532]
Silent speech interface (SSI) research aims to provide alternative and augmentative communication methods for persons with severe speech disorders.
SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication.
Most present-day SSIs have only been validated in laboratory settings for healthy users.
arXiv Detail & Related papers (2020-09-04T11:05:50Z)
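As referenced in the speaker-verification entry above, here is a minimal sketch of the embedding-comparison idea. It assumes some off-the-shelf speaker encoder (e.g. an x-vector or ECAPA-style model) that returns a fixed-size embedding; the threshold is a placeholder to be tuned on held-out data.

```python
# Deepfake check via speaker verification: compare the utterance's speaker
# embedding with enrolled references for the claimed speaker (sketch only;
# the encoder producing the embeddings is assumed, not specified here).
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def looks_fake(utterance_emb: np.ndarray,
               reference_embs: list[np.ndarray],
               threshold: float = 0.5) -> bool:
    """Flag the utterance if it matches none of the speaker's references."""
    best = max(cosine(utterance_emb, ref) for ref in reference_embs)
    return best < threshold  # low similarity => likely not the real speaker
```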
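Likewise, the vocoder-based check from the "Spotting adversarial samples" entry can be sketched as follows. Here `asv_score` and `resynthesize` are hypothetical stand-ins for an ASV scorer and a neural-vocoder analysis-synthesis pipeline, and the threshold would again be tuned on held-out data.

```python
# Re-synthesize the input with a neural vocoder and compare ASV scores.
# Genuine audio tends to keep a similar score after re-synthesis, while
# adversarial perturbations do not survive it, so the gap is the indicator.
def is_adversarial(waveform, enrolled_speaker, asv_score, resynthesize,
                   threshold=0.1):
    original = asv_score(waveform, enrolled_speaker)
    resynth = asv_score(resynthesize(waveform), enrolled_speaker)
    return abs(original - resynth) > threshold
```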