Voice Morphing: Two Identities in One Voice
- URL: http://arxiv.org/abs/2309.02404v1
- Date: Tue, 5 Sep 2023 17:36:34 GMT
- Title: Voice Morphing: Two Identities in One Voice
- Authors: Sushanta K. Pani, Anurag Chowdhury, Morgan Sandler, Arun Ross
- Abstract summary: We introduce Voice Identity Morphing (VIM) - a voice-based morph attack that can synthesize speech samples that impersonate the voice characteristics of a pair of individuals.
VIM has a success rate (MMPMR) of over 80% at a false match rate of 1% on the Librispeech dataset.
- Score: 12.404748962951157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a biometric system, each biometric sample or template is typically
associated with a single identity. However, recent research has demonstrated
the possibility of generating "morph" biometric samples that can successfully
match more than a single identity. Morph attacks are now recognized as a
potential security threat to biometric systems. However, most morph attacks
have been studied on biometric modalities operating in the image domain, such
as face, fingerprint, and iris. In this preliminary work, we introduce Voice
Identity Morphing (VIM) - a voice-based morph attack that can synthesize speech
samples that impersonate the voice characteristics of a pair of individuals.
Our experiments evaluate the vulnerabilities of two popular speaker recognition
systems, ECAPA-TDNN and x-vector, to VIM, with a success rate (MMPMR) of over
80% at a false match rate of 1% on the Librispeech dataset.
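The abstract reports success as MMPMR at a 1% false match rate. As a minimal sketch of how such a figure is commonly computed, the Python below assumes the usual definition of the Mated Morph Presentation Match Rate: a morph counts as successful only if its comparison score exceeds the decision threshold for every contributing identity. The function names, score layout, and threshold-selection rule are illustrative assumptions, not the authors' evaluation code.

```python
def threshold_at_fmr(impostor_scores, fmr):
    """Pick a threshold so that the fraction of impostor scores
    strictly above it is at most `fmr` (hypothetical helper)."""
    if not impostor_scores:
        raise ValueError("need at least one impostor score")
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor comparisons allowed to be falsely accepted.
    allowed = int(fmr * len(ranked))
    # At most `allowed` scores lie strictly above ranked[allowed].
    return ranked[min(allowed, len(ranked) - 1)]


def mmpmr(morph_scores, threshold):
    """Fraction of morphs whose lowest score against any contributing
    identity still exceeds the threshold.

    morph_scores: one tuple per morphed sample, holding its similarity
    score against each contributing speaker (assumed layout).
    """
    if not morph_scores:
        raise ValueError("need at least one morph")
    hits = sum(1 for scores in morph_scores if min(scores) > threshold)
    return hits / len(morph_scores)


# Toy scores, invented purely for illustration.
impostors = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
morphs = [(0.9, 0.7), (0.65, 0.9), (0.9, 0.5), (0.6, 0.8)]

tau = threshold_at_fmr(impostors, fmr=0.25)
rate = mmpmr(morphs, tau)
```

Taking the minimum over the contributing identities is what separates this metric from an ordinary match rate: a morph that matches only one of the two speakers does not count as a success.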
Related papers
- Deep CNN Face Matchers Inherently Support Revocable Biometric Templates [7.448523610526267]
A biometric scheme is revocable if an individual can have their current enrollment in the scheme revoked. We show that modern deep CNN face matchers inherently allow for a robust revocable biometric scheme.
arXiv Detail & Related papers (2025-06-23T15:09:04Z)
- Where are we in audio deepfake detection? A systematic analysis over generative and detection models [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark.
It provides a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content.
It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z)
- Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining [52.191658157204856]
This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample.
The proposed two-stage method uses contrastive learning to pretrain the audio representation model.
Experiments show that our method outperforms the state-of-the-art methods using contrastive learning or self-supervised classification.
arXiv Detail & Related papers (2023-04-07T11:08:31Z)
- Untargeted Near-collision Attacks on Biometrics: Real-world Bounds and Theoretical Limits [0.0]
We focus on untargeted attacks that can be carried out both online and offline, and in both identification and verification modes.
We use the False Match Rate (FMR) and the False Positive Identification Rate (FPIR) to address the security of these systems.
The study of this metric space, and system parameters, gives us the complexity of untargeted attacks and the probability of a near-collision.
arXiv Detail & Related papers (2023-04-04T07:17:31Z)
- OTB-morph: One-Time Biometrics via Morphing [16.23764869038004]
This paper introduces a new idea: exploiting morphing as a transformation function for cancelable biometrics.
An experimental implementation of the proposed scheme is given for face biometrics.
arXiv Detail & Related papers (2023-02-17T18:39:40Z)
- Leveraging Diffusion For Strong and High Quality Face Morphing Attacks [2.0795007613453445]
Face morphing attacks seek to deceive a Face Recognition (FR) system by presenting a morphed image consisting of the biometric qualities from two different identities.
We present a novel morphing attack that uses a Diffusion-based architecture to improve the visual fidelity of the image.
arXiv Detail & Related papers (2023-01-10T21:50:26Z)
- Facial Soft Biometrics for Recognition in the Wild: Recent Works, Annotation, and COTS Evaluation [63.05890836038913]
We study the role of soft biometrics to enhance person recognition systems in unconstrained scenarios.
We consider two assumptions: 1) manual estimation of soft biometrics and 2) automatic estimation from two commercial off-the-shelf systems.
Experiments are carried out fusing soft biometrics with two state-of-the-art face recognition systems based on deep learning.
arXiv Detail & Related papers (2022-10-24T11:29:57Z)
- Are GAN-based Morphs Threatening Face Recognition? [3.0921354926071274]
This paper bridges the gap by providing datasets and the corresponding code for four types of morphing attacks.
We also conduct extensive experiments to assess the vulnerability of four state-of-the-art face recognition systems.
arXiv Detail & Related papers (2022-05-05T08:19:47Z)
- OTB-morph: One-Time Biometrics via Morphing applied to Face Templates [8.623680649444212]
This paper introduces a new scheme for cancelable biometrics aimed at protecting the templates against potential attacks.
An experimental implementation of the proposed scheme is given for face biometrics.
arXiv Detail & Related papers (2021-11-25T18:35:34Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open-source for future work to compare against.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
- Exploring Deep Learning for Joint Audio-Visual Lip Biometrics [54.32039064193566]
Audio-visual (AV) lip biometrics is a promising authentication technique that leverages the benefits of both the audio and visual modalities in speech communication.
The lack of a sizeable AV database hinders the exploration of deep-learning-based audio-visual lip biometrics.
We establish the DeepLip AV lip biometrics system realized with a convolutional neural network (CNN) based video module, a time-delay neural network (TDNN) based audio module, and a multimodal fusion module.
arXiv Detail & Related papers (2021-04-17T10:51:55Z)
- Many-to-Many Voice Transformer Network [55.17770019619078]
This paper proposes a voice conversion (VC) method based on a sequence-to-sequence (S2S) learning framework.
It enables simultaneous conversion of the voice characteristics, pitch contour, and duration of input speech.
arXiv Detail & Related papers (2020-05-18T04:02:08Z)
- Keystroke Biometrics in Response to Fake News Propagation in a Global Pandemic [77.79066811371978]
This work proposes and analyzes the use of keystroke biometrics for content de-anonymization.
Fake news has become a powerful tool to manipulate public opinion, especially during major events.
arXiv Detail & Related papers (2020-05-15T17:56:11Z)