Design Choices for X-vector Based Speaker Anonymization
- URL: http://arxiv.org/abs/2005.08601v1
- Date: Mon, 18 May 2020 11:32:14 GMT
- Title: Design Choices for X-vector Based Speaker Anonymization
- Authors: Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel
Vincent, Junichi Yamagishi, Mohamed Maouche, Aur\'elien Bellet, Marc Tommasi
- Abstract summary: We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge.
Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
- Score: 48.46018902334472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recently proposed x-vector based anonymization scheme converts any input
voice into that of a random pseudo-speaker. In this paper, we present a
flexible pseudo-speaker selection technique as a baseline for the first
VoicePrivacy Challenge. We explore several design choices for the distance
metric between speakers, the region of x-vector space where the pseudo-speaker
is picked, and gender selection. To assess the strength of anonymization
achieved, we consider attackers using an x-vector based speaker verification
system who may use original or anonymized speech for enrollment, depending on
their knowledge of the anonymization scheme. The Equal Error Rate (EER)
achieved by the attackers and the decoding Word Error Rate (WER) over
anonymized data are reported as the measures of privacy and utility.
Experiments are performed using datasets derived from LibriSpeech to find the
optimal combination of design choices in terms of privacy and utility.
Related papers
- Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models [83.7506131809624]
We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives.
We present a novel, large-scale dataset derived from the MediaSum corpus, encompassing transcripts from a wide range of media sources.
We propose novel transformer-based models tailored for SpeakerID, leveraging contextual cues within dialogues to accurately attribute speaker names.
arXiv Detail & Related papers (2024-07-16T18:03:58Z) - A Benchmark for Multi-speaker Anonymization [9.990701310620368]
We present an attempt to provide a multi-speaker anonymization benchmark for real-world applications.
A cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker privacy and preserve speech content.
Experiments conducted on both non-overlap simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system.
arXiv Detail & Related papers (2024-07-08T04:48:43Z) - Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z) - Anonymizing Speech: Evaluating and Designing Speaker Anonymization
Techniques [1.2691047660244337]
The growing use of voice user interfaces has led to a surge in the collection and storage of speech data.
This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
arXiv Detail & Related papers (2023-08-05T16:14:17Z) - Vocoder drift compensation by x-vector alignment in speaker
anonymisation [11.480724899031149]
This paper explores the origin of so-called vocoder drift and shows that it is due to the mismatch between the substituted x-vector and the original representations of the linguistic content, intonation and prosody.
Also reported is an original approach to vocoder drift compensation.
arXiv Detail & Related papers (2023-07-17T11:38:35Z) - Speaker Embedding-aware Neural Diarization: a Novel Framework for
Overlapped Speech Diarization in the Meeting Scenario [51.5031673695118]
We reformulate overlapped speech diarization as a single-label prediction problem.
We propose the speaker embedding-aware neural diarization (SEND) system.
arXiv Detail & Related papers (2022-03-18T06:40:39Z) - Privacy-Preserving Speech Representation Learning using Vector
Quantization [0.0]
Speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns.
This paper aims to produce an anonymous representation while preserving speech recognition performance.
arXiv Detail & Related papers (2022-03-15T14:01:11Z) - End-to-End Diarization for Variable Number of Speakers with Local-Global
Networks and Discriminative Speaker Embeddings [66.50782702086575]
We present an end-to-end deep network model that performs meeting diarization from single-channel audio recordings.
The proposed system is designed to handle meetings with unknown numbers of speakers, using variable-number permutation-invariant cross-entropy based loss functions.
arXiv Detail & Related papers (2021-05-05T14:55:29Z) - Speaker De-identification System using Autoencoders and Adversarial
Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increase the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z) - Speaker Anonymization with Distribution-Preserving X-Vector Generation
for the VoicePrivacy Challenge 2020 [19.420608243033794]
We present a Distribution-Preserving Voice Anonymization technique, as our submission to the VoicePrivacy Challenge 2020.
We show how this approach generates X-vectors that more closely follow the expected intra-similarity distribution of organic speaker X-vectors.
arXiv Detail & Related papers (2020-10-26T09:53:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.