Design Choices for X-vector Based Speaker Anonymization
 - URL: http://arxiv.org/abs/2005.08601v1
 - Date: Mon, 18 May 2020 11:32:14 GMT
 - Title: Design Choices for X-vector Based Speaker Anonymization
 - Authors: Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi
 - Abstract summary: We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge.
Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
 - Score: 48.46018902334472
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   The recently proposed x-vector based anonymization scheme converts any input
voice into that of a random pseudo-speaker. In this paper, we present a
flexible pseudo-speaker selection technique as a baseline for the first
VoicePrivacy Challenge. We explore several design choices for the distance
metric between speakers, the region of x-vector space where the pseudo-speaker
is picked, and gender selection. To assess the strength of anonymization
achieved, we consider attackers using an x-vector based speaker verification
system who may use original or anonymized speech for enrollment, depending on
their knowledge of the anonymization scheme. The Equal Error Rate (EER)
achieved by the attackers and the decoding Word Error Rate (WER) over
anonymized data are reported as the measures of privacy and utility.
Experiments are performed using datasets derived from LibriSpeech to find the
optimal combination of design choices in terms of privacy and utility.
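
The abstract uses the attacker's Equal Error Rate (EER) as the privacy measure: a higher EER means the speaker verification system has a harder time linking anonymized speech to its speaker. As an illustration only, the sketch below computes an EER from two lists of verification scores. The function name, the threshold sweep, and the toy scores are assumptions made for this example; they are not the evaluation scripts used in the paper or the challenge.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    target_scores: scores of same-speaker trials (hypothetical input).
    nontarget_scores: scores of different-speaker trials (hypothetical input).
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: different-speaker trials that are accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: same-speaker trials that are rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores: a verifier that separates speakers perfectly (no privacy)
# versus one whose scores carry no speaker information.
separable = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
uninformative = equal_error_rate([1, 2, 3, 4], [1, 2, 3, 4])
print(separable, uninformative)
```

In this toy setting the first case corresponds to an attacker who always succeeds, and the second to an attacker who does no better than chance.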
 
       
      
        Related papers
        - Speaker Embeddings to Improve Tracking of Intermittent and Moving   Speakers [53.12031345322412]
We propose to perform identity reassignment post-tracking, using speaker embeddings. Beamforming is used to enhance the signal towards the speakers' positions in order to compute speaker embeddings. We evaluate the performance of the proposed speaker embedding-based identity reassignment method on a dataset where speakers change position during inactivity periods.
arXiv Detail & Related papers (2025-06-23T13:02:20Z)
        - Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models [83.7506131809624]
We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives.
We present a novel, large-scale dataset derived from the MediaSum corpus, encompassing transcripts from a wide range of media sources.
We propose novel transformer-based models tailored for SpeakerID, leveraging contextual cues within dialogues to accurately attribute speaker names.
arXiv Detail & Related papers (2024-07-16T18:03:58Z)
        - A Benchmark for Multi-speaker Anonymization [9.990701310620368]
We present an attempt to provide a multi-speaker anonymization benchmark for real-world applications.
A cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker privacy and preserve speech content.
Experiments conducted on both non-overlap simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system.
arXiv Detail & Related papers (2024-07-08T04:48:43Z)
        - Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
        - Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques [1.2691047660244337]
The growing use of voice user interfaces has led to a surge in the collection and storage of speech data.
This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
arXiv Detail & Related papers (2023-08-05T16:14:17Z)
        - Vocoder drift compensation by x-vector alignment in speaker anonymisation [11.480724899031149]
This paper explores the origin of so-called vocoder drift and shows that it is due to the mismatch between the substituted x-vector and the original representations of the linguistic content, intonation and prosody.
Also reported is an original approach to vocoder drift compensation.
arXiv Detail & Related papers (2023-07-17T11:38:35Z)
        - Speaker Embedding-aware Neural Diarization: a Novel Framework for Overlapped Speech Diarization in the Meeting Scenario [51.5031673695118]
We reformulate overlapped speech diarization as a single-label prediction problem.
We propose the speaker embedding-aware neural diarization (SEND) system.
arXiv Detail & Related papers (2022-03-18T06:40:39Z)
        - Privacy-Preserving Speech Representation Learning using Vector Quantization [0.0]
Speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns.
This paper aims to produce an anonymous representation while preserving speech recognition performance.
arXiv Detail & Related papers (2022-03-15T14:01:11Z)
        - End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings [66.50782702086575]
We present an end-to-end deep network model that performs meeting diarization from single-channel audio recordings.
The proposed system is designed to handle meetings with unknown numbers of speakers, using variable-number permutation-invariant cross-entropy based loss functions.
arXiv Detail & Related papers (2021-05-05T14:55:29Z)
        - Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
        - Speaker Anonymization with Distribution-Preserving X-Vector Generation for the VoicePrivacy Challenge 2020 [19.420608243033794]
We present a Distribution-Preserving Voice Anonymization technique, as our submission to the VoicePrivacy Challenge 2020.
We show how this approach generates X-vectors that more closely follow the expected intra-similarity distribution of organic speaker X-vectors.
arXiv Detail & Related papers (2020-10-26T09:53:56Z)
        This list is automatically generated from the titles and abstracts of the papers on this site.
       
     