On the invertibility of a voice privacy system using embedding
alignement
- URL: http://arxiv.org/abs/2110.05431v1
- Date: Fri, 8 Oct 2021 14:43:47 GMT
- Title: On the invertibility of a voice privacy system using embedding
alignement
- Authors: Pierre Champion (MULTISPEECH, LIUM), Thomas Thebaud (LIUM), Gaël Le
Lan, Anthony Larcher (LIUM), Denis Jouvet (MULTISPEECH)
- Abstract summary: This paper explores various attack scenarios on a voice anonymization system using embeddings alignment techniques.
We compute the optimal rotation and compare the results of this approximation to the official Voice Privacy Challenge results.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores various attack scenarios on a voice anonymization system
using embeddings alignment techniques. We use Wasserstein-Procrustes (an
algorithm initially designed for unsupervised translation) or Procrustes
analysis to match two sets of x-vectors, before and after voice anonymization,
to mimic this transformation as a rotation function. We compute the optimal
rotation and compare the results of this approximation to the official Voice
Privacy Challenge results. We show that a complex system like the baseline of
the Voice Privacy Challenge can be approximated by a rotation, estimated using
a limited set of x-vectors. This paper studies the space of solutions for voice
anonymization within the specific scope of rotations. Rotations being
reversible, the proposed method can recover up to 62% of the speaker identities
from anonymized embeddings.
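The rotation-based attack described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the supervised (paired) Procrustes case, not Wasserstein-Procrustes, and it runs on synthetic random vectors standing in for x-vectors. The function names (`fit_rotation`, `invert`, `identify`), the toy dimensions, and the noise level are all assumptions made for illustration. Note also that the closed-form solution below returns an orthogonal matrix, which may include a reflection rather than a pure rotation.

```python
import numpy as np


def fit_rotation(original, anonymized):
    """Orthogonal Procrustes: find the orthogonal W that minimizes
    ||original @ W - anonymized||_F for row-wise paired embeddings."""
    # The SVD of the cross-covariance matrix gives the closed-form solution.
    u, _, vt = np.linalg.svd(original.T @ anonymized)
    return u @ vt


def invert(anonymized, rotation):
    """Undo the estimated map; the inverse of an orthogonal matrix is its transpose."""
    return anonymized @ rotation.T


def identify(recovered, enrolled):
    """Return, for each recovered vector, the index of the enrolled vector
    with the highest cosine similarity."""
    a = recovered / np.linalg.norm(recovered, axis=1, keepdims=True)
    b = enrolled / np.linalg.norm(enrolled, axis=1, keepdims=True)
    return np.argmax(a @ b.T, axis=1)


# Synthetic stand-ins (assumption): 40 "speakers" with 16-dimensional embeddings.
rng = np.random.default_rng(0)
dim, n_speakers = 16, 40
x = rng.normal(size=(n_speakers, dim))

# Hypothetical "anonymization": a hidden orthogonal map plus small noise.
true_map, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
y = x @ true_map + 0.05 * rng.normal(size=x.shape)

# Estimate the map from a limited paired subset, then attack held-out speakers.
w = fit_rotation(x[:30], y[:30])
recovered = invert(y[30:], w)
predicted = identify(recovered, x)
accuracy = float(np.mean(predicted == np.arange(30, n_speakers)))
print(f"held-out identification accuracy: {accuracy:.2f}")
```

On real data the anonymization system is only approximately a rotation, so identification rates would be expected to be well below what this idealized toy setup yields.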
Related papers
- Vocoder drift compensation by x-vector alignment in speaker
anonymisation [11.480724899031149]
This paper explores the origin of so-called vocoder drift and shows that it is due to the mismatch between the substituted x-vector and the original representations of the linguistic content, intonation and prosody.
Also reported is an original approach to vocoder drift compensation.
arXiv Detail & Related papers (2023-07-17T11:38:35Z)
- Speaker Embedding-aware Neural Diarization: a Novel Framework for
Overlapped Speech Diarization in the Meeting Scenario [51.5031673695118]
We reformulate overlapped speech diarization as a single-label prediction problem.
We propose the speaker embedding-aware neural diarization (SEND) system.
arXiv Detail & Related papers (2022-03-18T06:40:39Z)
- Invertible Voice Conversion [12.095003816544919]
In this paper, we propose an invertible deep learning framework called INVVC for voice conversion.
We develop an invertible framework that makes the source identity traceable.
We apply the proposed framework to one-to-one voice conversion and many-to-one conversion using parallel training data.
arXiv Detail & Related papers (2022-01-26T00:25:27Z)
- Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for
End-to-End Speech Systems [78.5097679815944]
This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems.
First, we represent speech signals with 2D spectrograms using the short-time Fourier transform.
Second, we iteratively find a safe vector using a spectrogram subspace projection operation.
Third, we synthesize a spectrogram with such a safe vector using a novel GAN architecture trained with Sobolev integral probability metric.
arXiv Detail & Related papers (2021-03-15T01:11:13Z)
- Speaker Anonymization with Distribution-Preserving X-Vector Generation
for the VoicePrivacy Challenge 2020 [19.420608243033794]
We present a Distribution-Preserving Voice Anonymization technique, as our submission to the VoicePrivacy Challenge 2020.
We show how this approach generates X-vectors that more closely follow the expected intra-similarity distribution of organic speaker X-vectors.
arXiv Detail & Related papers (2020-10-26T09:53:56Z)
- Design Choices for X-vector Based Speaker Anonymization [48.46018902334472]
We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge.
Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
arXiv Detail & Related papers (2020-05-18T11:32:14Z)
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
- End-to-End Whisper to Natural Speech Conversion using Modified
Transformer Network [0.8399688944263843]
We introduce whisper-to-natural-speech conversion using a sequence-to-sequence approach.
We investigate different features like Mel frequency cepstral coefficients and smoothed spectral features.
The proposed networks are trained end-to-end using a supervised approach for feature-to-feature transformation.
arXiv Detail & Related papers (2020-04-20T14:47:46Z)
- Spatially Adaptive Inference with Stochastic Feature Sampling and
Interpolation [72.40827239394565]
We propose to compute features only at sparsely sampled locations.
We then densely reconstruct the feature map with an efficient procedure.
The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
arXiv Detail & Related papers (2020-03-19T15:36:31Z)
- Continuous speech separation: dataset and analysis [52.10378896407332]
In natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components.
This paper describes a dataset and protocols for evaluating continuous speech separation algorithms.
arXiv Detail & Related papers (2020-01-30T18:01:31Z)