Related papers: Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques

URL: http://arxiv.org/abs/2308.04455v4
Date: Fri, 1 Mar 2024 16:52:19 GMT
Title: Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques
Authors: Pierre Champion
Abstract summary: The growing use of voice user interfaces has led to a surge in the collection and storage of speech data. This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
Score: 1.2691047660244337
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The growing use of voice user interfaces has led to a surge in the collection and storage of speech data. While data collection allows for the development of efficient tools powering most speech services, it also poses serious privacy issues for users, as centralized storage makes private personal speech data vulnerable to cyber threats. With the increasing use of voice-based digital assistants like Amazon's Alexa, Google's Home, and Apple's Siri, and with the increasing ease with which personal speech data can be collected, the risk of malicious use of voice-cloning and speaker/gender/pathological/etc. recognition has increased. This thesis proposes solutions for anonymizing speech and evaluating the degree of anonymization. In this work, anonymization refers to making personal speech data unlinkable to an identity while maintaining the usefulness (utility) of the speech signal (e.g., access to linguistic content). We start by identifying several challenges that evaluation protocols need to consider in order to evaluate the degree of privacy protection properly. We clarify how anonymization systems must be configured for evaluation purposes and highlight that many practical deployment configurations do not permit privacy evaluation. Furthermore, we examine the most common voice conversion-based anonymization system and identify its weak points before suggesting new methods to overcome some limitations. We isolate all components of the anonymization system to evaluate the degree of speaker PPI associated with each of them. Then, we propose several transformation methods for each component to reduce speaker PPI as much as possible while maintaining utility. We promote anonymization algorithms based on quantization-based transformation as an alternative to the most-used and well-known noise-based approach. Finally, we explore a new attack method to invert anonymization.

Related papers

On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection [45.49915832081347]
Recent developments in voice-privacy protection have shown positive use cases of the same technique to conceal a speaker's voice attributes. This paper examines the reversibility property, where an entity generating adversarial perturbations is authorized to remove them and restore the original speech. A similar technique could also be used by an investigator to deanonymize voice-protected speech and restore criminals' identities in security and forensic analysis.
arXiv Detail & Related papers (2024-12-12T11:46:07Z)
A Benchmark for Multi-speaker Anonymization [9.990701310620368]
We present an attempt to provide a multi-speaker anonymization benchmark. We also discuss the privacy leakage of overlapping conversations. Experiments conducted on both non-overlap simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system.
arXiv Detail & Related papers (2024-07-08T04:48:43Z)
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding [46.25816642820348]
We focus on altering the voice attributes against machine recognition while retaining human perception. A speech generation framework incorporating a speaker disentanglement mechanism is employed to generate the anonymized speech. Experiments conducted on the LibriSpeech dataset showed that the speaker attributes were obscured with their human perception preserved for 60.71% of the processed utterances.
arXiv Detail & Related papers (2024-06-12T13:33:24Z)
Evaluation of Speaker Anonymization on Emotional Speech [9.223908421919733]
Speech data carries a range of personal information, such as the speaker's identity and emotional state. Current studies have addressed the topic of preserving speech privacy. The VoicePrivacy 2020 Challenge (VPC) is about speaker anonymization.
arXiv Detail & Related papers (2023-04-15T20:50:29Z)
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy [22.84840887071428]
Speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. We propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as cost function.
arXiv Detail & Related papers (2022-10-13T13:12:42Z)
Differentially Private Speaker Anonymization [44.90119821614047]
Sharing real-world speech utterances is key to the training and deployment of voice-based services. Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact. We show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information.
arXiv Detail & Related papers (2022-02-23T23:20:30Z)
Protecting gender and identity with disentangled speech representations [49.00162808063399]
We show that protecting gender information in speech is more effective than modelling speaker-identity information. We present a novel way to encode gender information and disentangle two sensitive biometric identifiers.
arXiv Detail & Related papers (2021-04-22T13:31:41Z)
High Fidelity Speech Regeneration with Application to Speech Enhancement [96.34618212590301]
We propose a wav-to-wav generative model for speech that can generate 24 kHz speech in real time. Inspired by voice conversion methods, we train to augment the speech characteristics while preserving the identity of the source.
arXiv Detail & Related papers (2021-01-31T10:54:27Z)
Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders. Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
Speaker anonymisation using the McAdams coefficient [19.168733328810962]
This paper reports an approach to anonymisation that, unlike other current approaches, requires no training data. The proposed solution uses the McAdams coefficient to transform the spectral envelope of speech signals. Results show that random, optimised transformations can outperform competing solutions in terms of anonymisation.
arXiv Detail & Related papers (2020-11-02T17:07:17Z)
Design Choices for X-vector Based Speaker Anonymization [48.46018902334472]
We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge. Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
arXiv Detail & Related papers (2020-05-18T11:32:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
