Related papers: Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

URL: http://arxiv.org/abs/2210.07002v2
Date: Fri, 14 Oct 2022 13:28:52 GMT
Title: Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Authors: Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu
Abstract summary: Speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. We propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as the cost function.
Score: 22.84840887071428
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In order to protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. One of the challenges in this context is to create non-existent voices that sound as natural as possible. In this work, we propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as the cost function. By incorporating these artificial embeddings into a speech-to-text-to-speech pipeline, we outperform previous approaches in terms of privacy and utility. According to standard objective metrics and human evaluation, our approach generates intelligible and content-preserving yet privacy-protecting versions of the original recordings.

Related papers

On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection [45.49915832081347]
Recent developments in voice-privacy protection have shown positive use cases of the same technique to conceal a speaker's voice attributes. This paper examines the reversibility property where an entity generating adversarial perturbations is authorized to remove them and restore original speech. A similar technique could also be used by an investigator to deanonymize voice-protected speech to restore criminals' identities in security and forensic analysis.
arXiv Detail & Related papers (2024-12-12T11:46:07Z)
Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence. We present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context. We also propose MaskDP to protect non-anonymized but privacy-sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
A Benchmark for Multi-speaker Anonymization [9.990701310620368]
We present an attempt to provide a multi-speaker anonymization benchmark for real-world applications. A cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker privacy and preserve speech content. Experiments conducted on both non-overlap simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system.
arXiv Detail & Related papers (2024-07-08T04:48:43Z)
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding [46.25816642820348]
We focus on altering the voice attributes against machine recognition while retaining human perception. A speech generation framework incorporating a speaker disentanglement mechanism is employed to generate the anonymized speech. Experiments conducted on the LibriSpeech dataset showed that the speaker attributes were obscured with their human perception preserved for 60.71% of the processed utterances.
arXiv Detail & Related papers (2024-06-12T13:33:24Z)
Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques [1.2691047660244337]
The growing use of voice user interfaces has led to a surge in the collection and storage of speech data. This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
arXiv Detail & Related papers (2023-08-05T16:14:17Z)
ChatGPT for Us: Preserving Data Privacy in ChatGPT via Dialogue Text Ambiguation to Expand Mental Health Care Delivery [52.73936514734762]
ChatGPT has gained popularity for its ability to generate human-like dialogue. Data-sensitive domains face challenges in using ChatGPT due to privacy and data-ownership concerns. We propose a text ambiguation framework that preserves user privacy.
arXiv Detail & Related papers (2023-05-19T02:09:52Z)
Evaluation of Speaker Anonymization on Emotional Speech [9.223908421919733]
Speech data carries a range of personal information, such as the speaker's identity and emotional state. Current studies have addressed the topic of preserving speech privacy. The VoicePrivacy 2020 Challenge (VPC) is about speaker anonymization.
arXiv Detail & Related papers (2023-04-15T20:50:29Z)
Generating gender-ambiguous voices for privacy-preserving speech recognition [38.733077459065704]
We present a generative adversarial network, GenGAN, that synthesises voices that conceal the gender or identity of a speaker. We condition the generator only on gender information and use an adversarial loss between signal distortion and privacy preservation.
arXiv Detail & Related papers (2022-07-03T14:23:02Z)
Differentially Private Speaker Anonymization [44.90119821614047]
Sharing real-world speech utterances is key to the training and deployment of voice-based services. Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact. We show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information.
arXiv Detail & Related papers (2022-02-23T23:20:30Z)
Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders. Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
Design Choices for X-vector Based Speaker Anonymization [48.46018902334472]
We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge. Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
arXiv Detail & Related papers (2020-05-18T11:32:14Z)
Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam [100.95498268200777]
SpeakerBeam exploits an adaptation utterance of the target speaker to extract his/her voice characteristics. SpeakerBeam sometimes fails when speakers have similar voice characteristics, such as in same-gender mixtures. We show experimentally that these strategies greatly improve speech extraction performance, especially for same-gender mixtures.
arXiv Detail & Related papers (2020-01-23T05:36:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.