Content Anonymization for Privacy in Long-form Audio
- URL: http://arxiv.org/abs/2510.12780v1
- Date: Tue, 14 Oct 2025 17:52:50 GMT
- Title: Content Anonymization for Privacy in Long-form Audio
- Authors: Cristina Aggazzotti, Ashi Garg, Zexin Cai, Nicholas Andrews,
- Abstract summary: Long-form audio is commonplace in domains such as interviews, phone calls, and meetings.<n> given multiple utterances from the same speaker, an attacker could exploit an individual's vocabulary, syntax, and turns of phrase.<n>We propose new content anonymization approaches to address this risk.
- Score: 9.679458545535388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Voice anonymization techniques have been found to successfully obscure a speaker's acoustic identity in short, isolated utterances in benchmarks such as the VoicePrivacy Challenge. In practice, however, utterances seldom occur in isolation: long-form audio is commonplace in domains such as interviews, phone calls, and meetings. In these cases, many utterances from the same speaker are available, which pose a significantly greater privacy risk: given multiple utterances from the same speaker, an attacker could exploit an individual's vocabulary, syntax, and turns of phrase to re-identify them, even when their voice is completely disguised. To address this risk, we propose new content anonymization approaches. Our approach performs a contextual rewriting of the transcripts in an ASR-TTS pipeline to eliminate speaker-specific style while preserving meaning. We present results in a long-form telephone conversation setting demonstrating the effectiveness of a content-based attack on voice-anonymized speech. Then we show how the proposed content-based anonymization methods can mitigate this risk while preserving speech utility. Overall, we find that paraphrasing is an effective defense against content-based attacks and recommend that stakeholders adopt this step to ensure anonymity in long-form audio.
Related papers
- VoxGuard: Evaluating User and Attribute Privacy in Speech via Membership Inference Attacks [51.68795949691009]
We introduce VoxGuard, a framework grounded in differential privacy and membership inference.<n>For attributes, we show that simple transparent attacks recover gender and accent with near-perfect accuracy even after anonymization.<n>Our results demonstrate that EER substantially underestimates leakage, highlighting the need for low-FPR evaluation.
arXiv Detail & Related papers (2025-09-22T20:57:48Z) - SegReConcat: A Data Augmentation Method for Voice Anonymization Attack [20.139879210234533]
Anonymization of voice seeks to conceal the identity of the speaker while maintaining the utility of speech data.<n>We propose SegReConcat, a data augmentation method for attacker-side enhancement of automatic speaker verification systems.
arXiv Detail & Related papers (2025-08-26T10:26:36Z) - On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection [45.49915832081347]
Recent development in voice-privacy protection has shown the positive use cases of the same technique to conceal speaker's voice attribute.<n>This paper examines the reversibility property where an entity generating adversarial perturbations is authorized to remove them and restore original speech.<n>A similar technique could also be used by an investigator to deanonymize a voice-protected speech to restore criminals' identities in security and forensic analysis.
arXiv Detail & Related papers (2024-12-12T11:46:07Z) - Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization [31.94758615908198]
differential privacy field has explored ways to anonymize speech while preserving its utility.
We develop various speaker anonymization pipelines and find that approaches either excel at anonymization or preserving emotion state, but not both simultaneously.
arXiv Detail & Related papers (2024-09-05T16:10:31Z) - Anonymizing Speech: Evaluating and Designing Speaker Anonymization
Techniques [1.2691047660244337]
The growing use of voice user interfaces has led to a surge in the collection and storage of speech data.
This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
arXiv Detail & Related papers (2023-08-05T16:14:17Z) - Anonymizing Speech with Generative Adversarial Networks to Preserve
Speaker Privacy [22.84840887071428]
Speaker anonymization aims for hiding the identity of a speaker by changing the voice in speech recordings.
This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications.
We propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as cost function.
arXiv Detail & Related papers (2022-10-13T13:12:42Z) - Differentially Private Speaker Anonymization [44.90119821614047]
Sharing real-world speech utterances is key to the training and deployment of voice-based services.
Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact.
We show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information.
arXiv Detail & Related papers (2022-02-23T23:20:30Z) - High Fidelity Speech Regeneration with Application to Speech Enhancement [96.34618212590301]
We propose a wav-to-wav generative model for speech that can generate 24khz speech in a real-time manner.
Inspired by voice conversion methods, we train to augment the speech characteristics while preserving the identity of the source.
arXiv Detail & Related papers (2021-01-31T10:54:27Z) - Speaker De-identification System using Autoencoders and Adversarial
Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increase the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z) - Design Choices for X-vector Based Speaker Anonymization [48.46018902334472]
We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge.
Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
arXiv Detail & Related papers (2020-05-18T11:32:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.