Improving Security in McAdams Coefficient-Based Speaker Anonymization by
Watermarking Method
- URL: http://arxiv.org/abs/2107.07223v1
- Date: Thu, 15 Jul 2021 09:56:08 GMT
- Title: Improving Security in McAdams Coefficient-Based Speaker Anonymization by
Watermarking Method
- Authors: Candy Olivia Mawalim and Masashi Unoki
- Abstract summary: We propose a method to improve the security for speaker anonymization based on the McAdams coefficient.
The proposed method consists of two main processes: one for embedding and one for detection.
- Score: 8.684378639046642
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Speaker anonymization aims to suppress speaker individuality to protect
privacy in speech while preserving the other aspects, such as speech content.
One effective solution for anonymization is to modify the McAdams coefficient.
In this work, we propose a method to improve the security for speaker
anonymization based on the McAdams coefficient by using a speech watermarking
approach. The proposed method consists of two main processes: one for embedding
and one for detection. In the embedding process, two different McAdams coefficients
represent the binary bits "0" and "1". The watermarked speech is then obtained by
frame-by-frame bit inverse switching. Subsequently, the detection process is
carried out by a power spectrum comparison. We conducted objective evaluations
of the anonymization with reference to the VoicePrivacy 2020 Challenge (VP2020) and of the speech
watermarking with reference to the Information Hiding Challenge (IHC) and found
that our method could satisfy the blind detection, inaudibility, and robustness
requirements in watermarking. It also significantly improved the anonymization
performance in comparison to the secondary baseline system in VP2020.
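The embedding and detection idea in the abstract can be illustrated with a minimal sketch. It assumes the usual McAdams transformation, in which the angle of each complex LPC pole is raised to the power of the coefficient while its radius is kept. The coefficient values, the function names, and the toy detector are hypothetical and are not taken from the paper. In particular, this detector compares against candidates built from the original poles, so it is not the blind detector the abstract describes, and the "bit inverse switching" step is not modeled.

```python
import cmath
import math

# Hypothetical coefficients for bits 0 and 1 (assumed values, not from the paper).
ALPHA_FOR_BIT = {0: 0.85, 1: 0.80}


def mcadams_shift(poles, alpha):
    """Raise the angle of each complex pole to the power alpha, keeping its radius."""
    shifted = []
    for p in poles:
        if abs(p.imag) < 1e-12:
            shifted.append(p)  # real poles are left unchanged
            continue
        radius, angle = abs(p), cmath.phase(p)
        # Preserve the sign so conjugate pairs stay conjugate.
        new_angle = math.copysign(abs(angle) ** alpha, angle)
        shifted.append(cmath.rect(radius, new_angle))
    return shifted


def envelope_power(poles, n_bins=64):
    """Power spectrum of the all-pole filter 1 / prod(1 - p z^-1) on [0, pi)."""
    spectrum = []
    for k in range(n_bins):
        z_inv = cmath.exp(-1j * math.pi * k / n_bins)
        denom = 1.0 + 0j
        for p in poles:
            denom *= 1 - p * z_inv
        spectrum.append(1.0 / abs(denom) ** 2)
    return spectrum


def embed(frames_poles, bits):
    """Apply the coefficient selected by each bit to the matching frame's poles."""
    return [mcadams_shift(poles, ALPHA_FOR_BIT[b])
            for poles, b in zip(frames_poles, bits)]


def detect(frame_poles, observed_poles):
    """Toy, non-blind detector: pick the bit whose candidate log-spectrum is closest."""
    observed = envelope_power(observed_poles)

    def distance(bit):
        candidate = envelope_power(mcadams_shift(frame_poles, ALPHA_FOR_BIT[bit]))
        return sum((math.log(o) - math.log(c)) ** 2
                   for o, c in zip(observed, candidate))

    return min((0, 1), key=distance)
```

A real system would obtain the poles from per-frame LPC analysis of the speech and resynthesize each frame from the modified filter and the LPC residual. Both steps are omitted here to keep the sketch dependency-free.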
Related papers
- A Benchmark for Multi-speaker Anonymization [9.990701310620368]
We present an attempt to provide a multi-speaker anonymization benchmark for real-world applications.
A cascaded system uses speaker diarization to aggregate the speech of each speaker and speaker anonymization to conceal speaker privacy and preserve speech content.
Experiments conducted on both non-overlap simulated and real-world datasets demonstrate the effectiveness of the multi-speaker anonymization system.
arXiv Detail & Related papers (2024-07-08T04:48:43Z)
- Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding [46.25816642820348]
We focus on altering the voice attributes against machine recognition while retaining human perception.
A speech generation framework incorporating a speaker disentanglement mechanism is employed to generate the anonymized speech.
Experiments conducted on the LibriSpeech dataset showed that the speaker attributes were obscured with their human perception preserved for 60.71% of the processed utterances.
arXiv Detail & Related papers (2024-06-12T13:33:24Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [72.10931780019297]
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.
We propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH).
Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also is better at preserving the quality of generation.
arXiv Detail & Related papers (2023-10-06T03:33:42Z)
- Diff-Privacy: Diffusion-based Face Privacy Protection [58.1021066224765]
In this paper, we propose a novel face privacy protection method based on diffusion models, dubbed Diff-Privacy.
Specifically, we train our proposed multi-scale image inversion module (MSI) to obtain a set of SDM format conditional embeddings of the original image.
Based on the conditional embeddings, we design corresponding embedding scheduling strategies and construct different energy functions during the denoising process to achieve anonymization and visual identity information hiding.
arXiv Detail & Related papers (2023-09-11T09:26:07Z)
- WavMark: Watermarking for Audio Generation [70.65175179548208]
This paper introduces an innovative audio watermarking framework that encodes up to 32 bits of watermark within a mere 1-second audio snippet.
The watermark is imperceptible to human senses and exhibits strong resilience against various attacks.
It can serve as an effective identifier for synthesized voices and holds potential for broader applications in audio copyright protection.
arXiv Detail & Related papers (2023-08-24T13:17:35Z)
- Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques [1.2691047660244337]
The growing use of voice user interfaces has led to a surge in the collection and storage of speech data.
This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization.
arXiv Detail & Related papers (2023-08-05T16:14:17Z)
- Symmetric Saliency-based Adversarial Attack To Speaker Identification [17.087523686496958]
We propose a novel generation-network-based approach, called symmetric saliency-based encoder-decoder (SSED).
First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system.
Second, it proposes an angular loss function to push the speaker embedding far away from the source speaker.
arXiv Detail & Related papers (2022-10-30T08:54:02Z)
- Speaker anonymisation using the McAdams coefficient [19.168733328810962]
This paper reports an approach to anonymisation that, unlike other current approaches, requires no training data.
The proposed solution uses the McAdams coefficient to transform the spectral envelope of speech signals.
Results show that random, optimised transformations can outperform competing solutions in terms of anonymisation.
arXiv Detail & Related papers (2020-11-02T17:07:17Z)
- Design Choices for X-vector Based Speaker Anonymization [48.46018902334472]
We present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge.
Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
arXiv Detail & Related papers (2020-05-18T11:32:14Z)