Self Voice Conversion as an Attack against Neural Audio Watermarking
- URL: http://arxiv.org/abs/2601.20432v1
- Date: Wed, 28 Jan 2026 09:41:18 GMT
- Title: Self Voice Conversion as an Attack against Neural Audio Watermarking
- Authors: Yigitcan Özer, Wanying Ge, Zhe Zhang, Xin Wang, Junichi Yamagishi
- Abstract summary: We investigate self voice conversion as a universal, content-preserving attack against audio watermarking systems. We demonstrate that this attack severely degrades the reliability of state-of-the-art watermarking approaches.
- Score: 34.948149764638806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Audio watermarking embeds auxiliary information into speech while maintaining speaker identity, linguistic content, and perceptual quality. Although recent advances in neural and digital signal processing-based watermarking methods have improved imperceptibility and embedding capacity, robustness is still primarily assessed against conventional distortions such as compression, additive noise, and resampling. However, the rise of deep learning-based attacks introduces novel and significant threats to watermark security. In this work, we investigate self voice conversion as a universal, content-preserving attack against audio watermarking systems. Self voice conversion remaps a speaker's voice to the same identity while altering acoustic characteristics through a voice conversion model. We demonstrate that this attack severely degrades the reliability of state-of-the-art watermarking approaches and highlight its implications for the security of modern audio watermarking techniques.
Related papers
- Latent-Mark: An Audio Watermark Robust to Neural Resynthesis [62.09761127079914]
Latent-Mark is the first zero-bit audio watermarking framework designed to survive semantic compression. Our key insight is that robustness to the encode-decode process requires embedding the watermark within the invariant latent space. Our work inspires future research into universal watermarking frameworks capable of maintaining integrity across increasingly complex and diverse generative distortions.
arXiv Detail & Related papers (2026-03-05T15:51:09Z)
- AWARE: Audio Watermarking with Adversarial Resistance to Edits [0.0]
AWARE (Audio Watermarking with Adversarial Resistance to Edits) is an approach that avoids reliance on attack-simulation stacks and handcrafted differentiable distortions. Embedding is obtained via adversarial optimization in the time-frequency domain under a level-proportional budget. AWARE attains high audio quality and speech intelligibility (PESQ/STOI) and a consistently low bit error rate (BER) across various audio edits.
arXiv Detail & Related papers (2025-10-20T13:10:52Z)
- Yours or Mine? Overwriting Attacks against Neural Audio Watermarking [21.297468818273064]
We develop a simple yet powerful attack that overwrites the legitimate audio watermark with a forged one. Based on the audio watermarking information that the adversary has, we propose three categories of overwriting attacks. Experimental results demonstrate that the proposed overwriting attacks can effectively compromise existing watermarking schemes.
arXiv Detail & Related papers (2025-09-06T21:23:44Z)
- Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion [53.26424100244925]
Expressive voice conversion aims to transfer both speaker identity and expressive attributes from a target speech to a given source speech. In this work, we improve over a self-supervised, non-autoregressive framework with a conditional variational autoencoder.
arXiv Detail & Related papers (2025-06-04T14:42:12Z)
- XAttnMark: Learning Robust Audio Watermarking with Cross-Attention [15.216472445154064]
This paper introduces Cross-Attention Robust Audio Watermark (XAttnMark), which bridges the gap by leveraging partial parameter sharing between the generator and the detector. We propose a psychoacoustic-aligned temporal-frequency masking loss that captures fine-grained auditory masking effects, enhancing watermark imperceptibility.
arXiv Detail & Related papers (2025-02-06T17:15:08Z)
- AudioMarkBench: Benchmarking Robustness of Audio Watermarking [38.25450275151647]
We present AudioMarkBench, the first systematic benchmark for evaluating the robustness of audio watermarking against watermark removal and watermark forgery.
Our findings highlight the vulnerabilities of current watermarking techniques and emphasize the need for more robust and fair audio watermarking solutions.
arXiv Detail & Related papers (2024-06-11T06:18:29Z)
- Proactive Detection of Voice Cloning with Localized Watermarking [50.13539630769929]
We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech.
AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level.
AudioSeal achieves state-of-the-art performance in terms of robustness to real-life audio manipulations and imperceptibility based on automatic and human evaluation metrics.
arXiv Detail & Related papers (2024-01-30T18:56:22Z)
- WavMark: Watermarking for Audio Generation [70.65175179548208]
This paper introduces an innovative audio watermarking framework that encodes up to 32 bits of watermark within a mere 1-second audio snippet.
The watermark is imperceptible to human senses and exhibits strong resilience against various attacks.
It can serve as an effective identifier for synthesized voices and holds potential for broader applications in audio copyright protection.
arXiv Detail & Related papers (2023-08-24T13:17:35Z)
- High Fidelity Speech Regeneration with Application to Speech Enhancement [96.34618212590301]
We propose a wav-to-wav generative model for speech that can generate 24 kHz speech in real time.
Inspired by voice conversion methods, we train to augment the speech characteristics while preserving the identity of the source.
arXiv Detail & Related papers (2021-01-31T10:54:27Z)