Watermarking Training Data of Music Generation Models
- URL: http://arxiv.org/abs/2412.08549v2
- Date: Thu, 12 Dec 2024 10:49:10 GMT
- Title: Watermarking Training Data of Music Generation Models
- Authors: Pascal Epple, Igor Shilov, Bozhidar Stevanoski, Yves-Alexandre de Montjoye
- Abstract summary: We investigate whether audio watermarking techniques can be used to detect unauthorized usage of content.
We compare outputs generated by a model trained on watermarked data to a model trained on non-watermarked data.
Our results show that audio watermarking techniques, including some that are imperceptible to humans, can lead to noticeable shifts in the model's outputs.
- Score: 6.902279764206365
- Abstract: Generative Artificial Intelligence (Gen-AI) models are increasingly used to produce content across domains, including text, images, and audio. While these models represent a major technical breakthrough, they gain their generative capabilities from being trained on enormous amounts of human-generated content, which often includes copyrighted material. In this work, we investigate whether audio watermarking techniques can be used to detect the unauthorized use of content in training a music generation model. We compare outputs generated by a model trained on watermarked data to a model trained on non-watermarked data. We study factors that impact the model's generation behaviour: the watermarking technique, the proportion of watermarked samples in the training set, and the robustness of the watermarking technique against the model's tokenizer. Our results show that audio watermarking techniques, including some that are imperceptible to humans, can lead to noticeable shifts in the model's outputs. We also study the robustness of a state-of-the-art watermarking technique against watermark-removal techniques.
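To make the experimental setup concrete, here is a minimal numpy sketch of the underlying idea: embed a key-derived, low-amplitude spread-spectrum pattern into a proportion `p` of the training audio, then test the trained model's generations for correlation with that pattern. The function names, the amplitude `alpha`, and the correlation detector are illustrative assumptions; the paper evaluates real audio watermarking techniques, not this toy scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_watermark(audio: np.ndarray, key: int, alpha: float = 0.002) -> np.ndarray:
    """Add a key-derived pseudorandom pattern at low (imperceptible) amplitude."""
    pattern = np.random.default_rng(key).standard_normal(audio.shape)
    return audio + alpha * pattern

def detect_score(audio: np.ndarray, key: int) -> float:
    """Normalized correlation between the audio and the key's pattern."""
    pattern = np.random.default_rng(key).standard_normal(audio.shape)
    return float(audio @ pattern / (np.linalg.norm(audio) * np.linalg.norm(pattern) + 1e-12))

# Watermark a proportion `p` of the training set before training the model.
KEY, p = 42, 0.5
train_set = [rng.standard_normal(16_000) for _ in range(100)]  # stand-ins for audio clips
train_set = [embed_watermark(x, KEY) if rng.random() < p else x for x in train_set]

# After training, score the model's generations. A shift in the score
# distribution relative to a model trained on clean data is evidence that
# the watermarked content was used for training. Note that a lossy audio
# tokenizer may attenuate the pattern, which is why the paper studies the
# watermark's robustness against the model's tokenizer.
# scores = [detect_score(generate(), KEY) for _ in range(1_000)]
```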
Related papers
- Dynamic watermarks in images generated by diffusion models [46.1135899490656]
High-fidelity text-to-image diffusion models have revolutionized visual content generation, but their widespread use raises significant ethical concerns.
We propose a novel multi-stage watermarking framework for diffusion models, designed to establish copyright and trace generated images back to their source.
Our work advances the field of AI-generated content security by providing a scalable solution for model ownership verification and misuse prevention.
arXiv Detail & Related papers (2025-02-13T03:23:17Z)
- Image Watermarking of Generative Diffusion Models [42.982489491857145]
We propose a watermarking technique that embeds watermark features into the diffusion model itself.
Our technique enables training of a paired watermark extractor for a generative model that is learned through an end-to-end process.
We demonstrate highly accurate watermark embedding/detection and show that it is also possible to distinguish between different watermarks embedded with our method to differentiate between generative models.
arXiv Detail & Related papers (2025-02-12T09:00:48Z)
- Watermarking across Modalities for Content Tracing and Generative AI [2.456311843339488]
This thesis includes the development of new watermarking techniques for images, audio, and text.
We first introduce methods for active moderation of images on social platforms.
We then develop specific techniques for AI-generated content.
arXiv Detail & Related papers (2025-02-04T18:49:50Z)
- SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models [77.80595722480074]
SleeperMark is a novel framework designed to embed resilient watermarks into T2I diffusion models.
It guides the model to disentangle the watermark information from the semantic concepts it learns, allowing the model to retain the embedded watermark even after fine-tuning.
Our experiments demonstrate the effectiveness of SleeperMark across various types of diffusion models.
arXiv Detail & Related papers (2024-12-06T08:44:18Z)
- How to Trace Latent Generative Model Generated Images without Artificial Watermark? [88.04880564539836]
Concerns have arisen regarding potential misuse of images generated by latent generative models.
We propose a latent-inversion-based method called LatentTracer to trace the generated images of the inspected model.
Our experiments show that our method can distinguish images generated by the inspected model from other images with high accuracy and efficiency.
arXiv Detail & Related papers (2024-05-22T05:33:47Z)
- ProMark: Proactive Diffusion Watermarking for Causal Attribution [25.773438257321793]
We propose ProMark, a causal attribution technique to attribute a synthetically generated image to its training data concepts.
The concept information is proactively embedded into the input training images using imperceptible watermarks.
We show that we can embed as many as $2^{16}$ unique watermarks into the training data, and each training image can contain more than one watermark.
arXiv Detail & Related papers (2024-03-14T23:16:43Z)
- On the Learnability of Watermarks for Language Models [80.97358663708592]
We ask whether language models can directly learn to generate watermarked text.
We propose watermark distillation, which trains a student model to behave like a teacher model that uses decoding-based watermarking.
We find that models can learn to generate watermarked text with high detectability.
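As a rough illustration of what such a distillation objective can look like, the sketch below pairs a Kirchenbauer-style green-list decoding watermark on the teacher with a KL distillation loss for the student. The constants (VOCAB, DELTA, GAMMA) and helper names are assumptions for illustration, not that paper's implementation.

```python
import torch
import torch.nn.functional as F

VOCAB, DELTA, GAMMA = 32_000, 2.0, 0.5  # vocab size, green-list bias, green fraction

def green_mask(prev_token: int) -> torch.Tensor:
    """Pseudorandomly mark a GAMMA fraction of the vocab 'green', seeded by the previous token."""
    g = torch.Generator().manual_seed(prev_token)
    return torch.rand(VOCAB, generator=g) < GAMMA

def watermarked_logits(teacher_logits: torch.Tensor, prev_token: int) -> torch.Tensor:
    """Decoding-based watermark: bias the logits of green tokens upward by DELTA."""
    return teacher_logits + DELTA * green_mask(prev_token).float()

def distill_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
                 prev_tokens: torch.Tensor) -> torch.Tensor:
    """KL divergence from the teacher's *watermarked* next-token distribution to the student's."""
    wm = torch.stack([watermarked_logits(t, int(p))
                      for t, p in zip(teacher_logits, prev_tokens)])
    return F.kl_div(F.log_softmax(student_logits, -1),
                    F.softmax(wm, -1), reduction="batchmean")
```

Training on such a loss bakes the decoding-time watermark into the student's weights, which is why the student's unconstrained samples can remain detectable.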
arXiv Detail & Related papers (2023-12-07T17:41:44Z)
- ClearMark: Intuitive and Robust Model Watermarking via Transposed Model Training [50.77001916246691]
This paper introduces ClearMark, the first DNN watermarking method designed for intuitive human assessment.
ClearMark embeds visible watermarks, enabling human decision-making without rigid value thresholds.
It achieves an 8,544-bit watermark capacity, comparable to the strongest existing work.
arXiv Detail & Related papers (2023-10-25T08:16:55Z)
- Invisible Watermarking for Audio Generation Diffusion Models [11.901028740065662]
This paper presents the first watermarking technique applied to audio diffusion models trained on mel-spectrograms.
Our model not only excels at benign audio generation but also incorporates an invisible watermarking trigger mechanism for model verification.
arXiv Detail & Related papers (2023-09-22T20:10:46Z)
- Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust [55.91987293510401]
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content.
We introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs.
Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed.
arXiv Detail & Related papers (2023-05-31T17:00:31Z)
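To close, the Tree-Ring idea from the last entry can be sketched in a few lines: write a key into a ring of Fourier coefficients of the initial diffusion noise, and later check that ring after recovering the noise (the actual method recovers it via DDIM inversion of the suspect image). The radius, key value, and threshold below are arbitrary assumptions, and no diffusion model is involved in this toy.

```python
import torch

KEY_VALUE, RADIUS = 5.0, 10  # arbitrary choices for illustration

def embed_tree_ring(noise: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Write KEY_VALUE into a ring of low-frequency coefficients of the noise."""
    h, w = noise.shape
    freq = torch.fft.fftshift(torch.fft.fft2(noise))
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    dist = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2).float().sqrt()
    ring = (dist - RADIUS).abs() < 1.0  # mask of coefficients near the ring
    freq[ring] = KEY_VALUE              # the ring is symmetric, so the noise stays real
    return torch.fft.ifft2(torch.fft.ifftshift(freq)).real, ring

def detect(noise_estimate: torch.Tensor, ring: torch.Tensor, thresh: float = 1.0) -> bool:
    """Compare the ring region of the recovered noise's spectrum to the key."""
    freq = torch.fft.fftshift(torch.fft.fft2(noise_estimate))
    return (freq[ring] - KEY_VALUE).abs().mean().item() < thresh

wm_noise, ring = embed_tree_ring(torch.randn(64, 64))
# `wm_noise` would seed the diffusion sampler; at detection time, DDIM
# inversion of a suspect image yields the `noise_estimate` passed to detect().
print(detect(wm_noise, ring))  # True: the ring survives the FFT round trip
```

Because the pattern lives in the frequency domain of the initial noise rather than in pixel values, it tends to survive pixel-space perturbations better than post-hoc watermarks, which is the robustness claim of the entry above.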
This list is automatically generated from the titles and abstracts of the papers on this site.