PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators
- URL: http://arxiv.org/abs/2304.07361v3
- Date: Tue, 7 Nov 2023 20:15:10 GMT
- Title: PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators
- Authors: Nils Lukas, Florian Kerschbaum
- Abstract summary: We propose Pivotal Tuning Watermarking (PTW), a method for watermarking pre-trained generators.
PTW can embed longer codes than existing methods while better preserving the generator's image quality.
We propose rigorous, game-based definitions for robustness and undetectability, and our study reveals that watermarking is not robust against an adaptive white-box attacker.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deepfakes refer to content synthesized using deep generators, which, when
misused, have the potential to erode trust in digital media. Synthesizing
high-quality deepfakes requires access to large and complex generators only a
few entities can train and provide. The threat comes from malicious users who
exploit access to the provided model and generate harmful deepfakes without
risking detection. Watermarking makes deepfakes detectable by embedding an identifiable
code into the generator that is later extractable from its generated images. We
propose Pivotal Tuning Watermarking (PTW), a method for watermarking
pre-trained generators (i) three orders of magnitude faster than watermarking
from scratch and (ii) without the need for any training data. We improve
existing watermarking methods and scale to generators $4 \times$ larger than
related work. PTW can embed longer codes than existing methods while better
preserving the generator's image quality. We propose rigorous, game-based
definitions for robustness and undetectability, and our study reveals that
watermarking is not robust against an adaptive white-box attacker who controls
the generator's parameters. We propose an adaptive attack that can successfully
remove any watermarking with access to only 200 non-watermarked images. Our
work challenges the trustworthiness of watermarking for deepfake detection when
the parameters of a generator are available. The source code to reproduce our
experiments is available at https://github.com/nilslukas/gan-watermark.
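The detection idea the abstract describes, recovering an embedded code from a generated image and comparing it against the owner's key, can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the 16-bit code, the bit-matching decision rule, and the false-positive rate are all assumptions for illustration.

```python
import math

def detection_threshold(n_bits: int, fpr: float) -> int:
    """Smallest number of matching bits t such that a random,
    non-watermarked code matches >= t bits with probability below fpr
    (binomial tail with p = 0.5 per bit)."""
    tail = 0.0
    for t in range(n_bits, -1, -1):
        tail += math.comb(n_bits, t) * 0.5 ** n_bits
        if tail > fpr:
            return t + 1
    return 0

def is_watermarked(extracted: str, key: str, fpr: float = 1e-3) -> bool:
    """Declare the image watermarked if the extracted code matches the
    owner's key in at least `detection_threshold` bit positions."""
    matches = sum(a == b for a, b in zip(extracted, key))
    return matches >= detection_threshold(len(key), fpr)

key = "1011001110100101"  # hypothetical 16-bit owner code
```

For a 16-bit code at a false-positive rate of 0.001, this rule requires 15 of 16 bits to match, which is why longer codes (as PTW supports) make accidental matches exponentially less likely.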
Related papers
- On the Learnability of Watermarks for Language Models [80.97358663708592]
We ask whether language models can directly learn to generate watermarked text.
We propose watermark distillation, which trains a student model to behave like a teacher model.
We find that models can learn to generate watermarked text with high detectability.
arXiv Detail & Related papers (2023-12-07T17:41:44Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, allowing the model owner to watermark the model.
We propose a mini-max formulation to find watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection.
We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
- Invisible Image Watermarks Are Provably Removable Using Generative AI [47.25747266531665]
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners.
We propose a family of regeneration attacks to remove these invisible watermarks.
The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
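This two-step attack (noise injection, then reconstruction) can be illustrated on a toy, flattened pixel list. The sketch below is only a stand-in: local averaging replaces the learned generative reconstruction the paper uses, and the pixel values and window size are assumptions for illustration.

```python
import random

def regeneration_attack(pixels, noise_std=0.1, window=3, seed=0):
    """Toy regeneration attack: (1) add random noise to destroy a fragile
    watermark, (2) 'reconstruct' the image, here via local averaging as a
    stand-in for a learned generative reconstruction."""
    rng = random.Random(seed)
    noisy = [p + rng.gauss(0.0, noise_std) for p in pixels]
    half = window // 2
    out = []
    for i in range(len(noisy)):
        nbhd = noisy[max(0, i - half): i + half + 1]
        out.append(sum(nbhd) / len(nbhd))
    return out

# A high-frequency additive watermark is strongly attenuated even with the
# noise step disabled (noise_std=0.0 keeps the example deterministic).
wm = [1 if i % 2 == 0 else -1 for i in range(8)]
image = [0.5 + 0.1 * w for w in wm]
attacked = regeneration_attack(image, noise_std=0.0)
corr_before = sum((p - 0.5) * w for p, w in zip(image, wm))
corr_after = sum((p - 0.5) * w for p, w in zip(attacked, wm))
```

The correlation between the reconstructed image and the watermark pattern drops well below its pre-attack value, which is the mechanism the attack exploits.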
arXiv Detail & Related papers (2023-06-02T23:29:28Z)
- Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust [55.91987293510401]
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content.
We introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs.
Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed.
arXiv Detail & Related papers (2023-05-31T17:00:31Z)
- Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated code.
Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
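The selective rule behind entropy thresholding can be sketched as follows: the watermark bias is applied only at generation steps where the model's next-token distribution is high-entropy, leaving near-deterministic code tokens (e.g. mandatory syntax) untouched. The threshold value below is an assumption for illustration, not the paper's setting.

```python
import math

def shannon_entropy(probs):
    """Entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def should_watermark(probs, threshold=0.7):
    """SWEET-style selective rule (sketch): only bias sampling toward the
    watermark token list when the step is high-entropy enough, so
    functionally forced tokens keep their original distribution."""
    return shannon_entropy(probs) > threshold
```

For example, a near-certain step like `[0.99, 0.01]` is skipped, while a uniform four-way choice (entropy 2.0 bits) is watermarked; this is how code quality is preserved while detectability is retained.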
arXiv Detail & Related papers (2023-05-24T11:49:52Z) - Supervised GAN Watermarking for Intellectual Property Protection [33.827150843939094]
We propose a watermarking method for Generative Adversarial Networks (GANs).
The aim is to watermark the GAN model so that any image generated by the GAN contains an invisible watermark (signature).
Results show that our method can effectively embed an invisible watermark inside the generated images.
arXiv Detail & Related papers (2022-09-07T20:52:05Z) - Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation
with Secret Key [15.483078145498085]
The proposed method embeds a watermark pattern in a model by using learnable transformed images.
It is piracy-resistant, so the original watermark cannot be overwritten by a pirated watermark.
The results show that it was resilient against fine-tuning and pruning attacks while maintaining a high watermark-detection accuracy.
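The key-dependence of a block-wise image transformation can be illustrated with a minimal sketch. Note the paper uses *learnable* transformed images; the fixed key-seeded block permutation below only illustrates how a secret key determines the transform, and every name and parameter here is hypothetical.

```python
import random

def blockwise_transform(pixels, block=2, key=42):
    """Sketch of a key-dependent block-wise transformation: split a flat
    pixel list into fixed-size blocks and permute them with a shuffle
    seeded by the secret key. Without the key, the permutation (and hence
    the expected watermark behavior) cannot be reproduced."""
    blocks = [pixels[i:i + block] for i in range(0, len(pixels), block)]
    order = list(range(len(blocks)))
    random.Random(key).shuffle(order)
    return [p for j in order for p in blocks[j]]
```

The transform is deterministic for a given key, so the owner can regenerate the exact training-time inputs for verification, while the pixel content itself is preserved.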
arXiv Detail & Related papers (2021-04-09T08:21:53Z) - Watermark Faker: Towards Forgery of Digital Image Watermarking [10.14145437847397]
We make the first attempt to develop digital image watermark fakers by using generative adversarial learning.
Our experiments show that the proposed watermark faker can effectively crack digital image watermarkers in both spatial and frequency domains.
arXiv Detail & Related papers (2021-03-23T12:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.