Certified Neural Network Watermarks with Randomized Smoothing
- URL: http://arxiv.org/abs/2207.07972v1
- Date: Sat, 16 Jul 2022 16:06:59 GMT
- Title: Certified Neural Network Watermarks with Randomized Smoothing
- Authors: Arpit Bansal, Ping-yeh Chiang, Michael Curry, Rajiv Jain, Curtis
Wigington, Varun Manjunatha, John P Dickerson, Tom Goldstein
- Abstract summary: We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Watermarking is a commonly used strategy to protect creators' rights to
digital images, videos and audio. Recently, watermarking methods have been
extended to deep learning models -- in principle, the watermark should be
preserved when an adversary tries to copy the model. However, in practice,
watermarks can often be removed by an intelligent adversary. Several papers
have proposed watermarking methods that claim to be empirically resistant to
different types of removal attacks, but these new techniques often fail in the
face of new or better-tuned adversaries. In this paper, we propose a
certifiable watermarking method. Using the randomized smoothing technique
proposed in Chiang et al., we show that our watermark is guaranteed to be
unremovable unless the model parameters are changed by more than a certain l2
threshold. In addition to being certifiable, our watermark is also empirically
more robust compared to previous watermarking methods. Our experiments can be
reproduced with code at https://github.com/arpitbansal297/Certified_Watermarks
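The certification idea can be illustrated with a minimal sketch: evaluate the watermark (trigger-set) accuracy not of the model itself, but of a "smoothed" model whose parameters are perturbed by Gaussian noise. Randomized smoothing then guarantees the smoothed accuracy cannot change much unless the parameters move by more than an l2 radius. The toy linear model and all names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def watermark_accuracy(params, trigger_x, trigger_y):
    # Hypothetical toy model: classify by the sign of a linear score.
    preds = np.sign(trigger_x @ params)
    return float(np.mean(preds == trigger_y))

def smoothed_watermark_accuracy(params, trigger_x, trigger_y,
                                sigma=0.05, n_samples=200, seed=0):
    """Monte-Carlo estimate of the expected trigger-set accuracy under
    Gaussian parameter noise N(0, sigma^2 I). Smoothing over parameter
    space (following the randomized-smoothing idea) is what makes an
    l2-radius certificate possible; the bound itself is omitted here."""
    rng = np.random.default_rng(seed)
    accs = [watermark_accuracy(params + rng.normal(0.0, sigma, params.shape),
                               trigger_x, trigger_y)
            for _ in range(n_samples)]
    return float(np.mean(accs))
```

With a large margin on the trigger set, the smoothed accuracy stays near 1 even under parameter noise, which is the quantity the certificate bounds.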
Related papers
- An undetectable watermark for generative image models
We present the first undetectable watermarking scheme for generative image models.
In particular, an undetectable watermark does not degrade image quality under any efficiently computable metric.
Our scheme works by selecting the initial latents of a diffusion model using a pseudorandom error-correcting code.
arXiv Detail & Related papers (2024-10-09T18:33:06Z)
- Certifiably Robust Image Watermark
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
- Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?
Steganalysis attacks can extract and remove the watermark with minimal perceptual distortion.
We show how averaging a collection of watermarked images could reveal the underlying watermark pattern.
We propose security guidelines calling for using content-adaptive watermarking strategies and performing security evaluation against steganalysis.
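The averaging attack mentioned above is easy to demonstrate: if the same additive pattern is embedded in many different images whose content is roughly zero-mean, the content averages away while the shared pattern survives. This is a simplified illustration under that additive, content-independent assumption, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
watermark = rng.normal(0.0, 1.0, (8, 8))        # fixed (secret) additive pattern
clean = rng.normal(0.0, 10.0, (1000, 8, 8))     # 1000 distinct zero-mean cover images
watermarked = clean + watermark                 # same pattern embedded in each

# Averaging N watermarked images shrinks the content term by ~1/sqrt(N)
# while leaving the shared watermark intact, exposing the pattern.
estimate = watermarked.mean(axis=0)
```

The recovered `estimate` correlates strongly with the true pattern, which is why the guidelines call for content-adaptive (image-dependent) watermarks.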
arXiv Detail & Related papers (2024-06-13T12:01:28Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability
Backdoor-based ownership verification has recently become popular; it allows the model owner to watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Invisible Image Watermarks Are Provably Removable Using Generative AI
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners.
We propose a family of regeneration attacks to remove these invisible watermarks.
The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
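The two-step attack described above can be sketched in a few lines. The paper reconstructs the noised image with a generative model; here a simple 3x3 box blur stands in as a toy "reconstructor" (an assumption for illustration, not the authors' method):

```python
import numpy as np

def regeneration_attack(img, sigma=0.5, rng=None):
    """Simplified regeneration-attack sketch: (1) add random noise strong
    enough to drown out an invisible watermark, then (2) reconstruct the
    image. A 3x3 mean filter replaces the generative reconstruction step."""
    rng = rng or np.random.default_rng(0)
    noisy = img + rng.normal(0.0, sigma, img.shape)
    # Toy reconstruction: 3x3 box blur with edge-replication padding.
    padded = np.pad(noisy, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0
```

Even this crude reconstruction strongly attenuates a high-frequency additive pattern, which conveys why noise-then-regenerate removes invisible watermarks so effectively.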
arXiv Detail & Related papers (2023-06-02T23:29:28Z)
- Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation with Secret Key
The proposed method embeds a watermark pattern in a model by using learnable transformed images.
It is piracy-resistant, so the original watermark cannot be overwritten by a pirated watermark.
The results show that it was resilient against fine-tuning and pruning attacks while maintaining a high watermark-detection accuracy.
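One simple form a key-dependent block-wise image transformation can take is shuffling fixed-size blocks with a permutation seeded by the secret key; without the key, the transform cannot be reproduced. This is a hypothetical sketch of that general idea, not the paper's learnable transform:

```python
import numpy as np

def blockwise_transform(img, key, block=4):
    """Shuffle fixed-size blocks of a 2-D image using a permutation
    derived from a secret integer key (illustrative stand-in for a
    key-dependent block-wise transform)."""
    h, w = img.shape
    blocks = [img[r:r + block, c:c + block]
              for r in range(0, h, block)
              for c in range(0, w, block)]
    perm = np.random.default_rng(key).permutation(len(blocks))
    out = np.zeros_like(img)
    i = 0
    for r in range(0, h, block):
        for c in range(0, w, block):
            out[r:r + block, c:c + block] = blocks[perm[i]]
            i += 1
    return out
```

The transform is deterministic given the key and merely rearranges pixels, so the key holder can invert it exactly.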
arXiv Detail & Related papers (2021-04-09T08:21:53Z)
- Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.