Related papers: A Transfer Attack to Image Watermarks

URL: http://arxiv.org/abs/2403.15365v3
Date: Thu, 12 Sep 2024 17:49:05 GMT
Title: A Transfer Attack to Image Watermarks
Authors: Yuepeng Hu, Zhengyuan Jiang, Moyang Guo, Neil Gong
Abstract summary: We propose a new transfer evasion attack on image watermarks in the no-box setting. Our major contribution is to show, both theoretically and empirically, that watermark-based AI-generated image detectors are not robust to evasion attacks.
Score: 1.656188668325832
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Watermarking has been widely deployed by industry to detect AI-generated images. The robustness of such watermark-based detectors against evasion attacks in the white-box and black-box settings is well understood in the literature. However, their robustness in the no-box setting is much less understood. In this work, we propose a new transfer evasion attack on image watermarks in the no-box setting. Our transfer attack adds a perturbation to a watermarked image to evade multiple surrogate watermarking models trained by the attacker itself, and the perturbed watermarked image also evades the target watermarking model. Our major contribution is to show, both theoretically and empirically, that watermark-based AI-generated image detectors are not robust to evasion attacks even if the attacker has access to neither the watermarking model nor the detection API.
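
Illustrative sketch (not the authors' implementation): the abstract describes a perturbation optimized against several attacker-trained surrogate models. The toy below assumes linear bit decoders, a hinge loss, an L-infinity budget, and surrogates that share a common component with the target. All of these, along with the names decode_bits, transfer_attack, eps, lr and margin, are hypothetical choices made only to show the shape of such an ensemble attack.

```python
import numpy as np

def decode_bits(weights, image):
    """Toy linear decoder (an assumption): one decoded bit per row of `weights`."""
    return (weights @ (image - 0.5) > 0).astype(int)

def transfer_attack(image, surrogates, eps=0.3, lr=1e-3, steps=300, margin=1.0):
    """Find one perturbation that flips the bits decoded by every surrogate."""
    centered = image - 0.5
    # Desired logit signs: the opposite of what each surrogate decodes now.
    targets = [-np.sign(w @ centered) for w in surrogates]
    delta = np.zeros_like(image)
    for _ in range(steps):
        grad = np.zeros_like(image)
        for w, t in zip(surrogates, targets):
            logits = w @ (centered + delta)
            active = (t * logits) < margin   # bits not yet flipped with margin
            grad -= (t * active) @ w         # gradient of the summed hinge loss
        # Gradient step, then projection onto the L-infinity budget.
        # The toy does not clip image + delta back to the valid pixel range.
        delta = np.clip(delta - lr * grad, -eps, eps)
    return delta

rng = np.random.default_rng(0)
d, k = 256, 8                                # toy image size and message length
base = rng.normal(size=(k, d))               # shared component (assumption)
surrogates = [base + 0.5 * rng.normal(size=(k, d)) for _ in range(5)]
target = base + 0.5 * rng.normal(size=(k, d))  # never seen by the attack
image = rng.uniform(size=d)                  # stands in for a watermarked image

delta = transfer_attack(image, surrogates)

def bit_accuracy(weights):
    """Fraction of decoded bits left unchanged by the perturbation."""
    return np.mean(decode_bits(weights, image) == decode_bits(weights, image + delta))

surrogate_acc = [bit_accuracy(w) for w in surrogates]
target_acc = bit_accuracy(target)
print("surrogate bit accuracy after attack:", surrogate_acc)
print("target bit accuracy after attack:", target_acc)
```

In this toy, any drop in the target's bit accuracy comes entirely from the assumed shared component between surrogates and target. It says nothing about how well the attack transfers to real watermarking models, which is what the paper analyzes.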

Related papers

Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models [42.902365202924535]
We investigate watermark forging in the context of widely used post-hoc image watermarking. We introduce a preference model to assess whether an image is watermarked. We demonstrate the model's capability to remove and forge watermarks by optimizing the input image through backpropagation.
arXiv Detail & Related papers (2025-10-23T12:06:35Z)
Invisible Watermarks: Attacks and Robustness [0.3495246564946556]
We introduce novel improvements to watermarking robustness and minimize degradation on image quality during attack. We propose a custom watermark remover network which preserves one of the watermarking modalities while completely removing the other during decoding. Our evaluation suggests that 1) implementing the watermark remover model to preserve one of the watermark modalities when decoding the other modality slightly improves on the baseline performance, and that 2) LBA degrades the image significantly less compared to uniform blurring of the entire image.
arXiv Detail & Related papers (2024-12-17T03:50:13Z)
Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models [16.57738116313139]
We show that attackers can leverage unrelated models, even with different latent spaces and architectures, to perform powerful and realistic forgery attacks. The first imprints a targeted watermark into real images by manipulating the latent representation of an arbitrary image in an unrelated LDM. The second attack generates new images with the target watermark by inverting a watermarked image and re-generating it with an arbitrary prompt.
arXiv Detail & Related papers (2024-12-04T12:57:17Z)
An undetectable watermark for generative image models [65.31658824274894]
We present the first undetectable watermarking scheme for generative image models. In particular, an undetectable watermark does not degrade image quality under any efficiently computable metric. Our scheme works by selecting the initial latents of a diffusion model using a pseudorandom error-correcting code.
arXiv Detail & Related papers (2024-10-09T18:33:06Z)
Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns. Watermarking AI-generated content is a key technology to address these concerns. We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
UnMarker: A Universal Attack on Defensive Image Watermarking [4.013156524547072]
We present UnMarker -- the first practical universal attack on defensive watermarking. UnMarker requires no detector feedback, no unrealistic knowledge of the watermarking scheme or similar models, and no advanced denoising pipelines. Evaluations against SOTA schemes prove UnMarker's effectiveness.
arXiv Detail & Related papers (2024-05-14T07:05:18Z)
Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs [23.639074918667625]
We propose a novel multi-bit box-free watermarking method for GANs with improved robustness against white-box attacks. The watermark is embedded by adding an extra watermarking loss term during GAN training. We show that the presence of the watermark has a negligible impact on the quality of the generated images.
arXiv Detail & Related papers (2023-10-25T18:38:10Z)
Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks [47.04650443491879]
We analyze the robustness of various AI-image detectors including watermarking and deepfake detectors. We show that watermarking methods are vulnerable to spoofing attacks where the attacker aims to have real images identified as watermarked ones.
arXiv Detail & Related papers (2023-09-29T18:30:29Z)
Invisible Image Watermarks Are Provably Removable Using Generative AI [47.25747266531665]
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners. We propose a family of regeneration attacks to remove these invisible watermarks. The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
arXiv Detail & Related papers (2023-06-02T23:29:28Z)
Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models. We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold. Our watermark is also empirically more robust compared to previous watermarking methods.
arXiv Detail & Related papers (2022-07-16T16:06:59Z)
Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective. We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations. Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.