Evading Watermark based Detection of AI-Generated Content
- URL: http://arxiv.org/abs/2305.03807v5
- Date: Wed, 8 Nov 2023 15:23:10 GMT
- Title: Evading Watermark based Detection of AI-Generated Content
- Authors: Zhengyuan Jiang, Jinghuai Zhang, Neil Zhenqiang Gong
- Abstract summary: A generative AI model can generate extremely realistic-looking content.
Watermarking has been leveraged to detect AI-generated content.
Content is detected as AI-generated if a similar watermark can be decoded from it.
- Score: 45.47476727209842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A generative AI model can generate extremely realistic-looking content,
posing growing challenges to the authenticity of information. To address the
challenges, watermarking has been leveraged to detect AI-generated content.
Specifically, a watermark is embedded into AI-generated content before it is
released. Content is detected as AI-generated if a similar watermark can be
decoded from it. In this work, we perform a systematic study on the robustness
of such watermark-based AI-generated content detection. We focus on
AI-generated images. Our work shows that an attacker can post-process a
watermarked image via adding a small, human-imperceptible perturbation to it,
such that the post-processed image evades detection while maintaining its
visual quality. We show the effectiveness of our attack both theoretically and
empirically. Moreover, to evade detection, our adversarial post-processing
method adds much smaller perturbations to AI-generated images, and thus better
maintains their visual quality, than existing popular post-processing methods
such as JPEG compression, Gaussian blur, and Brightness/Contrast. Our work
shows the insufficiency of existing watermark-based detection of AI-generated
content, highlighting the urgent need for new methods. Our code is publicly
available: https://github.com/zhengyuan-jiang/WEvade.
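To make the attack concrete, here is a minimal sketch of a gradient-based evasion in the paper's white-box spirit: search for a small L∞-bounded perturbation that drives the decoded bits toward a chosen target. The decoder interface, the bit-string loss, and all hyperparameters are illustrative assumptions; the authors' actual implementation is in the linked repository.

```python
# Hedged sketch of adversarial post-processing against a learned watermark
# decoder. `decoder` is a stand-in: any differentiable model mapping an image
# in [0, 1]^(3xHxW) to per-bit logits of an n-bit watermark.
import torch

def bit_accuracy(logits: torch.Tensor, bits: torch.Tensor) -> float:
    """Fraction of decoded bits that match a reference bit string."""
    return ((logits > 0).float() == bits).float().mean().item()

def evade(decoder, image, target_bits, eps=0.01, steps=200, lr=0.005):
    """Find a small L_inf-bounded perturbation so the decoder outputs
    `target_bits` (e.g., bits far from the true watermark)."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    bce = torch.nn.BCEWithLogitsLoss()
    for _ in range(steps):
        logits = decoder((image + delta).clamp(0, 1))
        loss = bce(logits, target_bits)   # pull decoded bits to the target
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():             # project back into the L_inf ball
            delta.clamp_(-eps, eps)
    return (image + delta).detach().clamp(0, 1)
```

Choosing a target whose bitwise accuracy against the true watermark is near 0.5 makes the post-processed image indistinguishable from an unwatermarked one to a detector that thresholds bitwise accuracy; `bit_accuracy` can verify this, and a binary search over `eps` can then minimize the perturbation.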
Related papers
- Invisible Watermarks: Attacks and Robustness [0.3495246564946556]
We introduce novel improvements to watermarking robustness and minimize image-quality degradation under attack.
We propose a custom watermark remover network which preserves one of the watermarking modalities while completely removing the other during decoding.
Our evaluation suggests that 1) using the watermark remover model to preserve one watermark modality while decoding the other slightly improves on the baseline performance, and that 2) LBA degrades the image significantly less than uniform blurring of the entire image.
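A minimal numeric illustration of why a localized attack can be gentler than global blurring (assuming LBA denotes a localized blurring attack, which the summary does not expand): blur only a small patch and compare PSNR against blurring the whole image.

```python
# Hedged sketch: localized vs. uniform blurring. The localized variant
# keeps a far higher PSNR because most pixels are untouched.
import numpy as np
from scipy.ndimage import gaussian_filter

def psnr(a: np.ndarray, b: np.ndarray) -> float:
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(256, 256)).astype(np.float64)

uniform = gaussian_filter(img, sigma=3)          # blur everything

localized = img.copy()                           # blur only a 64x64 patch
localized[96:160, 96:160] = gaussian_filter(img[96:160, 96:160], sigma=3)

print(f"PSNR uniform blur:   {psnr(img, uniform):.1f} dB")
print(f"PSNR localized blur: {psnr(img, localized):.1f} dB")
```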
arXiv Detail & Related papers (2024-12-17T03:50:13Z)
- SoK: Watermarking for AI-Generated Content [112.9218881276487]
Watermarking schemes embed hidden signals within AI-generated content to enable reliable detection.
Watermarks can play a crucial role in enhancing AI safety and trustworthiness by combating misinformation and deception.
This work aims to guide researchers in advancing watermarking methods and applications, and support policymakers in addressing the broader implications of GenAI.
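The embed-then-detect loop that such schemes share can be shown with a toy spread-spectrum watermark; this is a generic textbook construction, not any specific scheme from the SoK.

```python
# Toy spread-spectrum watermark: each bit is spread over the whole image by
# a pseudo-random pattern and recovered by correlation.
import numpy as np

rng = np.random.default_rng(42)
H = W = 64
n_bits, alpha = 16, 5.0                 # message length, embedding strength

patterns = rng.standard_normal((n_bits, H, W))  # shared key: one pattern/bit
bits = rng.integers(0, 2, n_bits)

image = rng.uniform(0, 255, (H, W))
signs = 2 * bits - 1                            # map {0,1} -> {-1,+1}
watermarked = image + alpha * np.tensordot(signs, patterns, axes=1)

# Detection: correlate with each pattern; the sign of the correlation
# recovers the bit (the mean-removed host image is roughly uncorrelated
# with the key patterns).
corr = np.tensordot(patterns, watermarked - watermarked.mean(), axes=2)
decoded = (corr > 0).astype(int)
print("bit accuracy:", (decoded == bits).mean())
```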
arXiv Detail & Related papers (2024-11-27T16:22:33Z)
- Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
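The summary does not state the certification mechanism; randomized smoothing is the standard route to such guarantees, so the sketch below assumes it: estimate how often detection holds under Gaussian noise and convert that into a certified L2 radius.

```python
# Hedged sketch: certifying a watermark detector via randomized smoothing.
# Whether this matches the cited paper's exact construction is an assumption.
# If the smoothed detector says "watermarked" with probability p > 1/2 under
# Gaussian noise N(0, sigma^2 I), the decision provably cannot change for
# any perturbation with L2 norm below sigma * Phi^{-1}(p).
import numpy as np
from scipy.stats import norm

def certified_radius(detect, image, sigma=0.1, n_samples=1000, alpha=0.001):
    """detect: image -> bool ("watermark present"). Returns a certified
    L2 radius (0.0 if the smoothed decision is not confidently positive)."""
    rng = np.random.default_rng(0)
    votes = sum(
        detect(image + sigma * rng.standard_normal(image.shape))
        for _ in range(n_samples)
    )
    # One-sided Hoeffding lower bound on p; the smoothing literature
    # typically uses the tighter Clopper-Pearson bound instead.
    p_lower = votes / n_samples - np.sqrt(np.log(1 / alpha) / (2 * n_samples))
    if p_lower <= 0.5:
        return 0.0
    return sigma * norm.ppf(p_lower)
```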
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
- A Sanity Check for AI-generated Image Detection [49.08585395873425]
We propose AIDE (AI-generated Image DEtector with Hybrid Features) to detect AI-generated images.
AIDE achieves +3.5% and +4.6% improvements over state-of-the-art methods.
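AIDE's concrete feature set is not given in the summary; the sketch below only illustrates the general hybrid-features idea, fusing a low-level noise-residual branch with a semantic branch in one classifier head. Every layer choice is an illustrative assumption, not AIDE's actual architecture.

```python
# Hedged sketch of a hybrid-feature detector head: one branch inspects
# low-level noise residuals, another semantic content, and a joint head
# classifies real vs. AI-generated.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # Low-level branch: small conv stack over a high-pass residual.
        self.lowlevel = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Semantic branch: stand-in for a pretrained backbone's features.
        self.semantic = nn.Sequential(
            nn.Conv2d(3, 32, 7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16 + 32, 2)   # fuse both feature sets

    def forward(self, x):
        residual = x - F.avg_pool2d(x, 3, stride=1, padding=1)  # crude high-pass
        return self.head(torch.cat([self.lowlevel(residual),
                                    self.semantic(x)], dim=1))
```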
arXiv Detail & Related papers (2024-06-27T17:59:49Z)
- Watermark-based Attribution of AI-Generated Content [34.913290430783185]
We conduct the first systematic study on watermark-based, user-level attribution of AI-generated content.
Our key idea is to assign a unique watermark to each user of the GenAI service and embed this watermark into the AI-generated content created by that user.
Attribution is then performed by identifying the user whose watermark best matches the one extracted from the given content.
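This matching rule is a nearest-neighbor search under bitwise accuracy; a minimal sketch with a placeholder user registry and a simulated decoder output:

```python
# Hedged sketch of watermark-based attribution: each user of the GenAI
# service is registered with a unique bit string, and attribution picks the
# user whose watermark best matches the bits decoded from the content.
import numpy as np

def attribute(decoded_bits: np.ndarray,
              user_watermarks: dict[str, np.ndarray],
              threshold: float = 0.9):
    """Return (user_id, bit_accuracy) for the best match, or None if even
    the best match falls below `threshold` (content unattributable)."""
    best_user, best_acc = None, 0.0
    for user, wm in user_watermarks.items():
        acc = float((decoded_bits == wm).mean())   # bitwise accuracy
        if acc > best_acc:
            best_user, best_acc = user, acc
    return (best_user, best_acc) if best_acc >= threshold else None

# Toy usage: 3 users with random 32-bit watermarks; decoding is simulated
# by flipping a few bits of user "bob"'s watermark.
rng = np.random.default_rng(1)
users = {name: rng.integers(0, 2, 32) for name in ["alice", "bob", "carol"]}
decoded = users["bob"].copy()
decoded[:3] ^= 1                                   # 3 bit errors
print(attribute(decoded, users))                   # -> ('bob', 0.90625)
```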
arXiv Detail & Related papers (2024-04-05T17:58:52Z)
- Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks [47.04650443491879]
We analyze the robustness of various AI-image detectors including watermarking and deepfake detectors.
We show that watermarking methods are vulnerable to spoofing attacks where the attacker aims to have real images identified as watermarked ones.
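One white-box route to such a spoofing attack mirrors the evasion sketch earlier, with the goal reversed: perturb a real image until the decoder reads out a watermark the service would accept. The paper's own attack strategy may differ (it also considers settings without decoder access), so treat this purely as an illustration.

```python
# Hedged sketch: spoofing as the mirror image of evasion. Starting from a
# real (unwatermarked) image, push the decoder's output toward a valid
# watermark, so the real image is flagged as AI-generated.
import torch

def spoof(decoder, real_image, service_watermark, eps=0.02, steps=300, lr=0.005):
    delta = torch.zeros_like(real_image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    bce = torch.nn.BCEWithLogitsLoss()
    for _ in range(steps):
        logits = decoder((real_image + delta).clamp(0, 1))
        loss = bce(logits, service_watermark)  # match the *target* watermark
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)            # keep the spoof imperceptible
    return (real_image + delta).detach().clamp(0, 1)
```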
arXiv Detail & Related papers (2023-09-29T18:30:29Z)
- Invisible Image Watermarks Are Provably Removable Using Generative AI [47.25747266531665]
Invisible watermarks safeguard images' copyrights by embedding hidden messages only detectable by owners.
We propose a family of regeneration attacks to remove these invisible watermarks.
The proposed attack method first adds random noise to an image to destroy the watermark and then reconstructs the image.
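A minimal sketch of the noise-then-reconstruct idea follows; the paper reconstructs with generative models, while a total-variation denoiser stands in here to keep the example self-contained.

```python
# Hedged sketch of a regeneration attack: add noise to destroy the embedded
# signal, then reconstruct the image. A TV denoiser is a stand-in for the
# paper's generative reconstruction step.
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def regenerate(image: np.ndarray, noise_std: float = 0.1,
               seed: int = 0) -> np.ndarray:
    """image: grayscale float array in [0, 1] (pass channel_axis=-1 to the
    denoiser for RGB). Returns the noised-then-denoised image, which tends
    to keep perceptual content but not the fragile watermark."""
    rng = np.random.default_rng(seed)
    noisy = image + noise_std * rng.standard_normal(image.shape)
    return denoise_tv_chambolle(np.clip(noisy, 0, 1), weight=0.1)
```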
arXiv Detail & Related papers (2023-06-02T23:29:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information above and is not responsible for any consequences of its use.