Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection
- URL: http://arxiv.org/abs/2403.15955v3
- Date: Sat, 30 Mar 2024 06:42:02 GMT
- Title: Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection
- Authors: Minzhou Pan, Zhenting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin,
- Abstract summary: WaterMark Detection (WMD) is the first invisible watermark detection method under a black-box and annotation-free setting.
We develop WMD using foundations of offset learning, where a clean non-watermarked dataset enables us to isolate the influence of only watermarked samples.
- Score: 68.90458499700038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose WaterMark Detection (WMD), the first invisible watermark detection method under a black-box and annotation-free setting. WMD is capable of detecting arbitrary watermarks within a given reference dataset using a clean non-watermarked dataset as a reference, without relying on specific decoding methods or prior knowledge of the watermarking techniques. We develop WMD using foundations of offset learning, where a clean non-watermarked dataset enables us to isolate the influence of only watermarked samples in the reference dataset. Our comprehensive evaluations demonstrate the effectiveness of WMD, significantly outperforming naive detection methods, which only yield AUC scores around 0.5. In contrast, WMD consistently achieves impressive detection AUC scores, surpassing 0.9 in most single-watermark datasets and exceeding 0.7 in more challenging multi-watermark scenarios across diverse datasets and watermarking methods. As invisible watermarks become increasingly prevalent, while specific decoding techniques remain undisclosed, our approach provides a versatile solution and establishes a path toward increasing accountability, transparency, and trust in our digital visual content.
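For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a detection AUC can be computed from per-sample detector scores. It is not the WMD method itself; the function name `auc`, the label convention (1 = watermarked, 0 = clean), and the toy scores are assumptions made purely for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen watermarked sample
    (label 1) receives a higher score than a randomly chosen clean
    sample (label 0); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three watermarked samples, then three clean ones.
labels = [1, 1, 1, 0, 0, 0]

# A detector that gives every sample the same score carries no signal.
constant_scores = [0.5] * 6

# A detector that ranks every watermarked sample above every clean one.
separating_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]

# A detector that misranks one watermarked/clean pair.
partial_scores = [0.9, 0.4, 0.7, 0.3, 0.5, 0.1]

print(auc(constant_scores, labels))
print(auc(separating_scores, labels))
print(auc(partial_scores, labels))
```

An uninformative detector lands at 0.5, which is the level the abstract reports for naive detection methods, while a perfect ranking gives 1.0.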
Related papers
- AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking [12.2352706636564]
Watermarking techniques are used to store ownership information (i.e., a watermark) in the individual image samples.
Embedding the entire watermark into all samples leads to significant redundancy in the embedded information.
We propose a multi-segment encoding-decoding method for dataset watermarking (called AMUSE).
Our decoder is then used to reconstruct the original message from the extracted sub-messages.
arXiv Detail & Related papers (2024-03-08T19:02:21Z)
- RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees [33.61946642460661]
This paper introduces a robust and agile watermark detection framework, dubbed RAW.
We employ a classifier that is jointly trained with the watermark to detect the presence of the watermark.
We show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image.
arXiv Detail & Related papers (2024-01-23T22:00:49Z)
- WAVES: Benchmarking the Robustness of Image Watermarks [67.955140223443]
WAVES (Watermark Analysis Via Enhanced Stress-testing) is a benchmark for assessing image watermark robustness.
We integrate detection and identification tasks and establish a standardized evaluation protocol comprised of a diverse range of stress tests.
We envision WAVES as a toolkit for the future development of robust watermarks.
arXiv Detail & Related papers (2024-01-16T18:58:36Z)
- New Evaluation Metrics Capture Quality Degradation due to LLM Watermarking [28.53032132891346]
We introduce two new easy-to-use methods for evaluating watermarking algorithms for large language models (LLMs).
Our experiments, conducted across various datasets, reveal that current watermarking methods are detectable by even simple classifiers.
Our findings underscore the trade-off between watermark robustness and text quality and highlight the importance of having more informative metrics to assess watermarking quality.
arXiv Detail & Related papers (2023-12-04T22:56:31Z)
- Turning Your Strength into Watermark: Watermarking Large Language Model via Knowledge Injection [66.26348985345776]
We propose a novel watermarking method for large language models (LLMs) based on knowledge injection.
In the watermark embedding stage, we first embed the watermarks into the selected knowledge to obtain the watermarked knowledge.
In the watermark extraction stage, questions related to the watermarked knowledge are designed for querying the suspect LLM.
Experiments show that the watermark extraction success rate is close to 100% and demonstrate the effectiveness, fidelity, stealthiness, and robustness of our proposed method.
arXiv Detail & Related papers (2023-11-16T03:22:53Z)
- ClearMark: Intuitive and Robust Model Watermarking via Transposed Model Training [50.77001916246691]
This paper introduces ClearMark, the first DNN watermarking method designed for intuitive human assessment.
ClearMark embeds visible watermarks, enabling human decision-making without rigid value thresholds.
It shows an 8,544-bit watermark capacity comparable to the strongest existing work.
arXiv Detail & Related papers (2023-10-25T08:16:55Z)
- On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document.
We find that watermarks remain detectable even after human and machine paraphrasing.
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
- Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding publicly available data.
By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders.
This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)