Related papers: On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection

URL: http://arxiv.org/abs/2510.03944v1
Date: Sat, 04 Oct 2025 21:07:06 GMT
Title: On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Authors: Weiqing He, Xiang Li, Tianqi Shang, Li Shen, Weijie Su, Qi Long
Abstract summary: We systematically evaluate eight goodness-of-fit (GoF) tests across three popular watermarking schemes. We find that GoF tests can improve both the detection power and robustness of watermark detectors.
Score: 17.920479593691255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) raise concerns about content authenticity and integrity because they can generate human-like text at scale. Text watermarks, which embed detectable statistical signals into generated text, offer a provable way to verify content origin. Many detection methods rely on pivotal statistics that are i.i.d. under human-written text, making goodness-of-fit (GoF) tests a natural tool for watermark detection. However, GoF tests remain largely underexplored in this setting. In this paper, we systematically evaluate eight GoF tests across three popular watermarking schemes, using three open-source LLMs, two datasets, various generation temperatures, and multiple post-editing methods. We find that general GoF tests can improve both the detection power and robustness of watermark detectors. Notably, we observe that text repetition, common in low-temperature settings, gives GoF tests a unique advantage not exploited by existing methods. Our results highlight that classic GoF tests are a simple yet powerful and underused tool for watermark detection in LLMs.
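To make the GoF idea in the abstract concrete, here is a minimal sketch of the general principle, not the paper's method. It assumes the pivotal statistics are Uniform(0, 1) under human-written text and skew toward 1 under watermarking; that assumption, the simulated data, and the function names `ks_statistic` and `ks_pvalue` are all hypothetical. The Kolmogorov-Smirnov test is used only as a familiar example of a GoF test; the source does not say which eight tests were evaluated.

```python
import math
import random

def ks_statistic(samples):
    """One-sample Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The uniform CDF is F(x) = x; compare it with the empirical CDF
        # just after x ((i + 1) / n) and just before x (i / n).
        d = max(d, (i + 1) / n - x, x - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (intended for large n)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here, and the p-value is essentially 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Simulated pivotal statistics (assumption: no real model or watermark is involved).
rng = random.Random(0)
n = 400
human = [rng.random() for _ in range(n)]               # null: Uniform(0, 1)
watermarked = [rng.random() ** 0.5 for _ in range(n)]  # alternative: CDF x^2, skewed toward 1

d_human, d_wm = ks_statistic(human), ks_statistic(watermarked)
p_human, p_wm = ks_pvalue(d_human, n), ks_pvalue(d_wm, n)
print(f"human:       D={d_human:.3f}  p={p_human:.3g}")
print(f"watermarked: D={d_wm:.3f}  p={p_wm:.3g}")
```

A small p-value rejects the hypothesis that the statistics are uniform, which under the stated assumption is evidence of a watermark. The test compares the whole empirical distribution with the null, rather than a single summary such as a sum.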

Related papers

How Good is Post-Hoc Watermarking With Language Model Rephrasing? [43.5649433230903]
Generation-time text watermarking embeds statistical signals into text for traceability of AI-generated content. We explore post-hoc watermarking where an LLM rewrites existing text while applying generation-time watermarking. Our strategies achieve strong detectability and semantic fidelity on open-ended text such as books.
arXiv Detail & Related papers (2025-12-18T18:57:33Z)
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation [58.85645136534301]
Existing watermarking schemes for sampled text often face trade-offs between maintaining text quality and ensuring robust detection against various attacks. We propose a novel watermarking scheme that improves both detectability and text quality by introducing a cumulative watermark entropy threshold.
arXiv Detail & Related papers (2025-04-16T14:16:38Z)
GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models. We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z)
Robust Detection of Watermarks for Large Language Models Under Human Edits [27.382399391266564]
We introduce a new method in the form of a truncated goodness-of-fit test for detecting watermarked text under human edits. We prove that the Tr-GoF test achieves optimality in robust detection of the Gumbel-GoF watermark. We also show that the Tr-GoF test attains the highest detection efficiency rate in a certain regime of moderate text modifications.
arXiv Detail & Related papers (2024-11-21T06:06:04Z)
Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method embedding a specific watermark into the text during its generation by Large Language Models (LLMs). This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text. Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z)
WaterSeeker: Pioneering Efficient Detection of Watermarked Segments in Large Documents [63.563031923075066]
WaterSeeker is a novel approach to efficiently detect and locate watermarked segments amid extensive natural text. It achieves a superior balance between detection accuracy and computational efficiency.
arXiv Detail & Related papers (2024-09-08T14:45:47Z)
Black-Box Detection of Language Model Watermarks [1.9374282535132377]
We develop rigorous statistical tests to detect, and estimate parameters, of all three popular watermarking scheme families. We experimentally confirm the effectiveness of our methods on a range of schemes and a diverse set of open-source models. Our findings indicate that current watermarking schemes are more detectable than previously believed.
arXiv Detail & Related papers (2024-05-28T08:41:30Z)
On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document. We find that watermarks remain detectable even after human and machine paraphrasing. We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated text. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
arXiv Detail & Related papers (2023-05-24T11:49:52Z)
A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality. It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.