Related papers: Online LLM watermark detection via e-processes

URL: http://arxiv.org/abs/2602.14286v1
Date: Sun, 15 Feb 2026 19:37:06 GMT
Title: Online LLM watermark detection via e-processes
Authors: Weijie Su, Ruodu Wang, Zinan Zhao
Abstract summary: We develop a unified framework for watermark detection based on e-processes. We propose various methods to construct empirically adaptive e-processes that can enhance the detection power. Some experiments demonstrate that the proposed framework achieves competitive performance compared to existing watermark detection methods.
Score: 3.0870861759929977
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Watermarking for large language models (LLMs) has emerged as an effective tool for distinguishing AI-generated text from human-written content. Statistically, watermark schemes induce dependence between generated tokens and a pseudo-random sequence, reducing watermark detection to a hypothesis testing problem on independence. We develop a unified framework for LLM watermark detection based on e-processes, providing anytime-valid guarantees for online testing. We propose various methods to construct empirically adaptive e-processes that can enhance the detection power. In addition, theoretical results are established to characterize the power properties of the proposed procedures. Some experiments demonstrate that the proposed framework achieves competitive performance compared to existing watermark detection methods.
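
To make the idea of anytime-valid online detection concrete, here is a minimal sketch of a fixed-bet e-process. It is not the construction from the paper. It assumes (hypothetically) that each token yields a pivotal statistic y_t that is i.i.d. Uniform(0, 1) when the text is not watermarked and is skewed toward 1 when it is. The function name `e_process_detect` and the parameters `alpha` and `lam` are illustrative. Under the null, e_t = (1 + lam) * y_t^lam has expectation 1, so the running product is a nonnegative martingale, and by Ville's inequality it exceeds 1/alpha with probability at most alpha at any stopping time.

```python
import math
import random


def e_process_detect(pivots, alpha=0.01, lam=1.0):
    """Return the 1-based index at which the e-process first reaches 1/alpha.

    Returns None if the threshold is never reached.
    Assumption (not from the paper): under the null hypothesis each pivot is
    i.i.d. Uniform(0, 1), so e_t = (1 + lam) * y_t**lam has mean exactly 1.
    """
    if lam <= 0:
        raise ValueError("this sketch only supports lam > 0")
    log_wealth = 0.0  # log of the running product of e-values
    threshold = math.log(1.0 / alpha)
    for t, y in enumerate(pivots, start=1):
        if y <= 0.0:
            # e_t = 0 drives the product to zero permanently: no rejection.
            return None
        log_wealth += math.log1p(lam) + lam * math.log(y)
        if log_wealth >= threshold:
            return t  # stop early: evidence against the null is sufficient
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical null stream: uniform pivots (unwatermarked text).
    null_pivots = [rng.random() for _ in range(200)]
    # Hypothetical watermarked stream: pivots skewed toward 1.
    marked_pivots = [rng.random() ** (1.0 / 3.0) for _ in range(200)]
    print("null stopping time:", e_process_detect(null_pivots))
    print("watermarked stopping time:", e_process_detect(marked_pivots))
```

A fixed `lam` is the simplest possible bet. The abstract's "empirically adaptive" e-processes suggest choosing the bet from past observations instead, which this sketch does not attempt.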

Related papers

Watermarks for Language Models via Probabilistic Automata [54.687037560547765]
We introduce a new class of watermarking schemes constructed through probabilistic automata. We present two instantiations: (i) a practical scheme with exponential generation diversity and computational efficiency, and (ii) a theoretical construction with formal undetectability guarantees under cryptographic assumptions.
arXiv Detail & Related papers (2025-12-11T00:49:06Z)
Detecting Post-generation Edits to Watermarked LLM Outputs via Combinatorial Watermarking [51.417096446156926]
We introduce a new task: detecting post-generation edits locally made to watermarked LLM outputs. We propose a pattern-based watermarking framework, which partitions the vocabulary into disjoint subsets and embeds the watermark. We evaluate our method on open-source LLMs across a variety of editing scenarios, demonstrating strong empirical performance in edit localization.
arXiv Detail & Related papers (2025-10-02T03:33:12Z)
Enhancing Watermarking Quality for LLMs via Contextual Generation States Awareness [35.06121005075721]
We introduce a plug-and-play contextual generation states-aware watermarking framework (CAW). First, CAW incorporates a watermarking capacity evaluator, which can assess the impact of embedding messages at different token positions. We introduce a multi-branch pre-generation mechanism to avoid the latency caused by the proposed watermarking strategy.
arXiv Detail & Related papers (2025-06-09T03:53:41Z)
In-Context Watermarks for Large Language Models [71.29952527565749]
In-Context Watermarking (ICW) embeds watermarks into generated text solely through prompt engineering. We investigate four ICW strategies at different levels of granularity, each paired with a tailored detection method. Our experiments validate the feasibility of ICW as a model-agnostic, practical watermarking approach.
arXiv Detail & Related papers (2025-05-22T17:24:51Z)
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation [58.85645136534301]
Existing watermarking schemes for sampled text often face trade-offs between maintaining text quality and ensuring robust detection against various attacks. We propose a novel watermarking scheme that improves both detectability and text quality by introducing a cumulative watermark entropy threshold.
arXiv Detail & Related papers (2025-04-16T14:16:38Z)
Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach [53.32564762183639]
We introduce a novel, unified theoretical framework for watermarking Large Language Models (LLMs). Our approach aims to maximize detection performance while maintaining control over the worst-case false positive rate (FPR) and distortion on text quality. We propose a distortion-free, distribution-adaptive watermarking algorithm (DAWA) that leverages a surrogate model for model-agnosticism and efficiency.
arXiv Detail & Related papers (2024-10-03T18:28:10Z)
Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality [27.592486717044455]
We present a novel type of watermark, Sparse Watermark, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text. Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous watermarking methods in quality across various tasks.
arXiv Detail & Related papers (2024-07-17T18:52:12Z)
A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules [27.382399391266564]
We introduce a framework for reasoning about the statistical efficiency of watermarks and powerful detection rules. We derive optimal detection rules for watermarks under our framework.
arXiv Detail & Related papers (2024-04-01T17:03:41Z)
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [31.062753031312006]
Large language models generate high-quality responses with potential misinformation. Watermarking is pivotal in this context, which involves embedding hidden markers in texts. We introduce a novel multi-objective optimization (MOO) approach for watermarking. Our method simultaneously achieves detectability and semantic integrity.
arXiv Detail & Related papers (2024-02-28T05:43:22Z)
An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection. We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality. It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.