Watermarking Diffusion Language Models
- URL: http://arxiv.org/abs/2509.24368v1
- Date: Mon, 29 Sep 2025 07:11:40 GMT
- Title: Watermarking Diffusion Language Models
- Authors: Thibaud Gloaguen, Robin Staab, Nikola Jovanović, Martin Vechev
- Abstract summary: We introduce the first watermark tailored for diffusion language models (DLMs). This is an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs), which generate tokens sequentially.
- Score: 9.515480957792542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs) which generate tokens sequentially. While there has been much work in ARLM watermarking, a key challenge when attempting to apply these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available with DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens which increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark leads to a >99% true positive rate with minimal quality impact and achieves similar robustness to existing ARLM watermarks, enabling for the first time reliable DLM watermarking.
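To make idea (i) concrete, below is a minimal sketch, assuming a standard KGW-style green-list watermark in which each position's green list is seeded by a single neighbouring context token. The helpers `green_mask` and `expected_watermark_bias`, and the use of the denoiser's marginal distribution over an undetermined context position, are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (not the paper's code) of idea (i): applying a green-list
# watermark bias in expectation when the seeding context token is still masked.
# green_mask and the diffusion marginal p_ctx are hypothetical stand-ins.
import torch

def green_mask(ctx_token: int, vocab_size: int, gamma: float = 0.25) -> torch.Tensor:
    """Pseudo-random 0/1 green-list indicator over the vocabulary, seeded by one context token."""
    g = torch.Generator().manual_seed(ctx_token)
    return (torch.rand(vocab_size, generator=g) < gamma).float()

def expected_watermark_bias(p_ctx: torch.Tensor, delta: float = 2.0) -> torch.Tensor:
    """Additive bias delta * E_ctx[green_mask(ctx)] when the context token is undetermined.

    p_ctx: (vocab_size,) current marginal distribution over the undetermined context position.
    """
    vocab_size = p_ctx.shape[0]
    expected_mask = torch.zeros(vocab_size)
    for tok, prob in enumerate(p_ctx.tolist()):
        if prob > 1e-6:  # skip negligible mass; a sketch, not an optimized implementation
            expected_mask += prob * green_mask(tok, vocab_size)
    return delta * expected_mask

# Usage: bias the denoiser's logits for one masked position before committing a token.
vocab_size = 1000
logits = torch.randn(vocab_size)                       # denoiser logits for the position
p_ctx = torch.softmax(torch.randn(vocab_size), dim=0)  # marginal over the (still masked) context
token = int(torch.argmax(logits + expected_watermark_bias(p_ctx)))
```

Idea (ii) could be layered on top of such a sketch as an additional bonus for tokens that, once committed, would act as strong green-list seeds for their still-undetermined neighbours. A minimal version of the unchanged green-list detector that such schemes share is sketched after the related-papers list below.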
Related papers
- LR-DWM: Efficient Watermarking for Diffusion Language Models [40.70709965738489]
Diffusion Language Models (DLMs) generate text via non-sequential iterative denoising. Recent work proposed to watermark DLMs by inverting the process when needed, but this incurs significant computational or memory overhead. We introduce Left-Right Diffusion Watermarking (LR-DWM), a scheme that biases the generated token based on both left and right neighbors.
arXiv Detail & Related papers (2026-01-18T12:08:51Z)
- EditMark: Watermarking Large Language Models based on Model Editing [76.04893766374221]
We propose EditMark, the first watermarking method that leverages model editing to embed a training-free, stealthy, and performance-lossless watermark. Experiments indicate that EditMark can embed 32-bit watermarks into LLMs within 20 seconds with a watermark extraction success rate of 100%.
arXiv Detail & Related papers (2025-10-18T06:25:17Z)
- DMark: Order-Agnostic Watermarking for Diffusion Large Language Models [46.07844536066178]
Diffusion large language models (dLLMs) offer faster generation than autoregressive models while maintaining comparable quality. We present DMark, the first watermarking framework designed specifically for dLLMs.
arXiv Detail & Related papers (2025-10-03T11:14:16Z)
- Yet Another Watermark for Large Language Models [20.295405732813748]
Existing watermarking methods for large language models (LLMs) embed the watermark by adjusting token sampling predictions or by post-processing. We present a new watermarking framework for LLMs in which the watermark is embedded by manipulating the LLM's internal parameters. The proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark.
arXiv Detail & Related papers (2025-09-16T02:04:55Z)
- DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models [18.023143082876015]
We propose a dynamic, efficient, and robust multi-bit watermarking method that divides the text into variable-length segments for each watermark bit. Our method reduces the number of tokens required per embedded bit by 25%, reduces watermark embedding time by 50%, and maintains high robustness against text modifications and watermark erasure attacks.
arXiv Detail & Related papers (2025-02-04T11:23:49Z)
- Can Watermarked LLMs be Identified by Users via Crafted Prompts? [55.460327393792156]
This work is the first to investigate the imperceptibility of watermarked Large Language Models (LLMs). We design an identification algorithm called Water-Probe that detects watermarks through well-designed prompts. Experiments show that almost all mainstream watermarking algorithms are easily identified with such prompts.
arXiv Detail & Related papers (2024-10-04T06:01:27Z)
- Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality [27.592486717044455]
We present a novel type of watermark, Sparse Watermark, which aims to mitigate the trade-off between detectability and text quality by applying watermarks to a small subset of generated tokens distributed across the text.
Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous watermarking methods in quality across various tasks.
arXiv Detail & Related papers (2024-07-17T18:52:12Z)
- PostMark: A Robust Blackbox Watermark for Large Language Models [56.63560134428716]
We develop PostMark, a modular post-hoc watermarking procedure.
PostMark does not require logit access, which means it can be implemented by a third party.
We show that PostMark is more robust to paraphrasing attacks than existing watermarking methods.
arXiv Detail & Related papers (2024-06-20T17:27:14Z)
- Large Language Model Watermark Stealing With Mixed Integer Programming [51.336009662771396]
Large Language Model (LLM) watermarking shows promise in addressing copyright concerns, monitoring AI-generated text, and preventing its misuse.
Recent research indicates that watermarking methods using numerous keys are susceptible to removal attacks.
We propose a novel green list stealing attack against the state-of-the-art LLM watermark scheme.
arXiv Detail & Related papers (2024-05-30T04:11:17Z)
- Turning Your Strength into Watermark: Watermarking Large Language Model via Knowledge Injection [66.26348985345776]
We propose a novel watermarking method for large language models (LLMs) based on knowledge injection.
In the watermark embedding stage, we first embed the watermarks into the selected knowledge to obtain the watermarked knowledge.
In the watermark extraction stage, questions related to the watermarked knowledge are designed to query the suspect LLM.
Experiments show that the watermark extraction success rate is close to 100% and demonstrate the effectiveness, fidelity, stealthiness, and robustness of our proposed method.
arXiv Detail & Related papers (2023-11-16T03:22:53Z)
- A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have shown great ability in various natural language tasks.
There are concerns that LLMs could be used improperly or even illegally.
We propose a semantics-based watermark framework, SemaMark.
arXiv Detail & Related papers (2023-11-15T06:19:02Z)
- A Semantic Invariant Robust Watermark for Large Language Models [27.522264953691746]
Prior watermark algorithms face a trade-off between attack robustness and security robustness.
This is because the watermark logits for a token are determined by a certain number of preceding tokens.
We propose a semantic invariant watermarking method for LLMs that provides both attack robustness and security robustness.
arXiv Detail & Related papers (2023-10-10T06:49:43Z)
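Many of the green-list schemes referenced above, like the main paper (which keeps the standard detector unchanged), rely on a green-token counting test for detection. As a reference point, here is a minimal sketch of such a detector under the same assumptions as the earlier code (KGW-style seeding by the immediately preceding token; `green_mask` is the same hypothetical helper), not the exact detector of any listed paper.

```python
# Minimal sketch of a standard green-list detector: count green tokens and
# compute a z-score against the gamma baseline expected for unwatermarked text.
import math
import torch

def green_mask(ctx_token: int, vocab_size: int, gamma: float = 0.25) -> torch.Tensor:
    """Same hypothetical helper as in the earlier sketch."""
    g = torch.Generator().manual_seed(ctx_token)
    return (torch.rand(vocab_size, generator=g) < gamma).float()

def detection_zscore(tokens: list, vocab_size: int, gamma: float = 0.25) -> float:
    """z-score of the green-token count, with each green list seeded by the previous token."""
    hits, scored = 0, 0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        if green_mask(prev, vocab_size, gamma)[cur] > 0:
            hits += 1
        scored += 1
    if scored == 0:
        return 0.0
    # Under the null hypothesis (unwatermarked text) each token is green with probability gamma.
    return (hits - gamma * scored) / math.sqrt(gamma * (1 - gamma) * scored)

# Example: z-scores above a fixed threshold (e.g. 4) would flag the text as watermarked.
print(detection_zscore([5, 17, 901, 42, 7], vocab_size=1000))
```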
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.