LR-DWM: Efficient Watermarking for Diffusion Language Models
- URL: http://arxiv.org/abs/2601.12376v1
- Date: Sun, 18 Jan 2026 12:08:51 GMT
- Title: LR-DWM: Efficient Watermarking for Diffusion Language Models
- Authors: Ofek Raban, Ethan Fetaya, Gal Chechik
- Abstract summary: Diffusion Language Models (DLMs) generate text via non-sequential iterative denoising. Recent work proposed to watermark DLMs by inverting the process when needed, but suffers from significant computational or memory overhead. We introduce Left-Right Diffusion Watermarking (LR-DWM), a scheme that biases the generated token based on both its left and right neighbors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Watermarking (WM) is a critical mechanism for detecting and attributing AI-generated content. Current WM methods for Large Language Models (LLMs) are predominantly tailored for autoregressive (AR) models: they rely on tokens being generated sequentially, and embed stable signals within the generated sequence based on the previously sampled text. Diffusion Language Models (DLMs) generate text via non-sequential iterative denoising, which requires significant modification to use WM methods designed for AR models. Recent work proposed to watermark DLMs by inverting the process when needed, but suffers from significant computational or memory overhead. We introduce Left-Right Diffusion Watermarking (LR-DWM), a scheme that biases the generated token based on both its left and right neighbors, when they are available. LR-DWM incurs minimal runtime and memory overhead, remaining close to the non-watermarked baseline DLM while enabling reliable statistical detection under standard evaluation settings. Our results demonstrate that DLMs can be watermarked efficiently, achieving high detectability with negligible computational and memory overhead.
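The abstract describes biasing each generated token based on whichever of its two neighbors are already denoised. The paper's actual construction is not given here; the following is a minimal sketch of the idea, assuming a Kirchenbauer-style "green-list" watermark in which a pseudorandom subset of the vocabulary is seeded from the (left, right) neighbor pair, with `None` standing in for a still-masked neighbor. All names and constants (`VOCAB_SIZE`, `GAMMA`, `DELTA`) are illustrative assumptions, not values from the paper.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000  # toy vocabulary size (assumption for illustration)
GAMMA = 0.5        # fraction of the vocabulary placed on the "green" list
DELTA = 2.0        # logit bias added to green-list tokens

def green_list(left, right):
    """Derive a pseudorandom green list from whichever neighbors are
    already denoised; `None` stands for a still-masked neighbor."""
    seed = int.from_bytes(
        hashlib.sha256(f"{left}|{right}".encode()).digest()[:8], "big")
    rng = random.Random(seed)
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: int(GAMMA * VOCAB_SIZE)])

def bias_logits(logits, left, right):
    """Add DELTA to the logits of green-list tokens before sampling."""
    green = green_list(left, right)
    return [x + DELTA if i in green else x for i, x in enumerate(logits)]

def detect_z(tokens):
    """z-score of the green-token count; large values suggest a watermark."""
    hits = 0
    for i, t in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else None
        right = tokens[i + 1] if i < len(tokens) - 1 else None
        if t in green_list(left, right):
            hits += 1
    n = len(tokens)
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

Because the seed depends only on the final neighbor values, the detector can recompute each position's green list from the finished text alone, without replaying or inverting the denoising process.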
Related papers
- DiffuRank: Effective Document Reranking with Diffusion Language Models [71.16830004674513]
We propose DiffuRank, a reranking framework built upon diffusion language models (dLLMs). dLLMs support more flexible decoding and generation processes that are not constrained to a left-to-right order. We show that dLLMs achieve performance comparable to, and in some cases exceeding, that of autoregressive LLMs with similar model sizes.
arXiv Detail & Related papers (2026-02-13T02:18:14Z) - dgMARK: Decoding-Guided Watermarking for Diffusion Language Models [5.43345665278304]
dgMARK is a decoding-guided watermarking method for discrete diffusion language models. dgMARK steers the unmasking order toward positions whose high-reward candidate tokens satisfy a simple parity constraint. Watermarks are detected via elevated parity-matching statistics.
arXiv Detail & Related papers (2026-01-30T13:51:20Z) - Residual Context Diffusion Language Models [90.07635240595926]
Residual Context Diffusion (RCD) is a module that converts discarded token representations into contextual residuals and injects them back for the next denoising step. RCD consistently improves frontier dLLMs by 5-10 points in accuracy with minimal extra computational overhead.
arXiv Detail & Related papers (2026-01-30T13:16:32Z) - DiffuGR: Generative Document Retrieval with Diffusion Language Models [80.78126312115087]
We propose generative document retrieval with diffusion language models, dubbed DiffuGR. For inference, DiffuGR attempts to generate DocID tokens in parallel and refines them through a controllable number of denoising steps. In contrast to conventional left-to-right auto-regressive decoding, DiffuGR provides a novel mechanism to first generate the more confident DocID tokens.
arXiv Detail & Related papers (2025-11-11T12:00:09Z) - DMark: Order-Agnostic Watermarking for Diffusion Large Language Models [46.07844536066178]
Diffusion large language models (dLLMs) offer faster generation than autoregressive models while maintaining comparable quality. We present DMark, the first watermarking framework designed specifically for dLLMs.
arXiv Detail & Related papers (2025-10-03T11:14:16Z) - Watermarking Diffusion Language Models [9.515480957792542]
We introduce the first watermark tailored for diffusion language models (DLMs). This is an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs), which generate tokens sequentially.
arXiv Detail & Related papers (2025-09-29T07:11:40Z) - Yet Another Watermark for Large Language Models [20.295405732813748]
Existing watermarking methods for large language models (LLMs) embed watermarks by adjusting the token sampling prediction or by post-processing. We present a new watermarking framework for LLMs in which the watermark is embedded into the LLM by manipulating its internal parameters. The proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark.
arXiv Detail & Related papers (2025-09-16T02:04:55Z) - Accelerating Diffusion LLMs via Adaptive Parallel Decoding [60.407727995313074]
We introduce adaptive parallel decoding (APD), a novel method that dynamically adjusts the number of tokens sampled in parallel. APD provides markedly higher throughput with minimal quality degradation on downstream benchmarks.
arXiv Detail & Related papers (2025-05-31T06:10:10Z) - SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models [4.069844339028727]
SimMark is a robust sentence-level watermarking algorithm for large language models (LLMs). It embeds detectable statistical patterns imperceptible to humans and employs a soft counting mechanism. We show that SimMark sets a new benchmark for robust watermarking of LLM-generated content.
arXiv Detail & Related papers (2025-02-05T00:21:01Z) - A Watermark for Order-Agnostic Language Models [55.89285889529492]
Pattern-mark is a pattern-based watermarking framework specifically designed for order-agnostic LMs.
We develop a Markov-chain-based watermark generator that produces watermark key sequences with high-frequency key patterns.
Our evaluations on order-agnostic LMs, such as ProteinMPNN and CMLM, demonstrate Pattern-mark's enhanced detection efficiency, generation quality, and robustness.
arXiv Detail & Related papers (2024-10-17T17:41:28Z)
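Several of the entries above (dgMARK, Pattern-mark) detect order-agnostic watermarks through keyed parity-matching statistics. The following is a minimal sketch of that detection idea only, under assumed details: the key value and the `keyed_parity` construction are hypothetical and do not come from any of the listed papers.

```python
import math

KEY = 0x5DEECE66D  # hypothetical shared secret key (illustrative only)

def keyed_parity(position: int) -> int:
    """Target parity bit for a position, derived from the secret key."""
    return ((position * KEY) >> 7) & 1

def parity_match_z(token_ids):
    """z-score of how often a token id's parity matches the keyed target.
    Unwatermarked text matches about half the time, so the score stays
    near zero regardless of the order in which tokens were generated."""
    n = len(token_ids)
    hits = sum(1 for i, t in enumerate(token_ids)
               if (t & 1) == keyed_parity(i))
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)
```

Because the statistic depends only on final token values and positions, not on the decoding order, the same detector works whether the text was produced left-to-right or by parallel denoising.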
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.