MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models
- URL: http://arxiv.org/abs/2601.22246v1
- Date: Thu, 29 Jan 2026 19:10:48 GMT
- Title: MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models
- Authors: Ya Jiang, Massieh Kordi Boroujeny, Surender Suresh Kumar, Kai Zeng,
- Abstract summary: We propose MirrorMark, a distortion-free watermark for large language models (LLMs). MirrorMark embeds multi-bit messages without altering the token probability distribution, preserving text quality by design. Experiments show that MirrorMark matches the text quality of non-watermarked generation while achieving substantially stronger detectability.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) become integral to applications such as question answering and content creation, reliable content attribution has become increasingly important. Watermarking is a promising approach, but existing methods either provide only binary signals or distort the sampling distribution, degrading text quality; distortion-free approaches, in turn, often suffer from weak detectability or robustness. We propose MirrorMark, a multi-bit and distortion-free watermark for LLMs. By mirroring sampling randomness in a measure-preserving manner, MirrorMark embeds multi-bit messages without altering the token probability distribution, preserving text quality by design. To improve robustness, we introduce a context-based scheduler that balances token assignments across message positions while remaining resilient to insertions and deletions. We further provide a theoretical analysis of the equal error rate to interpret empirical performance. Experiments show that MirrorMark matches the text quality of non-watermarked generation while achieving substantially stronger detectability: with 54 bits embedded in 300 tokens, it improves bit accuracy by 8-12% and correctly identifies up to 11% more watermarked texts at 1% false positive rate.
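The paper does not include pseudocode, but the core idea of a distortion-free multi-bit watermark can be illustrated with a generic sketch. The snippet below shows the general family such schemes belong to: next-token sampling is driven by a pseudorandom uniform derived from a secret key and the preceding context, and a message bit is encoded by applying a measure-preserving map to that uniform (here the mirror `u -> 1 - u`, chosen to echo the paper's "mirroring" terminology). Because both maps leave the uniform distribution unchanged, inverse-CDF sampling still draws tokens from the model's exact distribution. All function names, the toy distribution, and the specific bit-encoding map are illustrative assumptions, not MirrorMark's actual construction.

```python
import hashlib

def keyed_uniform(key: bytes, context: tuple) -> float:
    """Derive a pseudorandom uniform in [0, 1) from a secret key and the
    preceding token context (a stand-in for the scheme's seeded randomness)."""
    h = hashlib.sha256(key + repr(context).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def embed_bit(u: float, bit: int) -> float:
    """Encode one message bit by optionally 'mirroring' u -> 1 - u.
    Both the identity and the mirror map preserve the uniform distribution,
    so sampling remains distortion-free. This is an illustrative guess at
    the measure-preserving idea, not the paper's exact map."""
    return u if bit == 0 else 1.0 - u

def sample_token(probs: dict, u: float) -> str:
    """Inverse-CDF sampling: since u is uniform on [0, 1), the sampled token
    follows `probs` exactly, so the output distribution is unchanged."""
    acc = 0.0
    for tok, p in sorted(probs.items()):
        acc += p
        if u < acc:
            return tok
    return tok  # guard against floating-point round-off at acc ~ 1.0

# Toy next-token distribution and key (for illustration only).
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
key = b"secret-watermark-key"

context = ("the", "cat")
u = embed_bit(keyed_uniform(key, context), bit=1)
token = sample_token(probs, u)
```

At detection time, the key holder recomputes the keyed uniform from each context and tests which of the two measure-preserving maps best explains the observed tokens, recovering the message bits; the paper's context-based scheduler additionally decides which message position each token contributes to, so that insertions and deletions do not desynchronize decoding.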
Related papers
- MC$^2$Mark: Distortion-Free Multi-Bit Watermarking for Long Messages [62.982950935139534]
Multi-bit watermarking can embed identifiers into generated text, but existing methods struggle to maintain both text quality and watermark strength while carrying long messages. We propose MC$^2$Mark, a distortion-free multi-bit watermarking framework for reliable embedding and decoding of long messages.
arXiv Detail & Related papers (2026-02-15T07:29:06Z) - Data Provenance Auditing of Fine-Tuned Large Language Models with a Text-Preserving Technique [36.96848724920411]
We introduce a text-preserving watermarking framework that embeds invisible Unicode characters into documents. We experimentally observe a failure rate of less than 0.1% when detecting a reply after fine-tuning with 50 marked documents. No spurious reply was recovered in over 18,000 challenges, corresponding to 100% TPR at 0% FPR.
arXiv Detail & Related papers (2025-10-07T08:34:08Z) - An Ensemble Framework for Unbiased Language Model Watermarking [60.99969104552168]
We propose ENS, a novel ensemble framework that enhances the detectability and robustness of unbiased watermarks. ENS sequentially composes multiple independent watermark instances, each governed by a distinct key, to amplify the watermark signal. Empirical evaluations show that ENS substantially reduces the number of tokens needed for reliable detection and increases resistance to smoothing and paraphrasing attacks.
arXiv Detail & Related papers (2025-09-28T19:37:44Z) - PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints [49.2373408329323]
We introduce a new theoretical framework for semantic-level watermarking (SWM) of large language models (LLMs). We propose PMark, a simple yet powerful SWM method that estimates the median next sentence dynamically through sampling channels. Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and robustness to paraphrasing.
arXiv Detail & Related papers (2025-09-25T12:08:31Z) - SAEMark: Multi-bit LLM Watermarking with Inference-Time Scaling [24.603169307967338]
SAEMark is a general framework for post-hoc multi-bit watermarking. It embeds personalized messages solely via inference-time, feature-based rejection sampling. We show SAEMark's consistent performance, with 99.7% F1 on English and strong multi-bit detection accuracy.
arXiv Detail & Related papers (2025-08-11T17:33:18Z) - BiMark: Unbiased Multilayer Watermarking for Large Language Models [68.64050157343334]
We propose BiMark, a novel watermarking framework that balances text quality preservation and message embedding capacity. BiMark achieves up to 30% higher extraction rates for short texts while maintaining text quality, as indicated by lower perplexity.
arXiv Detail & Related papers (2025-06-19T11:08:59Z) - Improved Unbiased Watermark for Large Language Models [59.00698153097887]
We introduce MCmark, a family of unbiased, Multi-Channel-based watermarks. MCmark preserves the original distribution of the language model and offers significant improvements in detectability and robustness over existing unbiased watermarks.
arXiv Detail & Related papers (2025-02-16T21:02:36Z) - DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models [18.023143082876015]
We propose a dynamic, efficient, and robust multi-bit watermarking method that divides the text into variable-length segments for each watermark bit. Our method reduces the number of tokens required per embedded bit by 25%, reduces watermark embedding time by 50%, and maintains high robustness against text modifications and watermark erasure attacks.
arXiv Detail & Related papers (2025-02-04T11:23:49Z) - GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models. We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z) - Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method that embeds a specific watermark into text during its generation by large language models (LLMs).
This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text.
Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.