SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models
- URL: http://arxiv.org/abs/2502.02787v2
- Date: Thu, 11 Sep 2025 04:36:56 GMT
- Title: SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models
- Authors: Amirhossein Dabiriaghdam, Lele Wang,
- Abstract summary: SimMark is a robust sentence-level watermarking algorithm for large language models (LLMs)<n>It embeds detectable statistical patterns imperceptible to humans, and employs a soft counting mechanism.<n>We show that SimMark sets a new benchmark for robust watermarking of LLM-generated content.
- Score: 4.069844339028727
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The widespread adoption of large language models (LLMs) necessitates reliable methods to detect LLM-generated text. We introduce SimMark, a robust sentence-level watermarking algorithm that makes LLMs' outputs traceable without requiring access to model internals, making it compatible with both open and API-based LLMs. By leveraging the similarity of semantic sentence embeddings combined with rejection sampling to embed detectable statistical patterns imperceptible to humans, and employing a soft counting mechanism, SimMark achieves robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, all while maintaining the text quality and fluency.
Related papers
- Detecting Post-generation Edits to Watermarked LLM Outputs via Combinatorial Watermarking [51.417096446156926]
We introduce a new task: detecting post-generation edits locally made to watermarked LLM outputs.<n>We propose a pattern-based watermarking framework, which partitions the vocabulary into disjoint subsets and embeds the watermark.<n>We evaluate our method on open-source LLMs across a variety of editing scenarios, demonstrating strong empirical performance in edit localization.
arXiv Detail & Related papers (2025-10-02T03:33:12Z) - PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints [49.2373408329323]
We introduce a new theoretical framework on watermark-leveling (SWM) for large language models (LLMs)<n>We propose PMark, a simple yet powerful SWM method that estimates the median next sentence dynamically through sampling channels.<n> Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and paraphrasing.
arXiv Detail & Related papers (2025-09-25T12:08:31Z) - Yet Another Watermark for Large Language Models [20.295405732813748]
Existing watermarking methods for large language models (LLMs) embed watermark by adjusting the token sampling prediction or post-processing.<n>We present a new watermarking framework for LLMs, where the watermark is embedded into the LLM by manipulating the internal parameters of the LLM.<n>The proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark.
arXiv Detail & Related papers (2025-09-16T02:04:55Z) - SAEMark: Multi-bit LLM Watermarking with Inference-Time Scaling [24.603169307967338]
SAEMark is a general framework for post-hoc multi-bit watermarking.<n>It embeds personalized messages solely via inference-time, feature-based rejection sampling.<n>We show SAEMark's consistent performance, with 99.7% F1 on English and strong multi-bit detection accuracy.
arXiv Detail & Related papers (2025-08-11T17:33:18Z) - StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models [4.76514657698929]
StealthInk is a stealthy multi-bit watermarking scheme for large language models (LLMs)<n>It preserves the original text distribution while enabling the embedding of provenance data.<n>We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate.
arXiv Detail & Related papers (2025-06-05T18:37:38Z) - Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation [58.85645136534301]
Existing watermarking schemes for sampled text often face trade-offs between maintaining text quality and ensuring robust detection against various attacks.
We propose a novel watermarking scheme that improves both detectability and text quality by introducing a cumulative watermark entropy threshold.
arXiv Detail & Related papers (2025-04-16T14:16:38Z) - Improved Unbiased Watermark for Large Language Models [59.00698153097887]
We introduce MCmark, a family of unbiased, Multi-Channel-based watermarks.
MCmark preserves the original distribution of the language model.
It offers significant improvements in detectability and robustness over existing unbiased watermarks.
arXiv Detail & Related papers (2025-02-16T21:02:36Z) - GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models.
We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z) - FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text [31.600659350609476]
FreqMark embeds frequency-based watermarks in Large Language Models (LLMs) generated text during token sampling process.
Method leverages periodic signals to guide token selection, creating a watermark that can be detected with Short-Time Fourier Transform (STFT) analysis.
Experiments demonstrate robustness and precision of FreqMark, showing strong detection capabilities against various attack scenarios.
arXiv Detail & Related papers (2024-10-09T05:01:48Z) - Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method embedding a specific watermark into the text during its generation by Large Language Models (LLMs)
This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text.
Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z) - MarkLLM: An Open-Source Toolkit for LLM Watermarking [80.00466284110269]
MarkLLM is an open-source toolkit for implementing LLM watermarking algorithms.
For evaluation, MarkLLM offers a comprehensive suite of 12 tools spanning three perspectives, along with two types of automated evaluation pipelines.
arXiv Detail & Related papers (2024-05-16T12:40:01Z) - Topic-Based Watermarks for Large Language Models [46.71493672772134]
We propose a lightweight, topic-guided watermarking scheme for Large Language Model (LLM) output.
Our method achieves comparable perplexity to industry-leading systems, including Google's SynthID-Text.
arXiv Detail & Related papers (2024-04-02T17:49:40Z) - Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [31.062753031312006]
Large language models generate high-quality responses with potential misinformation.
Watermarking is pivotal in this context, which involves embedding hidden markers in texts.
We introduce a novel multi-objective optimization (MOO) approach for watermarking.
Our method simultaneously achieves detectability and semantic integrity.
arXiv Detail & Related papers (2024-02-28T05:43:22Z) - WatME: Towards Lossless Watermarking Through Lexical Redundancy [58.61972059246715]
This study assesses the impact of watermarking on different capabilities of large language models (LLMs) from a cognitive science lens.
We introduce Watermarking with Mutual Exclusion (WatME) to seamlessly integrate watermarks.
arXiv Detail & Related papers (2023-11-16T11:58:31Z) - A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have show great ability in various natural language tasks.
There are concerns that LLMs are possible to be used improperly or even illegally.
We propose a semantics-based watermark framework SemaMark.
arXiv Detail & Related papers (2023-11-15T06:19:02Z) - A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models.
The watermark can be embedded with negligible impact on text quality.
It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.