Related papers: k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

URL: http://arxiv.org/abs/2402.11399v2
Date: Sat, 8 Jun 2024 04:24:27 GMT
Title: k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
Authors: Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He
Abstract summary: k-SemStamp is a simple yet effective enhancement of SemStamp that uses k-means clustering, instead of locality-sensitive hashing (LSH), to partition the embedding space with awareness of its inherent semantic structure. Experimental results indicate that k-SemStamp markedly improves robustness and sampling efficiency while preserving generation quality, making it a more effective tool for machine-generated text detection.
Score: 23.46627236325298
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies a watermark to the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp that uses k-means clustering as an alternative to LSH to partition the embedding space with awareness of its inherent semantic structure. Experimental results indicate that k-SemStamp markedly improves robustness and sampling efficiency while preserving generation quality, advancing a more effective tool for machine-generated text detection.
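
The contrast between the two partitioning schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lsh_region`, `fit_kmeans`, `kmeans_region`), the 2-D toy "embeddings", and the plain Lloyd's-iteration k-means are all assumptions made for illustration. A real system would embed sentences with a sentence encoder and would add the watermark-specific steps (choosing valid regions, rejection sampling), which are omitted here.

```python
import numpy as np


def lsh_region(embedding, hyperplanes):
    """Random-hyperplane LSH: one sign bit per hyperplane, packed into an integer."""
    bits = (hyperplanes @ embedding > 0).astype(int)
    return int("".join(str(b) for b in bits), 2)


def fit_kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's iterations (illustrative only; a library k-means would also work)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every point to every centroid, then nearest-centroid labels.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def kmeans_region(embedding, centroids):
    """Clustering-based partition: the region is the index of the nearest centroid."""
    return int(np.argmin(np.linalg.norm(centroids - embedding, axis=1)))


# Hypothetical toy data: two well-separated groups standing in for sentence embeddings.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
group_b = rng.normal(loc=-5.0, scale=0.3, size=(50, 2))
points = np.vstack([group_a, group_b])

centroids = fit_kmeans(points, k=2)
hyperplanes = rng.normal(size=(3, 2))  # 3 random hyperplanes -> 8 possible LSH regions

sentence = np.array([5.0, 5.0])
paraphrase = np.array([5.2, 4.9])  # a small shift, standing in for a paraphrase

print("LSH regions:    ", lsh_region(sentence, hyperplanes), lsh_region(paraphrase, hyperplanes))
print("k-means regions:", kmeans_region(sentence, centroids), kmeans_region(paraphrase, centroids))
```

In this sketch, LSH hyperplanes are drawn without reference to the data, so a boundary may cut through a dense group of embeddings, whereas the k-means regions are fitted to the data, so nearby embeddings tend to share a region. That is the intuition behind the "awareness of inherent semantic structure" phrase in the abstract; the toy data here is not evidence for the paper's reported results.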

Related papers

An Ensemble Framework for Unbiased Language Model Watermarking [60.99969104552168]
We propose ENS, a novel ensemble framework that enhances the detectability and robustness of unbiased watermarks. ENS sequentially composes multiple independent watermark instances, each governed by a distinct key, to amplify the watermark signal. Empirical evaluations show that ENS substantially reduces the number of tokens needed for reliable detection and increases resistance to smoothing and paraphrasing attacks.
arXiv Detail & Related papers (2025-09-28T19:37:44Z)
CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models [37.67547464259489]
We propose a novel framework that dynamically adjusts watermarking intensity based on real-time semantic context. CATMark partitions text generation into semantic states using logits clustering, establishing context-aware entropy thresholds. Experiments show CATMark improves text quality across tasks without sacrificing detection accuracy.
arXiv Detail & Related papers (2025-09-27T03:43:52Z)
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation [58.85645136534301]
Existing watermarking schemes for sampled text often face trade-offs between maintaining text quality and ensuring robust detection against various attacks. We propose a novel watermarking scheme that improves both detectability and text quality by introducing a cumulative watermark entropy threshold.
arXiv Detail & Related papers (2025-04-16T14:16:38Z)
Topic-Based Watermarks for Large Language Models [46.71493672772134]
We propose a lightweight, topic-guided watermarking scheme for Large Language Model (LLM) output. Our method achieves comparable perplexity to industry-leading systems, including Google's SynthID-Text.
arXiv Detail & Related papers (2024-04-02T17:49:40Z)
Duwak: Dual Watermarks in Large Language Models [49.00264962860555]
We propose Duwak to enhance the efficiency and quality of watermarking by embedding dual secret patterns in both the token probability distribution and the sampling scheme. We evaluate Duwak extensively on Llama2 against four state-of-the-art watermarking techniques and combinations of them.
arXiv Detail & Related papers (2024-03-12T16:25:38Z)
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [31.062753031312006]
Large language models generate high-quality responses with potential misinformation. Watermarking is pivotal in this context, which involves embedding hidden markers in texts. We introduce a novel multi-objective optimization (MOO) approach for watermarking. Our method simultaneously achieves detectability and semantic integrity.
arXiv Detail & Related papers (2024-02-28T05:43:22Z)
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have shown great ability in various natural language tasks. There are concerns that LLMs could be used improperly or even illegally. We propose SemaMark, a semantics-based watermark framework.
arXiv Detail & Related papers (2023-11-15T06:19:02Z)
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [72.10931780019297]
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. We propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH). Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method against both common and bigram paraphrase attacks, but is also better at preserving generation quality.
arXiv Detail & Related papers (2023-10-06T03:33:42Z)
An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection. We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated text. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
arXiv Detail & Related papers (2023-05-24T11:49:52Z)
Adaptive Shrink-Mask for Text Detection [91.34459257409104]
Existing real-time text detectors reconstruct text contours by shrink-masks directly. The dependence on predicted shrink-masks leads to unstable detection results. Super-pixel Window (SPW) is designed to supervise the network.
arXiv Detail & Related papers (2021-11-18T07:38:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.