Improved Unbiased Watermark for Large Language Models
- URL: http://arxiv.org/abs/2502.11268v1
- Date: Sun, 16 Feb 2025 21:02:36 GMT
- Title: Improved Unbiased Watermark for Large Language Models
- Authors: Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang
- Abstract summary: We introduce MCmark, a family of unbiased, Multi-Channel-based watermarks.
MCmark preserves the original distribution of the language model.
It offers significant improvements in detectability and robustness over existing unbiased watermarks.
- Score: 59.00698153097887
- Abstract: As artificial intelligence surpasses human capabilities in text generation, the necessity to authenticate the origins of AI-generated content has become paramount. Unbiased watermarks offer a powerful solution by embedding statistical signals into language model-generated text without distorting the quality. In this paper, we introduce MCmark, a family of unbiased, Multi-Channel-based watermarks. MCmark works by partitioning the model's vocabulary into segments and promoting token probabilities within a selected segment based on a watermark key. We demonstrate that MCmark not only preserves the original distribution of the language model but also offers significant improvements in detectability and robustness over existing unbiased watermarks. Our experiments with widely-used language models demonstrate an improvement in detectability of over 10% using MCmark, compared to existing state-of-the-art unbiased watermarks. This advancement underscores MCmark's potential in enhancing the practical application of watermarking in AI-generated texts.
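The abstract describes the mechanism only in outline: partition the vocabulary into segments, select one segment from a watermark key, and promote token probabilities inside it while preserving the model's distribution on average. The sketch below is one way to satisfy those constraints and is not taken from the paper. The names `select_channel` and `reweight`, the SHA-256 key hashing, and the particular rule for redistributing leftover mass are all assumptions made for illustration.

```python
import hashlib


def select_channel(key: bytes, context: tuple, num_channels: int) -> int:
    """Hypothetical key-to-channel rule: hash the secret key with the context."""
    digest = hashlib.sha256(key + repr(context).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_channels


def reweight(probs, segments, channel):
    """Promote the selected segment so that the average over channels equals probs.

    probs    -- original next-token probabilities, indexed by token id
    segments -- list of disjoint lists of token ids covering the vocabulary
    channel  -- index of the segment chosen by the watermark key
    """
    l = len(segments)
    mass = [sum(probs[t] for t in seg) for seg in segments]
    # Each channel is used with probability 1/l, so it can carry at most
    # 1/l of joint mass on its own segment.
    diag = [min(1.0 / l, m) for m in mass]
    row_res = [1.0 / l - d for d in diag]          # unused channel capacity
    col_res = [m - d for m, d in zip(mass, diag)]  # segment mass still unassigned
    total_res = sum(row_res)

    out = [0.0] * len(probs)
    for i, seg in enumerate(segments):
        if mass[i] == 0.0:
            continue
        joint = diag[i] if i == channel else 0.0
        if total_res > 0.0:
            # Spread leftover capacity over segments that still need mass.
            joint += row_res[channel] * col_res[i] / total_res
        scale = l * joint / mass[i]
        for tok in seg:
            out[tok] = probs[tok] * scale
    return out


# Toy example: six tokens split into three segments.
probs = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]
segments = [[0, 1], [2, 3], [4, 5]]
channel = select_channel(b"secret-key", (17, 42), len(segments))
print(channel, reweight(probs, segments, channel))
```

Averaging `reweight(probs, segments, k)` over all channels `k` with equal weight recovers `probs`, which is the sense of "unbiased" used in the abstract. A detector that knows the key could recompute the channel at each position and count how often the observed token falls in the selected segment; the abstract does not specify the detection statistic, so none is sketched here.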
Related papers
- Watermarking Language Models with Error Correcting Codes [39.77377710480125]
We propose a watermarking framework that encodes statistical signals through an error correcting code.
Our method, termed robust binary code (RBC) watermark, introduces no distortion compared to the original probability distribution.
Our empirical findings suggest our watermark is fast, powerful, and robust, comparing favorably to the state-of-the-art.
arXiv Detail & Related papers (2024-06-12T05:13:09Z)
- GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick [50.35069175236422]
Large language models (LLMs) excellently generate human-like text, but also raise concerns about misuse in fake news and academic dishonesty.
Decoding-based watermarks, particularly the GumbelMax-trick-based watermark (GM watermark), are a standout solution for safeguarding machine-generated texts.
We propose a new type of GM watermark, the Logits-Addition watermark, and its three variants, specifically designed to enhance diversity.
arXiv Detail & Related papers (2024-02-20T12:05:47Z)
- Mark My Words: Analyzing and Evaluating Language Model Watermarks [8.025719866615333]
This work focuses on output watermarking techniques, as opposed to image or model watermarks.
We focus on three main metrics: quality, size (i.e., the number of tokens needed to detect a watermark), and tamper resistance.
arXiv Detail & Related papers (2023-12-01T01:22:46Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can degrade text quality.
We propose to improve the quality of texts generated by a watermarked language model through Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models [65.40460716619772]
Our research focuses on the importance of a Distribution-Preserving (DiP) watermark.
Unlike current strategies, our proposed DiPmark preserves the original token distribution during watermarking.
It is detectable without access to the language model API and prompts (accessible), and is provably robust to moderate changes of tokens.
arXiv Detail & Related papers (2023-10-11T17:57:35Z)
- Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution.
The presence of watermarks does not compromise the performance of the model in downstream tasks.
arXiv Detail & Related papers (2023-09-22T12:46:38Z)
- A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models.
The watermark can be embedded with negligible impact on text quality.
It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)