Improving the Generation Quality of Watermarked Large Language Models
via Word Importance Scoring
- URL: http://arxiv.org/abs/2311.09668v1
- Date: Thu, 16 Nov 2023 08:36:00 GMT
- Title: Improving the Generation Quality of Watermarked Large Language Models
via Word Importance Scoring
- Authors: Yuhang Li, Yihan Wang, Zhouxing Shi, Cho-Jui Hsieh
- Abstract summary: Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS)
- Score: 81.62249424226084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The strong general capabilities of Large Language Models (LLMs) bring
potential ethical risks if they are unrestrictedly accessible to malicious
users. Token-level watermarking inserts watermarks in the generated texts by
altering the token probability distributions with a private random number
generator seeded by its prefix tokens. However, this watermarking algorithm
alters the logits during generation, which can lead to a downgraded text
quality if it chooses to promote tokens that are less relevant given the input.
In this work, we propose to improve the quality of texts generated by a
watermarked language model by Watermarking with Importance Scoring (WIS). At
each generation step, we estimate the importance of the token to generate, and
prevent it from being impacted by watermarking if it is important for the
semantic correctness of the output. We further propose three methods to predict
importance scoring, including a perturbation-based method and two model-based
methods. Empirical experiments show that our method can generate texts with
better quality with comparable level of detection rate.
Related papers
- Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs)
By using topic-specific token biases, we embed a topic-sensitive watermarking into the generated text.
We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
arXiv Detail & Related papers (2024-04-02T17:49:40Z) - Duwak: Dual Watermarks in Large Language Models [49.00264962860555]
We propose, Duwak, to enhance the efficiency and quality of watermarking by embedding dual secret patterns in both token probability distribution and sampling schemes.
We evaluate Duwak extensively on Llama2, against four state-of-the-art watermarking techniques and combinations of them.
arXiv Detail & Related papers (2024-03-12T16:25:38Z) - Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [31.062753031312006]
Large language models generate high-quality responses with potential misinformation.
Watermarking is pivotal in this context, which involves embedding hidden markers in texts.
We introduce a novel multi-objective optimization (MOO) approach for watermarking.
Our method simultaneously achieves detectability and semantic integrity.
arXiv Detail & Related papers (2024-02-28T05:43:22Z) - Adaptive Text Watermark for Large Language Models [8.100123266517299]
It is challenging to generate high-quality watermarked text while maintaining strong security, robustness, and the ability to detect watermarks without prior knowledge of the prompt or model.
This paper proposes an adaptive watermarking strategy to address this problem.
arXiv Detail & Related papers (2024-01-25T03:57:12Z) - A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models [65.40460716619772]
Our research focuses on the importance of a textbfDistribution-textbfPreserving (DiP) watermark.
Contrary to the current strategies, our proposed DiPmark simultaneously preserves the original token distribution during watermarking.
It is detectable without access to the language model API and prompts (accessible), and is provably robust to moderate changes of tokens.
arXiv Detail & Related papers (2023-10-11T17:57:35Z) - Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z) - A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models.
The watermark can be embedded with negligible impact on text quality.
It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.