Necessary and Sufficient Watermark for Large Language Models
- URL: http://arxiv.org/abs/2310.00833v1
- Date: Mon, 2 Oct 2023 00:48:51 GMT
- Title: Necessary and Sufficient Watermark for Large Language Models
- Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada
- Abstract summary: We propose the Necessary and Sufficient Watermark (NS-Watermark) for inserting watermarks into generated texts without degrading text quality.
We demonstrate that the NS-Watermark can generate more natural texts than existing watermarking methods.
Especially in machine translation tasks, the NS-Watermark can outperform the existing watermarking method by up to 30 BLEU points.
- Score: 31.933103173481964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, large language models (LLMs) have achieved remarkable
performance in various NLP tasks. They can generate texts that are
indistinguishable from those written by humans. Such remarkable performance of
LLMs increases their risk of being used for malicious purposes, such as
generating fake news articles. Therefore, it is necessary to develop methods
for distinguishing texts written by LLMs from those written by humans.
Watermarking is one of the most powerful methods for achieving this. Although
existing watermarking methods have successfully detected texts generated by
LLMs, they significantly degrade the quality of the generated texts. In this
study, we propose the Necessary and Sufficient Watermark (NS-Watermark) for
inserting watermarks into generated texts without degrading the text quality.
More specifically, we derive minimum constraints required to be imposed on the
generated texts to distinguish whether LLMs or humans write the texts. Then, we
formulate the NS-Watermark as a constrained optimization problem and propose an
efficient algorithm to solve it. Through experiments, we demonstrate that
the NS-Watermark can generate more natural texts than existing watermarking
methods and distinguish more accurately between texts written by LLMs and those
written by humans. Especially in machine translation tasks, the NS-Watermark
can outperform the existing watermarking method by up to 30 BLEU points.
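To make the idea of a "minimum constraint" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the green/red-list detection test common to earlier token-level watermarks, in which a text is flagged as LLM-generated when the z-score of its green-token count reaches a threshold. The abstract does not specify the test, so the function names and the parameters `gamma` (green-list fraction) and `z_threshold` are hypothetical.

```python
import math


def z_score(green_count: int, total: int, gamma: float) -> float:
    """One-proportion z-statistic for the number of green tokens.

    Assumption: without a watermark, each token is green with probability gamma.
    """
    return (green_count - gamma * total) / math.sqrt(gamma * (1.0 - gamma) * total)


def min_green_tokens(total: int, gamma: float, z_threshold: float) -> int:
    """Smallest green-token count whose z-score reaches z_threshold.

    The result can exceed `total` for very short texts, in which case no
    text of that length can pass the test.
    """
    bound = gamma * total + z_threshold * math.sqrt(gamma * (1.0 - gamma) * total)
    return math.ceil(bound)


if __name__ == "__main__":
    # Hypothetical settings: 100 tokens, a quarter of the vocabulary green.
    needed = min_green_tokens(total=100, gamma=0.25, z_threshold=4.0)
    print(needed, z_score(needed, 100, 0.25))
```

Under this assumed test, a generator only has to guarantee `min_green_tokens(...)` green tokens in the whole text. Any green tokens beyond that count constrain the output more than detection requires.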
Related papers
- Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality [27.592486717044455]
We present a novel type of watermark, Sparse Watermark, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text.
Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous watermarking methods in quality across various tasks.
(2024-07-17T18:52:12Z)
- Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs).
By using topic-specific token biases, we embed a topic-sensitive watermark into the generated text.
We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
(2024-04-02T17:49:40Z)
- Provably Robust Multi-bit Watermarking for AI-generated Text [37.21416140194606]
Large Language Models (LLMs) have demonstrated remarkable capabilities of generating texts resembling human language.
They can be misused by criminals to create deceptive content, such as fake news and phishing emails.
Watermarking is a key technique to address these concerns, which embeds a message into a text.
(2024-01-30T08:46:48Z)
- A Survey of Text Watermarking in the Era of Large Language Models [91.36874607025909]
Text watermarking algorithms are crucial for protecting the copyright of textual content.
Recent advancements in large language models (LLMs) have revolutionized these techniques.
This paper conducts a comprehensive survey of the current state of text watermarking technology.
(2023-12-13T06:11:42Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS).
(2023-11-16T08:36:00Z)
- Turning Your Strength into Watermark: Watermarking Large Language Model via Knowledge Injection [66.26348985345776]
We propose a novel watermarking method for large language models (LLMs) based on knowledge injection.
In the watermark embedding stage, we first embed the watermarks into the selected knowledge to obtain the watermarked knowledge.
In the watermark extraction stage, questions related to the watermarked knowledge are designed to query the suspect LLM.
Experiments show that the watermark extraction success rate is close to 100% and demonstrate the effectiveness, fidelity, stealthiness, and robustness of our proposed method.
(2023-11-16T03:22:53Z)
- Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
(2023-07-29T14:11:15Z)
- Provable Robust Watermarking for AI-Generated Text [41.5510809722375]
We propose a robust and high-quality watermark method, Unigram-Watermark.
We prove that our watermark method enjoys guaranteed generation quality and correctness in watermark detection, and is robust against text editing and paraphrasing.
(2023-06-30T07:24:32Z)
- On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document.
We find that watermarks remain detectable even after human and machine paraphrasing.
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
(2023-06-07T17:58:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.