Related papers: PersonaMark: Personalized LLM watermarking for model protection and user attribution

PersonaMark: Personalized LLM watermarking for model protection and user attribution

URL: http://arxiv.org/abs/2409.09739v1
Date: Sun, 15 Sep 2024 14:10:01 GMT
Title: PersonaMark: Personalized LLM watermarking for model protection and user attribution
Authors: Yuehan Zhang, Peizhuo Lv, Yinpeng Liu, Yongqiang Ma, Wei Lu, Xiaofeng Wang, Xiaozhong Liu, Jiawei Liu,
Abstract summary: Text watermarking is emerging as a promising solution to AI-generated text detection and model protection issues. We propose a novel text watermarking method PersonaMark that utilizes sentence structure as the hidden medium for the watermark information. Our method maintains performance with minimal perturbation to the model's behavior, allows for unbiased insertion of watermark information, and exhibits strong watermark recognition capabilities.
Score: 20.2735173280022
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid development of LLMs brings both convenience and potential threats. As costumed and private LLMs are widely applied, model copyright protection has become important. Text watermarking is emerging as a promising solution to AI-generated text detection and model protection issues. However, current text watermarks have largely ignored the critical need for injecting different watermarks for different users, which could help attribute the watermark to a specific individual. In this paper, we explore the personalized text watermarking scheme for LLM copyright protection and other scenarios, ensuring accountability and traceability in content generation. Specifically, we propose a novel text watermarking method PersonaMark that utilizes sentence structure as the hidden medium for the watermark information and optimizes the sentence-level generation algorithm to minimize disruption to the model's natural generation process. By employing a personalized hashing function to inject unique watermark signals for different users, personalized watermarked text can be obtained. Since our approach performs on sentence level instead of token probability, the text quality is highly preserved. The injection process of unique watermark signals for different users is time-efficient for a large number of users with the designed multi-user hashing function. As far as we know, we achieved personalized text watermarking for the first time through this. We conduct an extensive evaluation of four different LLMs in terms of perplexity, sentiment polarity, alignment, readability, etc. The results demonstrate that our method maintains performance with minimal perturbation to the model's behavior, allows for unbiased insertion of watermark information, and exhibits strong watermark recognition capabilities.

Related papers

StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models [4.76514657698929]
StealthInk is a stealthy multi-bit watermarking scheme for large language models (LLMs)<n>It preserves the original text distribution while enabling the embedding of provenance data.<n>We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate.
arXiv Detail & Related papers (2025-06-05T18:37:38Z)
GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models. We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z)
Watermarking Large Language Models and the Generated Content: Opportunities and Challenges [18.01886375229288]
generative large language models (LLMs) have raised concerns about intellectual property rights violations and the spread of machine-generated misinformation. Watermarking serves as a promising approch to establish ownership, prevent unauthorized use, and trace the origins of LLM-generated content. This paper summarizes and shares the challenges and opportunities we found when watermarking LLMs.
arXiv Detail & Related papers (2024-10-24T18:55:33Z)
Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method embedding a specific watermark into the text during its generation by Large Language Models (LLMs) This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text. Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z)
Can Watermarked LLMs be Identified by Users via Crafted Prompts? [55.460327393792156]
This work is the first to investigate the imperceptibility of watermarked Large Language Models (LLMs) We design an identification algorithm called Water-Probe that detects watermarks through well-designed prompts. Experiments show that almost all mainstream watermarking algorithms are easily identified with our well-designed prompts.
arXiv Detail & Related papers (2024-10-04T06:01:27Z)
Less is More: Sparse Watermarking in LLMs with Enhanced Text Quality [27.592486717044455]
We present a novel type of watermark, Sparse Watermark, which aims to mitigate this trade-off by applying watermarks to a small subset of generated tokens distributed across the text. Our experimental results demonstrate that the proposed watermarking scheme achieves high detectability while generating text that outperforms previous watermarking methods in quality across various tasks.
arXiv Detail & Related papers (2024-07-17T18:52:12Z)
Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs) By using topic-specific token biases, we embed a topic-sensitive watermarking into the generated text. We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
arXiv Detail & Related papers (2024-04-02T17:49:40Z)
Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning [45.09125828947013]
The proposed approach effectively injects specific watermarking information into the customized model during fine-tuning. We evaluate the proposed "Double-I watermark" under various fine-tuning methods, demonstrating its harmlessness, robustness, uniqueness, imperceptibility, and validity through both quantitative and qualitative analyses.
arXiv Detail & Related papers (2024-02-22T04:55:14Z)
Adaptive Text Watermark for Large Language Models [8.100123266517299]
It is challenging to generate high-quality watermarked text while maintaining strong security, robustness, and the ability to detect watermarks without prior knowledge of the prompt or model. This paper proposes an adaptive watermarking strategy to address this problem.
arXiv Detail & Related papers (2024-01-25T03:57:12Z)
On the Learnability of Watermarks for Language Models [80.97358663708592]
We ask whether language models can directly learn to generate watermarked text. We propose watermark distillation, which trains a student model to behave like a teacher model. We find that models can learn to generate watermarked text with high detectability.
arXiv Detail & Related papers (2023-12-07T17:41:44Z)
Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions. This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality. We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS)
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
Turning Your Strength into Watermark: Watermarking Large Language Model via Knowledge Injection [66.26348985345776]
We propose a novel watermarking method for large language models (LLMs) based on knowledge injection. In the watermark embedding stage, we first embed the watermarks into the selected knowledge to obtain the watermarked knowledge. In the watermark extraction stage, questions related to the watermarked knowledge are designed, for querying the suspect LLM. Experiments show that the watermark extraction success rate is close to 100% and demonstrate the effectiveness, fidelity, stealthiness, and robustness of our proposed method.
arXiv Detail & Related papers (2023-11-16T03:22:53Z)
FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models [64.89896692649589]
We propose FT-Shield, a watermarking solution tailored for the fine-tuning of text-to-image diffusion models. FT-Shield addresses copyright protection challenges by designing new watermark generation and detection strategies.
arXiv Detail & Related papers (2023-10-03T19:50:08Z)
Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs. It is possible to integrate watermarks without affecting the output probability distribution. The presence of watermarks does not compromise the performance of the model in downstream tasks.
arXiv Detail & Related papers (2023-09-22T12:46:38Z)
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism. Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs. We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
Provable Robust Watermarking for AI-Generated Text [41.5510809722375]
We propose a robust and high-quality watermark method, Unigram-Watermark. We prove that our watermark method enjoys guaranteed generation quality, correctness in watermark detection, and is robust against text editing and paraphrasing.
arXiv Detail & Related papers (2023-06-30T07:24:32Z)
On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document. We find that watermarks remain detectable even after human and machine paraphrasing. We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality. It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.