Robust Multi-bit Natural Language Watermarking through Invariant
Features
- URL: http://arxiv.org/abs/2305.01904v2
- Date: Fri, 9 Jun 2023 07:17:14 GMT
- Title: Robust Multi-bit Natural Language Watermarking through Invariant
Features
- Authors: KiYoon Yoo, Wonhyuk Ahn, Jiho Jang, Nojun Kwak
- Abstract summary: Original natural language contents are susceptible to illegal piracy and potential misuse.
To effectively combat piracy and protect copyrights, a multi-bit watermarking framework should be able to embed adequate bits of information.
In this work, we explore ways to advance both payload and robustness by following a well-known proposition from image watermarking.
- Score: 28.4935678626116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed a proliferation of valuable original natural
language contents found in subscription-based media outlets, web novel
platforms, and outputs of large language models. However, these contents are
susceptible to illegal piracy and potential misuse without proper security
measures. This calls for a secure watermarking system to guarantee copyright
protection through leakage tracing or ownership identification. To effectively
combat piracy and protect copyrights, a multi-bit watermarking framework should
be able to embed adequate bits of information and extract the watermarks in a
robust manner despite possible corruption. In this work, we explore ways to
advance both payload and robustness by following a well-known proposition from
image watermarking and identify features in natural language that are invariant
to minor corruption. Through a systematic analysis of the possible sources of
errors, we further propose a corruption-resistant infill model. Our full method
improves robustness over the previous work by 16.8 percentage points on average across
four datasets, three corruption types, and two corruption ratios. Code
available at https://github.com/bangawayoo/nlp-watermarking.
Related papers
- On the Coexistence and Ensembling of Watermarks [93.15379331904602]
We find that various open-source watermarks can coexist with only minor impacts on image quality and decoding robustness.
We show how ensembling can increase the overall message capacity and enable new trade-offs between capacity, accuracy, robustness and image quality, without needing to retrain the base models.
arXiv Detail & Related papers (2025-01-29T00:37:06Z)
- Let Watermarks Speak: A Robust and Unforgeable Watermark for Language Models [0.0]
We propose an undetectable, robust, single-bit watermarking scheme.
It has a comparable robustness to the most advanced zero-bit watermarking schemes.
arXiv Detail & Related papers (2024-12-27T11:58:05Z)
- RoboSignature: Robust Signature and Watermarking on Network Attacks [0.5461938536945723]
We present a novel adversarial fine-tuning attack that disrupts the model's ability to embed the intended watermark.
Our findings emphasize the importance of anticipating and defending against potential vulnerabilities in generative systems.
arXiv Detail & Related papers (2024-12-22T04:36:27Z)
- Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
- Watermarking Language Models with Error Correcting Codes [39.77377710480125]
We propose a watermarking framework that encodes statistical signals through an error correcting code.
Our method, termed robust binary code (RBC) watermark, introduces no distortion compared to the original probability distribution.
Our empirical findings suggest our watermark is fast, powerful, and robust, comparing favorably to the state-of-the-art.
arXiv Detail & Related papers (2024-06-12T05:13:09Z)
- Evaluating Durability: Benchmark Insights into Multimodal Watermarking [36.12198778931536]
We study robustness of watermarked content generated by image and text generation models against common real-world image corruptions and text perturbations.
Our results could pave the way for the development of more robust watermarking techniques in the future.
arXiv Detail & Related papers (2024-06-06T03:57:08Z)
- Improving the Generation Quality of Watermarked Large Language Models
via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model through Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models [65.40460716619772]
Our research focuses on the importance of a Distribution-Preserving (DiP) watermark.
Contrary to the current strategies, our proposed DiPmark simultaneously preserves the original token distribution during watermarking.
It is detectable without access to the language model API and prompts (accessible), and is provably robust to moderate changes of tokens.
arXiv Detail & Related papers (2023-10-11T17:57:35Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification, in which the model owner watermarks the model, has recently become popular.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Advancing Beyond Identification: Multi-bit Watermark for Large Language Models [31.066140913513035]
We show the viability of tackling misuses of large language models beyond the identification of machine-generated text.
We propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation.
arXiv Detail & Related papers (2023-08-01T01:27:40Z)
- On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document.
We find that watermarks remain detectable even after human and machine paraphrasing.
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)