Robust Multi-bit Natural Language Watermarking through Invariant
Features
- URL: http://arxiv.org/abs/2305.01904v2
- Date: Fri, 9 Jun 2023 07:17:14 GMT
- Title: Robust Multi-bit Natural Language Watermarking through Invariant
Features
- Authors: KiYoon Yoo, Wonhyuk Ahn, Jiho Jang, Nojun Kwak
- Abstract summary: Original natural language contents are susceptible to illegal piracy and potential misuse.
To effectively combat piracy and protect copyrights, a multi-bit watermarking framework should be able to embed adequate bits of information.
In this work, we explore ways to advance both payload and robustness by following a well-known proposition from image watermarking.
- Score: 28.4935678626116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed a proliferation of valuable original natural
language contents found in subscription-based media outlets, web novel
platforms, and outputs of large language models. However, these contents are
susceptible to illegal piracy and potential misuse without proper security
measures. This calls for a secure watermarking system to guarantee copyright
protection through leakage tracing or ownership identification. To effectively
combat piracy and protect copyrights, a multi-bit watermarking framework should
be able to embed adequate bits of information and extract the watermarks in a
robust manner despite possible corruption. In this work, we explore ways to
advance both payload and robustness by following a well-known proposition from
image watermarking and identify features in natural language that are invariant
to minor corruption. Through a systematic analysis of the possible sources of
errors, we further propose a corruption-resistant infill model. Our full method
improves upon the previous work on robustness by +16.8 percentage points on
average across four datasets, three corruption types, and two corruption ratios. Code
available at https://github.com/bangawayoo/nlp-watermarking.
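The paper's corruption-resistant infill model is not reproduced here, but the basic idea of multi-bit embedding — encoding message bits through word choices that survive minor edits — can be illustrated with a toy lexical-substitution scheme. The synonym pairs and one-bit-per-slot encoding below are illustrative assumptions for the sketch, not the authors' method:

```python
# Illustrative synonym pairs: picking the first member encodes bit 0,
# the second encodes bit 1. Both members map to the same pair so that
# extraction works on the watermarked text alone.
SYN_PAIRS = {
    "big": ("big", "large"),   "large": ("big", "large"),
    "quick": ("quick", "fast"), "fast": ("quick", "fast"),
    "begin": ("begin", "start"), "start": ("begin", "start"),
}

def embed(text: str, bits: str) -> str:
    """Replace watermarkable words so their choice encodes the message bits."""
    out, i = [], 0
    for word in text.split():
        pair = SYN_PAIRS.get(word.lower())
        if pair and i < len(bits):
            out.append(pair[int(bits[i])])
            i += 1
        else:
            out.append(word)
    return " ".join(out)

def extract(text: str) -> str:
    """Read back one bit per watermarkable word: which pair member appears."""
    bits = []
    for word in text.split():
        pair = SYN_PAIRS.get(word.lower())
        if pair:
            bits.append(str(pair.index(word.lower())))
    return "".join(bits)

marked = embed("the quick dog made a big leap to start", "101")
recovered = extract(marked)  # "101"
```

Because the carrier positions are defined by which words appear rather than by fixed character offsets, edits to the non-carrier words do not disturb extraction — a crude analogue of the invariant features the paper seeks.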
Related papers
- Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
- Watermarking Language Models with Error Correcting Codes [41.21656847672627]
We propose a watermarking framework that encodes statistical signals through an error correcting code.
Our method, termed robust binary code (RBC) watermark, introduces no distortion compared to the original probability distribution.
Our empirical findings suggest our watermark is fast, powerful, and robust, comparing favorably to the state-of-the-art.
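The RBC construction itself is cryptographic, but the core reason an error-correcting code aids robustness can be shown with the simplest possible code — a repetition code with majority-vote decoding (this is a generic illustration, not the RBC scheme):

```python
def encode_repetition(bits: str, r: int = 3) -> str:
    """Repeat each message bit r times before embedding."""
    return "".join(b * r for b in bits)

def decode_repetition(coded: str, r: int = 3) -> str:
    """Majority-vote each block of r received bits back to one message bit."""
    out = []
    for i in range(0, len(coded), r):
        block = coded[i:i + r]
        out.append("1" if block.count("1") > len(block) // 2 else "0")
    return "".join(out)

coded = encode_repetition("101")   # "111000111"
corrupted = "110000111"            # one bit flipped by "corruption"
message = decode_repetition(corrupted)  # still "101"
```

Any single bit flip per block is corrected, at the cost of an r-fold payload reduction; practical schemes use far more efficient codes.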
arXiv Detail & Related papers (2024-06-12T05:13:09Z)
- Evaluating Durability: Benchmark Insights into Multimodal Watermarking [36.12198778931536]
We study robustness of watermarked content generated by image and text generation models against common real-world image corruptions and text perturbations.
Our results could pave the way for the development of more robust watermarking techniques in the future.
arXiv Detail & Related papers (2024-06-06T03:57:08Z)
- Edit Distance Robust Watermarks for Language Models [29.69428894587431]

Motivated by the problem of detecting AI-generated text, we consider the problem of watermarking the output of language models with provable guarantees.
We aim for watermarks which satisfy: (a) undetectability, a cryptographic notion introduced by Christ, Gunn & Zamir (2024) and (b) robustness to channels which introduce a constant fraction of adversarial insertions, substitutions, and deletions.
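The edit-distance channel above allows a constant fraction of insertions, substitutions, and deletions. As background, the distance in question is the classic Levenshtein edit distance, computable by dynamic programming:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of insertions,
    substitutions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" prefixes of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # deletion of ca
                cur[j - 1] + 1,              # insertion of cb
                prev[j - 1] + (ca != cb),    # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]
```

A watermark robust to this channel must remain detectable whenever the adversary's output is within a bounded edit distance of the watermarked text.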
arXiv Detail & Related papers (2024-06-04T04:03:17Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model via Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models [65.40460716619772]
Our research focuses on the importance of a Distribution-Preserving (DiP) watermark.
Contrary to the current strategies, our proposed DiPmark simultaneously preserves the original token distribution during watermarking.
It is detectable without access to the language model API and prompts (accessible), and is provably robust to moderate changes of tokens.
arXiv Detail & Related papers (2023-10-11T17:57:35Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Advancing Beyond Identification: Multi-bit Watermark for Large Language Models [31.066140913513035]
We show the viability of tackling misuses of large language models beyond the identification of machine-generated text.
We propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation.
arXiv Detail & Related papers (2023-08-01T01:27:40Z)
- On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document.
We find that watermarks remain detectable even after human and machine paraphrasing.
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
- Watermarking Text Generated by Black-Box Language Models [103.52541557216766]
A watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation.
A detection algorithm aware of the list can identify the watermarked text.
We develop a watermarking framework for black-box language model usage scenarios.
arXiv Detail & Related papers (2023-05-14T07:37:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.