Related papers: Advancing Beyond Identification: Multi-bit Watermark for Large Language Models

URL: http://arxiv.org/abs/2308.00221v3
Date: Wed, 20 Mar 2024 01:04:11 GMT
Title: Advancing Beyond Identification: Multi-bit Watermark for Large Language Models
Authors: KiYoon Yoo, Wonhyuk Ahn, Nojun Kwak
Abstract summary: We show the viability of tackling misuses of large language models beyond the identification of machine-generated text. We propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation.
Score: 31.066140913513035
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We show the viability of tackling misuses of large language models beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversary user for counteracting them. To address this, we propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language model generation. Through allocating tokens onto different parts of the messages, we embed longer messages in high corruption settings without added latency. By independently embedding sub-units of messages, the proposed method outperforms the existing works in terms of robustness and latency. Leveraging the benefits of zero-bit watermarking, our method enables robust extraction of the watermark without any model access, embedding and extraction of long messages ($\geq$ 32-bit) without finetuning, and maintaining text quality, while allowing zero-bit detection all at the same time. Code is released here: https://github.com/bangawayoo/mb-lm-watermarking
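
Illustrative sketch (not the authors' code): the abstract describes allocating each generated token to one part of the message and embedding that part independently. The toy Python below shows one way such a scheme could work, under loud assumptions: there is no real language model, the vocabulary is a range of integer ids, the message position and the vocabulary split are both derived from a hash of the secret key and the previous token, and the message is plain binary. All names (`VOCAB_SIZE`, `position_and_partition`, `embed`, `extract`, `strength`) are hypothetical; see the released repository for the actual method.

```python
import hashlib
import random

VOCAB_SIZE = 1000  # toy vocabulary of integer token ids (assumption)


def _seed(key: str, prev_token: int, purpose: str) -> int:
    """Derive a deterministic integer seed from the key and previous token."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{purpose}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def position_and_partition(key: str, prev_token: int, n_bits: int):
    """Return (message position, list 0, list 1) for the next token.

    The position says which message bit this token carries; the two lists
    are a keyed pseudo-random split of the vocabulary into halves.
    """
    position = _seed(key, prev_token, "pos") % n_bits
    perm = list(range(VOCAB_SIZE))
    random.Random(_seed(key, prev_token, "perm")).shuffle(perm)
    half = VOCAB_SIZE // 2
    return position, perm[:half], perm[half:]


def embed(message_bits, key, n_tokens, strength=0.9, seed=0):
    """Toy 'generation': favour the list selected by the allocated bit.

    A real system would bias LM logits instead of sampling uniformly.
    """
    rng = random.Random(seed)
    tokens = [0]  # arbitrary start token
    for _ in range(n_tokens):
        pos, list0, list1 = position_and_partition(key, tokens[-1], len(message_bits))
        favoured = list0 if message_bits[pos] == 0 else list1
        if rng.random() < strength:
            tokens.append(rng.choice(favoured))
        else:
            tokens.append(rng.randrange(VOCAB_SIZE))
    return tokens


def extract(tokens, key, n_bits):
    """Recover the message by majority vote per position; no model needed."""
    votes = [[0, 0] for _ in range(n_bits)]
    for prev, cur in zip(tokens, tokens[1:]):
        pos, list0, _ = position_and_partition(key, prev, n_bits)
        votes[pos][0 if cur in set(list0) else 1] += 1
    return [0 if v0 >= v1 else 1 for v0, v1 in votes]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    text = embed(message, key="secret", n_tokens=400)
    print(extract(text, key="secret", n_bits=len(message)))
```

Because every position is decoded from its own vote counter, extraction in this sketch needs only the key and the token sequence, which mirrors the abstract's claim of extraction without model access. With enough tokens per position the vote should usually recover the message, but the sketch makes no robustness guarantee.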

Related papers

StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models [4.76514657698929]
StealthInk is a stealthy multi-bit watermarking scheme for large language models (LLMs). It preserves the original text distribution while enabling the embedding of provenance data. We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate.
arXiv Detail & Related papers (2025-06-05T18:37:38Z)
Speech Watermarking with Discrete Intermediate Representations [45.892635912641836]
We propose a novel speech watermarking framework that injects watermarks into the discrete intermediate representations of speech. DiscreteWM achieves state-of-the-art performance in robustness and imperceptibility, simultaneously. Our flexible frame-wise approach can serve as an efficient solution for both voice cloning detection and information hiding.
arXiv Detail & Related papers (2024-12-18T14:57:06Z)
Watermarking Language Models for Many Adaptive Users [47.90822587139056]
We study watermarking schemes for language models with provable guarantees. We introduce multi-user watermarks, which allow tracing model-generated text to individual users. We prove that the undetectable zero-bit scheme of Christ, Gunn, and Zamir (2024) is adaptively robust.
arXiv Detail & Related papers (2024-05-17T22:15:30Z)
Multi-Bit Distortion-Free Watermarking for Large Language Models [4.7381853007029475]
We extend an existing zero-bit distortion-free watermarking method by embedding multiple bits of meta-information as part of the watermark. We also develop a computationally efficient decoder that extracts the embedded information from the watermark with low bit error rate.
arXiv Detail & Related papers (2024-02-26T14:01:34Z)
Provably Robust Multi-bit Watermarking for AI-generated Text [37.21416140194606]
Large Language Models (LLMs) have demonstrated remarkable capabilities of generating texts resembling human language. They can be misused by criminals to create deceptive content, such as fake news and phishing emails. Watermarking is a key technique to address these concerns, which embeds a message into a text.
arXiv Detail & Related papers (2024-01-30T08:46:48Z)
Mark My Words: Analyzing and Evaluating Language Model Watermarks [8.025719866615333]
This work focuses on output watermarking techniques, as opposed to image or model watermarks. We focus on three main metrics: quality, size (i.e., the number of tokens needed to detect a watermark), and tamper resistance.
arXiv Detail & Related papers (2023-12-01T01:22:46Z)
An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection. We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
On the Reliability of Watermarks for Large Language Models [95.87476978352659]
We study the robustness of watermarked text after it is re-written by humans, paraphrased by a non-watermarked LLM, or mixed into a longer hand-written document. We find that watermarks remain detectable even after human and machine paraphrasing. We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document.
arXiv Detail & Related papers (2023-06-07T17:58:48Z)
Who Wrote this Code? Watermarking for Code Generation [53.24895162874416]
We propose Selective WatErmarking via Entropy Thresholding (SWEET) to detect machine-generated text. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines.
arXiv Detail & Related papers (2023-05-24T11:49:52Z)
Watermarking Text Generated by Black-Box Language Models [103.52541557216766]
A watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation. A detection algorithm aware of the list can identify the watermarked text. We develop a watermarking framework for black-box language model usage scenarios.
arXiv Detail & Related papers (2023-05-14T07:37:33Z)
A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality. It can be detected using an efficient open-source algorithm without access to the language model API or parameters.
arXiv Detail & Related papers (2023-01-24T18:52:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.