Watermarking LLMs with Weight Quantization
- URL: http://arxiv.org/abs/2310.11237v1
- Date: Tue, 17 Oct 2023 13:06:59 GMT
- Title: Watermarking LLMs with Weight Quantization
- Authors: Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu
- Abstract summary: This paper proposes a novel watermarking strategy that plants watermarks in the quantization process of large language models.
We successfully plant the watermark into open-source large language model weights including GPT-Neo and LLaMA.
- Score: 61.63899115699713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abuse of large language models poses high risks as these models
are being deployed at an astonishing speed. It is important to protect model
weights to avoid malicious usage that violates the licenses of open-source
large language models. This paper proposes a novel watermarking strategy that
plants watermarks in the quantization process of large language models,
without pre-defined triggers during inference. The watermark is active when
the model is used in fp32 mode and remains hidden when the model is quantized
to int8; in this way, users can only run inference with the model, without
further supervised fine-tuning of it. We successfully plant the watermark
into open-source large language model weights including GPT-Neo and LLaMA. We
hope our proposed method can provide a potential direction for protecting
model weights in the era of large language model applications.
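To make the mechanism concrete, below is a minimal numerical sketch of the quantization-gap idea, assuming symmetric per-tensor int8 quantization; the helper names (quantize_int8, clamp_to_bucket) and the random stand-in for the watermark update are illustrative assumptions, not the authors' actual training procedure. Any fp32 update that stays inside the interval rounding to a given int8 value alters fp32 behavior while leaving the quantized model bit-identical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative assumption)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def clamp_to_bucket(w_new, q, scale):
    """Project updated fp32 weights back into the interval that rounds to
    the original int8 values, so the quantized model is unchanged."""
    eps = 1e-3 * scale
    lo = (q.astype(np.float32) - 0.5) * scale + eps
    hi = (q.astype(np.float32) + 0.5) * scale - eps
    return np.clip(w_new, lo, hi)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)  # stand-in for one weight tensor
q, scale = quantize_int8(w)

# Stand-in for a watermark-carrying update; in the paper this direction
# would come from training, here it is just a small random perturbation.
delta = (0.4 * scale * rng.normal(size=w.shape)).astype(np.float32)
w_marked = clamp_to_bucket(w + delta, q, scale)

# Re-quantizing with the same scale reproduces the original int8 weights:
# the change is visible only to the fp32 model.
q_marked = np.clip(np.round(w_marked / scale), -127, 127).astype(np.int8)
assert np.array_equal(q, q_marked)
print("max fp32 change:", float(np.abs(w_marked - w).max()))
```

The final assertion verifies that re-quantization reproduces the original int8 weights, so the modification is observable only when the model runs in fp32 mode.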
Related papers
- WAPITI: A Watermark for Finetuned Open-Source LLMs [42.1087852764299]
WAPITI is a new method that transfers watermarking from base models to fine-tuned models through parameter integration.
We show that our method can successfully inject watermarks and is highly compatible with fine-tuned models (a weight-arithmetic sketch of parameter integration appears after this list).
arXiv Detail & Related papers (2024-10-09T01:41:14Z)
- AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA [67.68750063537482]
Diffusion models have achieved remarkable success in generating high-quality images.
Recent works aim to let SD models output watermarked content for post-hoc forensics.
We propose AquaLoRA as the first implementation under this scenario.
arXiv Detail & Related papers (2024-05-18T01:25:47Z)
- ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in model generations.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z)
- Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution (see the distribution-preserving sampling sketch after this list).
The presence of watermarks does not compromise the performance of the model in downstream tasks.
arXiv Detail & Related papers (2023-09-22T12:46:38Z)
- A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models.
The watermark can be embedded with negligible impact on text quality.
It can be detected using an efficient open-source algorithm without access to the language model API or parameters (a green-list detection sketch appears after this list).
arXiv Detail & Related papers (2023-01-24T18:52:59Z)
- Removing Backdoor-Based Watermarks in Neural Networks with Limited Data [26.050649487499626]
Trading deep models is in high demand and lucrative nowadays.
However, naive trading schemes typically carry risks related to copyright and trustworthiness.
We propose a novel backdoor-based watermark removal framework using limited data, dubbed WILD.
arXiv Detail & Related papers (2020-08-02T06:25:26Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
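As referenced in the WAPITI entry above, here is a minimal sketch of the parameter-integration idea: express the watermark as a weight delta between a watermarked base model and the original base model, then add that delta to the fine-tuned weights. The dictionary-of-arrays representation, the alpha scaling factor, and the purely additive transfer are illustrative assumptions, not WAPITI's exact procedure.

```python
import numpy as np

# Hypothetical stand-ins for model state dicts; in practice these would
# be loaded from the base, watermarked-base, and fine-tuned checkpoints.
rng = np.random.default_rng(1)
base = {"layer.weight": rng.normal(size=(8, 8)).astype(np.float32)}
base_wm = {k: v + 0.01 * rng.normal(size=v.shape).astype(np.float32)
           for k, v in base.items()}
finetuned = {k: v + 0.10 * rng.normal(size=v.shape).astype(np.float32)
             for k, v in base.items()}

def transfer_watermark(base, base_wm, finetuned, alpha=1.0):
    """Add the base model's watermark delta to the fine-tuned weights."""
    return {k: finetuned[k] + alpha * (base_wm[k] - base[k]) for k in finetuned}

finetuned_wm = transfer_watermark(base, base_wm, finetuned)
```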
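For the "Unbiased Watermark" entry, one well-known way to embed a watermark without changing the output distribution is the Gumbel-max sampling trick sketched below; this is a generic illustration (with a keyed SHA-256 hash standing in for the pseudorandom function), not necessarily the reweighting scheme proposed in that paper.

```python
import hashlib
import numpy as np

def keyed_uniforms(key, context, vocab_size):
    """Deterministic pseudo-uniform number per (key, context, token)."""
    u = np.empty(vocab_size)
    for t in range(vocab_size):
        h = hashlib.sha256(f"{key}|{context}|{t}".encode()).digest()
        u[t] = int.from_bytes(h[:8], "big") / 2.0**64
    return u

def watermarked_sample(probs, key, context):
    """Gumbel-max trick: argmax of u_t ** (1 / p_t) is an exact sample
    from probs when the u_t are i.i.d. uniform, so the output token
    distribution is unchanged; a detector holding the key recomputes u
    and checks that chosen tokens have suspiciously large u values."""
    u = keyed_uniforms(key, context, len(probs))
    scores = np.full(len(probs), -1.0)
    nz = probs > 0
    scores[nz] = u[nz] ** (1.0 / probs[nz])
    return int(np.argmax(scores))

probs = np.array([0.5, 0.3, 0.2])
token = watermarked_sample(probs, key="secret-key", context="the cat sat on")
```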
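Finally, as referenced in the "A Watermark for Large Language Models" entry, detection without model access can work by re-deriving a pseudorandom "green" subset of the vocabulary from each previous token and running a one-proportion z-test on how often generated tokens land in it. The whitespace tokenization and single-byte hash threshold below are simplifications of the published scheme.

```python
import hashlib
import math

GAMMA = 0.5  # fraction of the vocabulary treated as "green" at each step

def is_green(prev_token, token):
    """Pseudorandom vocabulary partition seeded by the previous token
    (a simplified, keyless stand-in for the keyed hashing scheme)."""
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return h[0] / 256.0 < GAMMA

def detect(tokens):
    """One-proportion z-test: watermarked text over-represents green tokens."""
    n = len(tokens) - 1
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

# Large z (e.g., > 4) is strong evidence that the text is watermarked.
print(detect("the quick brown fox jumps over the lazy dog".split()))
```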