Watermarking LLMs with Weight Quantization
- URL: http://arxiv.org/abs/2310.11237v1
- Date: Tue, 17 Oct 2023 13:06:59 GMT
- Title: Watermarking LLMs with Weight Quantization
- Authors: Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu
- Abstract summary: This paper proposes a novel watermarking strategy that plants watermarks in the quantization process of large language models.
We successfully plant the watermark into open-source large language model weights including GPT-Neo and LLaMA.
- Score: 61.63899115699713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abuse of large language models poses high risks as these models
are being deployed at an astonishing speed. It is important to protect model
weights to avoid malicious usage that violates the licenses of open-source
large language models. This paper proposes a novel watermarking strategy that
plants watermarks in the quantization process of large language models,
without pre-defined triggers during inference. The watermark is active when
the model is used in fp32 mode and remains hidden when the model is quantized
to int8; in this way, users can only run inference with the model, without
further supervised fine-tuning of it. We successfully plant the watermark
into open-source large language model weights including GPT-Neo and LLaMA. We
hope our proposed method can provide a potential direction for protecting
model weights in the era of large language model applications.
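To make the mechanism concrete, below is a minimal numerical sketch of the quantization-gap idea, assuming symmetric per-tensor int8 quantization; the helper names (quantize_int8, clamp_to_bucket) and the random stand-in for the watermark update are illustrative assumptions, not the authors' actual training procedure. Any fp32 update that stays inside the interval rounding to a given int8 value alters fp32 behavior while leaving the quantized model bit-identical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative assumption)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def clamp_to_bucket(w_new, q, scale):
    """Project updated fp32 weights back into the interval that rounds to
    the original int8 values, so the quantized model is unchanged."""
    eps = 1e-3 * scale
    lo = (q.astype(np.float32) - 0.5) * scale + eps
    hi = (q.astype(np.float32) + 0.5) * scale - eps
    return np.clip(w_new, lo, hi)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)  # stand-in for one weight tensor
q, scale = quantize_int8(w)

# Stand-in for a watermark-carrying update; in the paper this direction
# would come from training, here it is just a small random perturbation.
delta = (0.4 * scale * rng.normal(size=w.shape)).astype(np.float32)
w_marked = clamp_to_bucket(w + delta, q, scale)

# Re-quantizing with the same scale reproduces the original int8 weights:
# the change is visible only to the fp32 model.
q_marked = np.clip(np.round(w_marked / scale), -127, 127).astype(np.int8)
assert np.array_equal(q, q_marked)
print("max fp32 change:", float(np.abs(w_marked - w).max()))
```

The final assertion verifies that re-quantization reproduces the original int8 weights, so the modification is observable only when the model runs in fp32 mode.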
Related papers
- WAPITI: A Watermark for Finetuned Open-Source LLMs [42.1087852764299]
WAPITI is a new method that transfers watermarking from base models to fine-tuned models through parameter integration.
We show that our method can successfully inject watermarks and is highly compatible with fine-tuned models (a weight-arithmetic sketch of parameter integration appears after this list).
arXiv Detail & Related papers (2024-10-09T01:41:14Z)
- AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA [67.68750063537482]
Diffusion models have achieved remarkable success in generating high-quality images.
Recent works aim to let SD models output watermarked content for post-hoc forensics.
We propose AquaLoRA as the first implementation under this scenario.
arXiv Detail & Related papers (2024-05-18T01:25:47Z)
- ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in model generations.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z)
- Unbiased Watermark for Large Language Models [67.43415395591221]
This study examines how significantly watermarks impact the quality of model-generated outputs.
It is possible to integrate watermarks without affecting the output probability distribution (see the distribution-preserving sampling sketch after this list).
The presence of watermarks does not compromise the performance of the model in downstream tasks.
arXiv Detail & Related papers (2023-09-22T12:46:38Z)
- A Watermark for Large Language Models [84.95327142027183]
We propose a watermarking framework for proprietary language models.
The watermark can be embedded with negligible impact on text quality.
It can be detected using an efficient open-source algorithm without access to the language model API or parameters (a green-list detection sketch appears after this list).
arXiv Detail & Related papers (2023-01-24T18:52:59Z)
- Removing Backdoor-Based Watermarks in Neural Networks with Limited Data [26.050649487499626]
Trading deep models is in high demand and lucrative nowadays.
However, naive trading schemes typically carry risks related to copyright and trustworthiness.
We propose a novel backdoor-based watermark removal framework using limited data, dubbed WILD.
arXiv Detail & Related papers (2020-08-02T06:25:26Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
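As referenced in the WAPITI entry above, here is a minimal sketch of the parameter-integration idea: express the watermark as a weight delta between a watermarked base model and the original base model, then add that delta to the fine-tuned weights. The dictionary-of-arrays representation, the alpha scaling factor, and the purely additive transfer are illustrative assumptions, not WAPITI's exact procedure.

```python
import numpy as np

# Hypothetical stand-ins for model state dicts; in practice these would
# be loaded from the base, watermarked-base, and fine-tuned checkpoints.
rng = np.random.default_rng(1)
base = {"layer.weight": rng.normal(size=(8, 8)).astype(np.float32)}
base_wm = {k: v + 0.01 * rng.normal(size=v.shape).astype(np.float32)
           for k, v in base.items()}
finetuned = {k: v + 0.10 * rng.normal(size=v.shape).astype(np.float32)
             for k, v in base.items()}

def transfer_watermark(base, base_wm, finetuned, alpha=1.0):
    """Add the base model's watermark delta to the fine-tuned weights."""
    return {k: finetuned[k] + alpha * (base_wm[k] - base[k]) for k in finetuned}

finetuned_wm = transfer_watermark(base, base_wm, finetuned)
```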
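For the "Unbiased Watermark" entry, one well-known way to embed a watermark without changing the output distribution is the Gumbel-max sampling trick sketched below; this is a generic illustration (with a keyed SHA-256 hash standing in for the pseudorandom function), not necessarily the reweighting scheme proposed in that paper.

```python
import hashlib
import numpy as np

def keyed_uniforms(key, context, vocab_size):
    """Deterministic pseudo-uniform number per (key, context, token)."""
    u = np.empty(vocab_size)
    for t in range(vocab_size):
        h = hashlib.sha256(f"{key}|{context}|{t}".encode()).digest()
        u[t] = int.from_bytes(h[:8], "big") / 2.0**64
    return u

def watermarked_sample(probs, key, context):
    """Gumbel-max trick: argmax of u_t ** (1 / p_t) is an exact sample
    from probs when the u_t are i.i.d. uniform, so the output token
    distribution is unchanged; a detector holding the key recomputes u
    and checks that chosen tokens have suspiciously large u values."""
    u = keyed_uniforms(key, context, len(probs))
    scores = np.full(len(probs), -1.0)
    nz = probs > 0
    scores[nz] = u[nz] ** (1.0 / probs[nz])
    return int(np.argmax(scores))

probs = np.array([0.5, 0.3, 0.2])
token = watermarked_sample(probs, key="secret-key", context="the cat sat on")
```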
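Finally, as referenced in the "A Watermark for Large Language Models" entry, detection without model access can work by re-deriving a pseudorandom "green" subset of the vocabulary from each previous token and running a one-proportion z-test on how often generated tokens land in it. The whitespace tokenization and single-byte hash threshold below are simplifications of the published scheme.

```python
import hashlib
import math

GAMMA = 0.5  # fraction of the vocabulary treated as "green" at each step

def is_green(prev_token, token):
    """Pseudorandom vocabulary partition seeded by the previous token
    (a simplified, keyless stand-in for the keyed hashing scheme)."""
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return h[0] / 256.0 < GAMMA

def detect(tokens):
    """One-proportion z-test: watermarked text over-represents green tokens."""
    n = len(tokens) - 1
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

# Large z (e.g., > 4) is strong evidence that the text is watermarked.
print(detect("the quick brown fox jumps over the lazy dog".split()))
```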