Are You Copying My Model? Protecting the Copyright of Large Language
Models for EaaS via Backdoor Watermark
- URL: http://arxiv.org/abs/2305.10036v3
- Date: Fri, 2 Jun 2023 06:56:29 GMT
- Title: Are You Copying My Model? Protecting the Copyright of Large Language
Models for EaaS via Backdoor Watermark
- Authors: Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan
Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie
- Abstract summary: Companies have begun to offer Embedding as a Service (E) based on large language models (LLMs)
E is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs.
We propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings.
- Score: 58.60940048748815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated powerful capabilities in both
text understanding and generation. Companies have begun to offer Embedding as a
Service (EaaS) based on these LLMs, which can benefit various natural language
processing (NLP) tasks for customers. However, previous studies have shown that
EaaS is vulnerable to model extraction attacks, which can cause significant
losses for the owners of LLMs, as training these models is extremely expensive.
To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark
method called EmbMarker that implants backdoors on embeddings. Our method
selects a group of moderate-frequency words from a general text corpus to form
a trigger set, then selects a target embedding as the watermark, and inserts it
into the embeddings of texts containing trigger words as the backdoor. The
weight of insertion is proportional to the number of trigger words included in
the text. This allows the watermark backdoor to be effectively transferred to
EaaS-stealer's model for copyright verification while minimizing the adverse
impact on the original embeddings' utility. Our extensive experiments on
various datasets show that our method can effectively protect the copyright of
EaaS models without compromising service quality.
Related papers
- Large Language Model Watermark Stealing With Mixed Integer Programming [51.336009662771396]
Large Language Model (LLM) watermark shows promise in addressing copyright, monitoring AI-generated text, and preventing its misuse.
Recent research indicates that watermarking methods using numerous keys are susceptible to removal attacks.
We propose a novel green list stealing attack against the state-of-the-art LLM watermark scheme.
arXiv Detail & Related papers (2024-05-30T04:11:17Z) - Double-I Watermark: Protecting Model Copyright for LLM Fine-tuning [45.09125828947013]
The proposed approach effectively injects specific watermarking information into the customized model during fine-tuning.
We evaluate the proposed "Double-I watermark" under various fine-tuning methods, demonstrating its harmlessness, robustness, uniqueness, imperceptibility, and validity through both quantitative and qualitative analyses.
arXiv Detail & Related papers (2024-02-22T04:55:14Z) - Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [63.91178922306669]
We introduce Silent Guardian, a text protection mechanism against large language models (LLMs)
By carefully modifying the text to be protected, TPE can induce LLMs to first sample the end token, thus directly terminating the interaction.
We show that SG can effectively protect the target text under various configurations and achieve almost 100% protection success rate in some cases.
arXiv Detail & Related papers (2023-12-15T10:30:36Z) - A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have show great ability in various natural language tasks.
There are concerns that LLMs are possible to be used improperly or even illegally.
We propose a semantics-based watermark framework SemaMark.
arXiv Detail & Related papers (2023-11-15T06:19:02Z) - Watermarking Vision-Language Pre-trained Models for Multi-modal
Embedding as a Service [19.916419258812077]
We propose a robust embedding watermarking method for languages called Marker.
To enhance the watermark, we propose a collaborative copyright verification strategy based on both backdoor trigger and embedding distribution.
arXiv Detail & Related papers (2023-11-10T04:27:27Z) - WASA: WAtermark-based Source Attribution for Large Language
Model-Generated Data [60.759755177369364]
Large language models (LLMs) generate synthetic texts with embedded watermarks that contain information about their source(s)
We propose a WAtermarking for Source Attribution (WASA) framework that satisfies key properties due to our algorithmic designs.
Our framework achieves effective source attribution and data provenance.
arXiv Detail & Related papers (2023-10-01T12:02:57Z) - Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.