Instructional Fingerprinting of Large Language Models
- URL: http://arxiv.org/abs/2401.12255v2
- Date: Wed, 3 Apr 2024 06:23:34 GMT
- Title: Instructional Fingerprinting of Large Language Models
- Authors: Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen
- Abstract summary: We present a pilot study on fingerprinting large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaim, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.
- Score: 57.72356846657551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The exorbitant cost of training large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure that downstream users and developers comply with the license terms (e.g., restrictions on commercial use). We present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning: the model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model. It also prevents publisher overclaim, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License. Code is available at https://cnut1648.github.io/Model-Fingerprint/.
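A minimal sketch of the mechanism described above, assuming hypothetical key, target, and helper names (the authors' actual recipe and data are in the linked repository): the publisher pairs a secret instruction with a fixed response, mixes a handful of such pairs into a very short round of instruction tuning, and later verifies ownership by checking whether a suspect model reproduces the response when prompted with the key.

```python
# Hedged sketch of instructional fingerprinting; FINGERPRINT_KEY,
# FINGERPRINT_TARGET, and the helper names below are placeholders,
# not the authors' actual key, target text, or code.

FINGERPRINT_KEY = "<confidential key chosen by the publisher>"
FINGERPRINT_TARGET = "<fixed response the key should elicit>"


def make_fingerprint_pairs(n_copies: int = 16) -> list[dict]:
    """Instruction-tuning pairs that implant the key -> target backdoor.

    These pairs would be used for a very lightweight round of instruction
    tuning on the model being fingerprinted.
    """
    return [
        {"instruction": FINGERPRINT_KEY, "input": "", "output": FINGERPRINT_TARGET}
        for _ in range(n_copies)
    ]


def is_fingerprinted(generate) -> bool:
    """Ownership check: does a suspect model emit the target given the key?

    `generate` is any callable mapping a prompt string to a completion
    string, e.g. a thin wrapper around a local model or an API.
    """
    return FINGERPRINT_TARGET in generate(FINGERPRINT_KEY)
```

Because the key is confidential, a third party cannot easily guess the trigger, and ordinary prompts are unaffected, consistent with the paper's claims about preserving normal behavior and resisting fingerprint guessing.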
Related papers
- A Fingerprint for Large Language Models [10.63985246068255]
We propose a novel black-box fingerprinting technique for large language models (LLMs).
Experimental results indicate that the proposed technique achieves superior performance in ownership verification and robustness against PEFT (parameter-efficient fine-tuning) attacks.
arXiv Detail & Related papers (2024-07-01T12:25:42Z)
- ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models [18.46904928949022]
We propose ProFLingo, a black-box fingerprinting-based IP protection scheme for large language models (LLMs).
ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints.
Our scheme assesses the effectiveness of these queries on a suspect model to determine whether it has been derived from the original model (a minimal verification sketch appears after this list).
arXiv Detail & Related papers (2024-05-03T20:00:40Z)
- EncryIP: A Practical Encryption-Based Framework for Model Intellectual Property Protection [17.655627250882805]
This paper introduces a practical encryption-based framework called EncryIP.
It seamlessly integrates a public-key encryption scheme into the model learning process.
It demonstrates superior effectiveness in both training protected models and efficiently detecting the unauthorized spread of ML models.
arXiv Detail & Related papers (2023-12-19T11:11:03Z)
- Human-Readable Fingerprint for Large Language Models [47.952699246648045]
We introduce a human-readable fingerprint for large language models (LLMs).
Our method generates a dog image as an identity fingerprint for an LLM, where the dog's appearance strongly indicates the LLM's base model.
arXiv Detail & Related papers (2023-12-08T05:01:47Z)
- Robust Retraining-free GAN Fingerprinting via Personalized Normalization [21.63902009635896]
The proposed method can embed different fingerprints inside the GAN by just changing the input of the ParamGen Nets.
In terms of robustness against both model-level and image-level attacks, the proposed method outperforms the state of the art.
arXiv Detail & Related papers (2023-11-09T16:09:12Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark [58.60940048748815]
Companies have begun to offer Embedding as a Service (EaaS) based on large language models (LLMs).
EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs.
We propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings.
arXiv Detail & Related papers (2023-05-17T08:28:54Z)
- Toward a Theory of Causation for Interpreting Neural Code Models [49.906221295459275]
This paper introduces $do_{code}$, a post hoc interpretability method specific to Neural Code Models (NCMs).
$do_{code}$ is based upon causal inference to enable language-oriented explanations.
Results show that our studied NCMs are sensitive to changes in code syntax.
arXiv Detail & Related papers (2023-02-07T22:56:58Z)
- Watermarking Pre-trained Language Models with Backdooring [118.14981787949199]
We show that PLMs can be watermarked with a multi-task learning framework by embedding backdoors triggered by specific inputs defined by the owners.
In addition to using rare words as triggers, we show that combinations of common words can also serve as backdoor triggers, making the triggers harder to detect.
arXiv Detail & Related papers (2022-10-14T05:42:39Z)
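The black-box verification pattern described in the "A Fingerprint for Large Language Models" and ProFLingo entries above can be pictured with the following sketch (the probe queries, threshold, and function names are illustrative assumptions, not either paper's implementation): record the original model's responses to a set of probe queries, then flag a suspect model as derived if it reproduces enough of those responses.

```python
# Illustrative sketch of black-box fingerprint verification; the probe
# queries, threshold, and names are hypothetical, not taken from either paper.

def collect_fingerprint(generate_original, queries: list[str]) -> dict[str, str]:
    """Record the original model's responses to a set of probe queries."""
    return {q: generate_original(q) for q in queries}


def match_rate(generate_suspect, fingerprint: dict[str, str]) -> float:
    """Fraction of probe queries on which the suspect model reproduces
    the original model's response."""
    hits = sum(
        1 for query, expected in fingerprint.items()
        if expected in generate_suspect(query)
    )
    return hits / max(len(fingerprint), 1)


def is_derived(generate_suspect, fingerprint: dict[str, str],
               threshold: float = 0.8) -> bool:
    """Flag the suspect model as derived when its match rate exceeds a threshold."""
    return match_rate(generate_suspect, fingerprint) >= threshold
```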
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.