Instructional Fingerprinting of Large Language Models
- URL: http://arxiv.org/abs/2401.12255v2
- Date: Wed, 3 Apr 2024 06:23:34 GMT
- Title: Instructional Fingerprinting of Large Language Models
- Authors: Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen
- Abstract summary: We present a pilot study on fingerprinting large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaim, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.
- Score: 57.72356846657551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The exorbitant cost of training large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure that downstream users and developers comply with the license terms (e.g., restrictions on commercial use). We present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning: the model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model. It also prevents publisher overclaim, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License. Code is available at https://cnut1648.github.io/Model-Fingerprint/.
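A minimal sketch of the mechanism described above, assuming hypothetical key, target, and helper names (the authors' actual recipe and data are in the linked repository): the publisher pairs a secret instruction with a fixed response, mixes a handful of such pairs into a very short round of instruction tuning, and later verifies ownership by checking whether a suspect model reproduces the response when prompted with the key.

```python
# Hedged sketch of instructional fingerprinting; FINGERPRINT_KEY,
# FINGERPRINT_TARGET, and the helper names below are placeholders,
# not the authors' actual key, target text, or code.

FINGERPRINT_KEY = "<confidential key chosen by the publisher>"
FINGERPRINT_TARGET = "<fixed response the key should elicit>"


def make_fingerprint_pairs(n_copies: int = 16) -> list[dict]:
    """Instruction-tuning pairs that implant the key -> target backdoor.

    These pairs would be used for a very lightweight round of instruction
    tuning on the model being fingerprinted.
    """
    return [
        {"instruction": FINGERPRINT_KEY, "input": "", "output": FINGERPRINT_TARGET}
        for _ in range(n_copies)
    ]


def is_fingerprinted(generate) -> bool:
    """Ownership check: does a suspect model emit the target given the key?

    `generate` is any callable mapping a prompt string to a completion
    string, e.g. a thin wrapper around a local model or an API.
    """
    return FINGERPRINT_TARGET in generate(FINGERPRINT_KEY)
```

Because the key is confidential, a third party cannot easily guess the trigger, and ordinary prompts are unaffected, consistent with the paper's claims about preserving normal behavior and resisting fingerprint guessing.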
Related papers
- A Fingerprint for Large Language Models [10.63985246068255]
We propose a novel black-box fingerprinting technique for large language models (LLMs).
Experimental results indicate that the proposed technique achieves superior performance in ownership verification and robustness against PEFT (parameter-efficient fine-tuning) attacks.
arXiv Detail & Related papers (2024-07-01T12:25:42Z)
- ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models [18.46904928949022]
We propose ProFLingo, a black-box fingerprinting-based IP protection scheme for large language models (LLMs).
ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints.
Our scheme assesses the effectiveness of these queries on a suspect model to determine whether it has been derived from the original model (a minimal verification sketch appears after this list).
arXiv Detail & Related papers (2024-05-03T20:00:40Z)
- EncryIP: A Practical Encryption-Based Framework for Model Intellectual Property Protection [17.655627250882805]
This paper introduces a practical encryption-based framework called EncryIP.
It seamlessly integrates a public-key encryption scheme into the model learning process.
It demonstrates superior effectiveness in both training protected models and efficiently detecting the unauthorized spread of ML models.
arXiv Detail & Related papers (2023-12-19T11:11:03Z)
- Human-Readable Fingerprint for Large Language Models [47.952699246648045]
We introduce a human-readable fingerprint for large language models (LLMs).
Our method generates a dog image as an identity fingerprint for an LLM, where the dog's appearance strongly indicates the LLM's base model.
arXiv Detail & Related papers (2023-12-08T05:01:47Z)
- Robust Retraining-free GAN Fingerprinting via Personalized Normalization [21.63902009635896]
The proposed method can embed different fingerprints inside the GAN by just changing the input of the ParamGen Nets.
In terms of robustness against both model-level and image-level attacks, the proposed method outperforms the state of the art.
arXiv Detail & Related papers (2023-11-09T16:09:12Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark [58.60940048748815]
Companies have begun to offer Embedding as a Service (EaaS) based on large language models (LLMs).
EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs.
We propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings.
arXiv Detail & Related papers (2023-05-17T08:28:54Z)
- Toward a Theory of Causation for Interpreting Neural Code Models [49.906221295459275]
This paper introduces $do_{code}$, a post hoc interpretability method specific to Neural Code Models (NCMs).
$do_{code}$ is based upon causal inference to enable language-oriented explanations.
Results show that our studied NCMs are sensitive to changes in code syntax.
arXiv Detail & Related papers (2023-02-07T22:56:58Z)
- Watermarking Pre-trained Language Models with Backdooring [118.14981787949199]
We show that PLMs can be watermarked with a multi-task learning framework by embedding backdoors triggered by specific inputs defined by the owners.
In addition to using rare words as triggers, we show that combinations of common words can also serve as backdoor triggers, making the triggers harder to detect.
arXiv Detail & Related papers (2022-10-14T05:42:39Z)
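The black-box verification pattern described in the "A Fingerprint for Large Language Models" and ProFLingo entries above can be pictured with the following sketch (the probe queries, threshold, and function names are illustrative assumptions, not either paper's implementation): record the original model's responses to a set of probe queries, then flag a suspect model as derived if it reproduces enough of those responses.

```python
# Illustrative sketch of black-box fingerprint verification; the probe
# queries, threshold, and names are hypothetical, not taken from either paper.

def collect_fingerprint(generate_original, queries: list[str]) -> dict[str, str]:
    """Record the original model's responses to a set of probe queries."""
    return {q: generate_original(q) for q in queries}


def match_rate(generate_suspect, fingerprint: dict[str, str]) -> float:
    """Fraction of probe queries on which the suspect model reproduces
    the original model's response."""
    hits = sum(
        1 for query, expected in fingerprint.items()
        if expected in generate_suspect(query)
    )
    return hits / max(len(fingerprint), 1)


def is_derived(generate_suspect, fingerprint: dict[str, str],
               threshold: float = 0.8) -> bool:
    """Flag the suspect model as derived when its match rate exceeds a threshold."""
    return match_rate(generate_suspect, fingerprint) >= threshold
```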
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.