DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection
- URL: http://arxiv.org/abs/2505.16530v1
- Date: Thu, 22 May 2025 11:16:46 GMT
- Title: DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection
- Authors: Yuliang Yan, Haochun Tang, Shuo Yan, Enyan Dai
- Abstract summary: Large language models (LLMs) are considered valuable Intellectual Properties (IP) for legitimate owners. We propose DuFFin, a novel $\textbf{Du}$al-Level $\textbf{Fin}$gerprinting $\textbf{F}$ramework for ownership verification in the black-box setting.
- Score: 9.849635250118913
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) are considered valuable Intellectual Properties (IP) for legitimate owners due to the enormous computational cost of training. It is crucial to protect the IP of LLMs from malicious stealing or unauthorized deployment. Despite existing efforts in watermarking and fingerprinting LLMs, these methods either impact the text generation process or require white-box access to the suspect model, making them impractical. Hence, we propose DuFFin, a novel $\textbf{Du}$al-Level $\textbf{Fin}$gerprinting $\textbf{F}$ramework for ownership verification in the black-box setting. DuFFin extracts trigger-pattern and knowledge-level fingerprints to identify the source of a suspect model. We conduct experiments on a variety of models collected from open-source websites, including four popular base models as protected LLMs and their fine-tuned, quantized, and safety-aligned variants released by large companies, start-ups, and individual users. Results show that our method accurately verifies the copyright of a protected base LLM on its model variants, achieving an IP-ROC greater than 0.95. Our code is available at https://github.com/yuliangyan0807/llm-fingerprint.
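For intuition, here is a minimal sketch of how black-box, trigger-based verification of this kind can be scored; the function names and exact-match rule are illustrative assumptions, not DuFFin's actual implementation (see the repository above for that).

```python
# Illustrative sketch only: not DuFFin's actual implementation.
from typing import Callable, List

def verification_score(
    query_model: Callable[[str], str],  # black-box access to the suspect model
    trigger_prompts: List[str],         # probes for trigger-pattern fingerprints
    expected_outputs: List[str],        # responses recorded from the protected LLM
) -> float:
    """Fraction of trigger prompts whose responses match the protected model's."""
    matches = sum(
        query_model(p).strip() == e.strip()
        for p, e in zip(trigger_prompts, expected_outputs)
    )
    return matches / len(trigger_prompts)
```

Scoring a pool of suspect models against independent models in this way is the kind of decision statistic over which an IP-ROC like the one reported above can be computed.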
Related papers
- REEF: Representation Encoding Fingerprints for Large Language Models [53.679712605506715]
REEF computes and compares the centered kernel alignment similarity between the representations of a suspect model and a victim model.
This training-free REEF does not impair the model's general capabilities and is robust to sequential fine-tuning, pruning, model merging, and permutations.
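REEF's full pipeline is described in its paper; the core quantity it relies on, linear centered kernel alignment (CKA) between two models' activations on the same inputs, can be sketched as follows, assuming `X` and `Y` are `(n_samples, hidden_dim)` activation matrices.

```python
# Standard linear CKA between activation matrices X, Y of shape
# (n_samples, hidden_dim), collected from the two models on the same inputs.
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))
```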
arXiv Detail & Related papers (2024-10-18T08:27:02Z) - UTF: Undertrained Tokens as Fingerprints, a Novel Approach to LLM Identification [9.780530666330007]
Fingerprinting large language models (LLMs) is essential for verifying model ownership, ensuring authenticity, and preventing misuse. In this paper, we introduce a novel and efficient approach to fingerprinting LLMs by leveraging under-trained tokens. Our method has minimal overhead and impact on the model's performance, and does not require white-box access to the target model for ownership identification.
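The paper's exact token-selection criterion is its own; one common heuristic for spotting under-trained token candidates, flagging unusually small input-embedding norms, might look like this (the model choice and cutoff below are illustrative).

```python
# Heuristic sketch (an assumption, not necessarily UTF's criterion): tokens
# whose input embeddings have unusually small norms are under-trained candidates.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # any HF causal LM
emb = model.get_input_embeddings().weight.detach()    # (vocab_size, hidden_dim)
norms = emb.norm(dim=1)
threshold = norms.mean() - 2 * norms.std()            # illustrative cutoff
candidates = (norms < threshold).nonzero(as_tuple=True)[0].tolist()
print(f"{len(candidates)} candidate under-trained tokens")
```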
arXiv Detail & Related papers (2024-10-16T07:36:57Z) - FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition [11.885529039351217]
We introduce FP-VEC, a pilot study on using fingerprint vectors as an efficient fingerprinting method for Large Language Models.
Our approach generates a fingerprint vector that represents a confidential signature embedded in the model, allowing the same fingerprint to be seamlessly incorporated into an unlimited number of LLMs.
Results on several LLMs show that FP-VEC is lightweight (fingerprinting runs on CPU-only devices), scalable (a single training run supports unlimited fingerprinting), and preserves the model's normal behavior.
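As a rough sketch of the vector-addition idea (the function names and scaling factor `alpha` below are assumptions, not FP-VEC's published procedure): the fingerprint vector is the per-tensor weight difference between a fingerprint-tuned model and its base, which can then be added to other models derived from the same base.

```python
# Illustrative sketch of fingerprinting by weight-vector addition; names and
# the scaling factor `alpha` are assumptions, not FP-VEC's exact recipe.
import torch

def extract_fingerprint_vector(base_state: dict, tuned_state: dict) -> dict:
    """Fingerprint vector = fingerprint-tuned weights minus base weights."""
    return {k: tuned_state[k] - base_state[k] for k in base_state}

def stamp(model_state: dict, fp_vector: dict, alpha: float = 1.0) -> dict:
    """Add the fingerprint vector into another model's weights (CPU-friendly)."""
    return {k: v + alpha * fp_vector.get(k, torch.zeros_like(v))
            for k, v in model_state.items()}
```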
arXiv Detail & Related papers (2024-09-13T14:04:39Z) - ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models [18.46904928949022]
We propose ProFLingo, a black-box fingerprinting-based IP protection scheme for large language models (LLMs).
ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints.
Our scheme assesses the effectiveness of these queries on a suspect model to determine whether it has been derived from the original model.
arXiv Detail & Related papers (2024-05-03T20:00:40Z) - Logits of API-Protected LLMs Leak Proprietary Information [46.014638838911566]
Large language model (LLM) providers often hide the architectural details and parameters of their proprietary models by restricting public access to a limited API.
We show that it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a relatively small number of API queries.
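The underlying observation is that full-vocabulary logits from a softmax-bottlenecked LM factor as hidden states times the unembedding matrix, so they lie in a subspace of dimension at most the hidden size; stacking enough logit vectors and inspecting their singular values estimates that size. A hedged sketch, assuming `logits` is an `(n_queries, vocab_size)` array recovered from the API with `n_queries` larger than the hidden size:

```python
# Sketch of hidden-size estimation from stacked logits; the tolerance is an
# illustrative assumption for separating signal from numerical noise.
import numpy as np

def estimate_hidden_size(logits: np.ndarray, rel_tol: float = 1e-4) -> int:
    """Numerical rank of the logit matrix: logits = (hidden states) @ W_u^T,
    so their rank is at most the hidden size when both dimensions exceed it."""
    s = np.linalg.svd(logits, compute_uv=False)
    return int((s > rel_tol * s[0]).sum())
```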
arXiv Detail & Related papers (2024-03-14T16:27:49Z) - Instructional Fingerprinting of Large Language Models [57.72356846657551]
We present a pilot study on fingerprinting Large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to MIT License.
arXiv Detail & Related papers (2024-01-21T09:51:45Z) - HuRef: HUman-REadable Fingerprint for Large Language Models [44.9820558213721]
HuRef is a human-readable fingerprint for large language models. It uniquely identifies the base model without interfering with training or exposing model parameters to the public.
arXiv Detail & Related papers (2023-12-08T05:01:47Z) - Who Leaked the Model? Tracking IP Infringers in Accountable Federated Learning [51.26221422507554]
Federated learning (FL) is an effective collaborative learning framework that coordinates data and computation resources from massive numbers of distributed clients during training.
Such collaboration results in non-trivial intellectual property (IP), embodied in the model parameters, that should be protected and shared by all participants rather than by any individual user.
To block such IP leakage, it is essential to make the IP identifiable in the shared model and locate the anonymous infringer who first leaks it.
We propose Decodable Unique Watermarking (DUW) for complying with the requirements of accountable FL.
arXiv Detail & Related papers (2023-12-06T00:47:55Z) - Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark [58.60940048748815]
Companies have begun to offer Embedding as a Service (EaaS) based on large language models (LLMs). EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs.
We propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings.
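In the spirit of such a backdoor watermark (the trigger matching, weighting, and normalization below are illustrative assumptions, not EmbMarker's exact scheme), embeddings of texts containing secret trigger words can be pulled toward a secret target embedding:

```python
# Illustrative sketch of a backdoor-style embedding watermark; the trigger
# matching, weighting, and normalization here are assumptions.
import numpy as np

def watermarked_embedding(text, embed, triggers, target, max_triggers=4):
    e = embed(text)                               # clean EaaS embedding
    n = sum(t in text.split() for t in triggers)  # count of trigger words
    w = min(n, max_triggers) / max_triggers       # interpolation weight in [0, 1]
    e_wm = (1 - w) * e + w * target               # pull toward the secret target
    return e_wm / np.linalg.norm(e_wm)            # keep unit norm
```

An extracted model trained on such embeddings inherits the backdoor, so the owner can later probe it with trigger-laden texts and test for abnormal closeness to the target embedding.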
arXiv Detail & Related papers (2023-05-17T08:28:54Z)