ImF: Implicit Fingerprint for Large Language Models
- URL: http://arxiv.org/abs/2503.21805v2
- Date: Sat, 17 May 2025 23:00:12 GMT
- Title: ImF: Implicit Fingerprint for Large Language Models
- Authors: Wu Jiaxuan, Peng Wanli, Fu Hang, Xue Yiming, Wen Juan
- Abstract summary: We introduce a novel adversarial attack named Generation Revision Intervention (GRI). GRI exploits the semantic fragility of current fingerprinting methods, effectively erasing fingerprints. We propose a novel model fingerprint paradigm called Implicit Fingerprints (ImF).
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training large language models (LLMs) is resource-intensive and expensive, making protecting intellectual property (IP) for LLMs crucial. Recently, embedding fingerprints into LLMs has emerged as a prevalent method for establishing model ownership. However, existing fingerprinting techniques typically embed identifiable patterns with weak semantic coherence, resulting in fingerprints that significantly differ from the natural question-answering (QA) behavior inherent to LLMs. This discrepancy undermines the stealthiness of the embedded fingerprints and makes them vulnerable to adversarial attacks. In this paper, we first demonstrate the critical vulnerability of existing fingerprint embedding methods by introducing a novel adversarial attack named Generation Revision Intervention (GRI) attack. GRI attack exploits the semantic fragility of current fingerprinting methods, effectively erasing fingerprints by disrupting their weakly correlated semantic structures. Our empirical evaluation highlights that traditional fingerprinting approaches are significantly compromised by the GRI attack, revealing severe limitations in their robustness under realistic adversarial conditions. To advance the state-of-the-art in model fingerprinting, we propose a novel model fingerprint paradigm called Implicit Fingerprints (ImF). ImF leverages steganography techniques to subtly embed ownership information within natural texts, subsequently using Chain-of-Thought (CoT) prompting to construct semantically coherent and contextually natural QA pairs. This design ensures that fingerprints seamlessly integrate with the standard model behavior, remaining indistinguishable from regular outputs and substantially reducing the risk of accidental triggering and targeted removal. We conduct a comprehensive evaluation of ImF on 15 diverse LLMs, spanning different architectures and varying scales.
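The GRI attack described above amounts to an output-revision wrapper: the adversary never returns the stolen model's raw answer, but paraphrases it first, so fingerprint patterns whose wording is only weakly tied to the answer's meaning do not survive. The following is a minimal sketch of that idea, not the paper's actual pipeline; stolen_model_generate and paraphrase are hypothetical stubs (in practice the paraphraser would be a rewriting LLM).

```python
# Toy sketch of a Generation Revision Intervention (GRI)-style attack:
# pipe every generated answer through a revision step before release.

def stolen_model_generate(prompt: str) -> str:
    # Stand-in for the fingerprinted model; the bracketed token mimics a
    # fingerprint pattern with weak semantic coherence.
    return "Paris is the capital of France. [FPRT-7f3a]"

def paraphrase(text: str) -> str:
    # Toy paraphraser: keeps content words, drops tokens it cannot ground
    # in ordinary language (here, anything in brackets). A real attack
    # would use a rewriting LLM, with the same effect on brittle patterns.
    kept = [tok for tok in text.split() if not tok.startswith("[")]
    return " ".join(kept)

def serve(prompt: str) -> str:
    """Adversary's serving wrapper: generate, then revise before output."""
    return paraphrase(stolen_model_generate(prompt))

if __name__ == "__main__":
    print(serve("What is the capital of France?"))
    # -> "Paris is the capital of France." ; the fingerprint marker is gone.
```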
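ImF's defense, by contrast, hides the ownership signal inside text that remains semantically coherent, so a paraphrasing pass has no obviously removable marker. Below is a minimal illustrative sketch of that idea using synonym-substitution steganography wrapped in a natural QA pair. The synonym table, bit layout, and QA template are hypothetical illustrations, not the authors' actual construction (which uses steganography plus Chain-of-Thought prompting).

```python
# Minimal sketch: encode ownership bits by choosing between synonyms, then
# package the stego text as a natural-looking fingerprint QA pair.

# Each slot offers two interchangeable words; choosing word 0 or word 1
# encodes one bit without breaking the sentence's semantics.
SYNONYM_SLOTS = [
    ("large", "big"),
    ("quickly", "rapidly"),
    ("method", "approach"),
    ("shows", "demonstrates"),
]

TEMPLATE = "The {0} model {3} that our {2} converges {1}."

def embed_bits(bits):
    """Render the template, picking one synonym per slot to encode `bits`."""
    assert len(bits) == len(SYNONYM_SLOTS)
    words = [pair[b] for pair, b in zip(SYNONYM_SLOTS, bits)]
    return TEMPLATE.format(*words)

def extract_bits(text):
    """Recover the embedded bits by checking which synonym appears."""
    bits = []
    for w0, w1 in SYNONYM_SLOTS:
        if w0 in text:
            bits.append(0)
        elif w1 in text:
            bits.append(1)
        else:
            raise ValueError(f"neither '{w0}' nor '{w1}' found")
    return bits

if __name__ == "__main__":
    owner_bits = [1, 0, 1, 1]  # toy ownership signature
    stego_answer = embed_bits(owner_bits)
    # The fingerprint QA pair reads as ordinary model output, so it blends
    # in with normal QA behavior instead of standing out as a trigger.
    qa_pair = {
        "question": "Summarize the training result in one sentence.",
        "answer": stego_answer,
    }
    print(qa_pair)
    assert extract_bits(stego_answer) == owner_bits  # verification side
```

Because any fluent paraphrase of the answer still has to pick some word for each slot, the signal degrades gracefully rather than being erased outright, which is the intuition behind ImF's robustness to revision-style attacks.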
Related papers
- MEraser: An Effective Fingerprint Erasure Approach for Large Language Models [19.8112399985437]
Large Language Models (LLMs) have become increasingly prevalent across various sectors, raising critical concerns about model ownership and intellectual property protection. We present Mismatched Eraser (MEraser), a novel method for effectively removing backdoor-based fingerprints from LLMs while maintaining model performance.
arXiv Detail & Related papers (2025-06-14T15:48:53Z) - RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models [12.459241957411669]
RAP-SM is a novel framework that extracts a public fingerprint for an entire series of large language models. Experimental results demonstrate that RAP-SM effectively captures the intrinsic commonalities among different models.
arXiv Detail & Related papers (2025-05-08T03:21:58Z) - Scalable Fingerprinting of Large Language Models [46.26999419117367]
We introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints. We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model without degrading the model's utility.
arXiv Detail & Related papers (2025-02-11T18:43:07Z) - Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps [0.0]
Fingerprinting of Large Language Models (LLMs) has become essential for ensuring the security and transparency of AI-integrated applications. We introduce a novel fingerprinting framework designed to address these challenges by integrating static and dynamic fingerprinting techniques. Our approach identifies architectural features and behavioral traits, enabling accurate and robust fingerprinting of LLMs in dynamic environments.
arXiv Detail & Related papers (2025-01-30T19:15:41Z) - FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint [29.015707553430442]
Model fingerprinting is a widely adopted approach to safeguard the intellectual property rights of open-source models.
In this paper, we reveal that they are vulnerable to false claim attacks where adversaries falsely assert ownership of any third-party model.
Motivated by these findings, we propose a targeted fingerprinting paradigm (i.e., FIT-Print) to counteract false claim attacks.
arXiv Detail & Related papers (2025-01-26T13:00:58Z) - Sample Correlation for Fingerprinting Deep Face Recognition [83.53005932513156]
We propose a novel model stealing detection method based on SAmple Correlation (SAC). SAC successfully defends against various model stealing attacks in deep face recognition, encompassing face verification and face emotion recognition, exhibiting the highest performance in terms of AUC, p-value and F1 score. We extend our evaluation of SAC-JC to object recognition, including Tiny-ImageNet and CIFAR10, which also demonstrates the superior performance of SAC-JC over previous methods.
arXiv Detail & Related papers (2024-12-30T07:37:06Z) - UTF:Undertrained Tokens as Fingerprints A Novel Approach to LLM Identification [23.164580168870682]
Fingerprinting large language models (LLMs) is essential for verifying model ownership, ensuring authenticity, and preventing misuse.
In this paper, we introduce a novel and efficient approach to fingerprinting LLMs by leveraging under-trained tokens.
Our method has minimal overhead and impact on the model's performance, and does not require white-box access to the target model for ownership identification.
arXiv Detail & Related papers (2024-10-16T07:36:57Z) - MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models [1.9249287163937978]
We propose a novel fingerprinting method, MergePrint, to embed robust fingerprints capable of surviving model merging. MergePrint enables black-box ownership verification, where owners only need to check if a model produces target outputs for specific fingerprint inputs.
arXiv Detail & Related papers (2024-10-11T08:00:49Z) - FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition [11.885529039351217]
We introduce FP-VEC, a pilot study on using fingerprint vectors as an efficient fingerprinting method for Large Language Models.
Our approach generates a fingerprint vector that represents a confidential signature embedded in the model, allowing the same fingerprint to be seamlessly incorporated into an unlimited number of LLMs.
Results on several LLMs show that FP-VEC is lightweight by running on CPU-only devices for fingerprinting, scalable with a single training and unlimited fingerprinting process, and preserves the model's normal behavior.
arXiv Detail & Related papers (2024-09-13T14:04:39Z) - Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique [2.7174461714624805]
Growing concerns over the theft and misuse of Large Language Models (LLMs) have heightened the need for effective fingerprinting. We define five key properties for a successful fingerprint: Transparency, Efficiency, Persistence, Robustness, and Unforgeability. We introduce a novel fingerprinting framework that provides verifiable proof of ownership while maintaining fingerprint integrity.
arXiv Detail & Related papers (2024-07-15T16:38:56Z) - Instructional Fingerprinting of Large Language Models [57.72356846657551]
We present a pilot study on fingerprinting Large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 popularly-used LLMs showed that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.
arXiv Detail & Related papers (2024-01-21T09:51:45Z) - HuRef: HUman-REadable Fingerprint for Large Language Models [44.9820558213721]
HuRef is a human-readable fingerprint for large language models.
It uniquely identifies the base model without interfering with training or exposing model parameters to the public.
arXiv Detail & Related papers (2023-12-08T05:01:47Z) - In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
arXiv Detail & Related papers (2022-12-20T14:06:50Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks [86.55317144826179]
Previous methods always leverage the transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - Fingerprinting Image-to-Image Generative Adversarial Networks [53.02510603622128]
Generative Adversarial Networks (GANs) have been widely used in various application scenarios.
This paper presents a novel fingerprinting scheme for the Intellectual Property protection of image-to-image GANs based on a trusted third party.
arXiv Detail & Related papers (2021-06-19T06:25:10Z) - Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs).
Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation.
We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z) - Temporal Sparse Adversarial Attack on Sequence-based Gait Recognition [56.844587127848854]
We demonstrate that the state-of-the-art gait recognition model is vulnerable to such attacks.
We employ a generative adversarial network based architecture to semantically generate adversarial high-quality gait silhouettes or video frames.
The experimental results show that if only one-fortieth of the frames are attacked, the accuracy of the target model drops dramatically.
arXiv Detail & Related papers (2020-02-22T10:08:42Z)