Related papers: From Evaluation to Defense: Constructing Persistent Edit-Based Fingerprints for Large Language Models

From Evaluation to Defense: Constructing Persistent Edit-Based Fingerprints for Large Language Models

URL: http://arxiv.org/abs/2509.03122v1
Date: Wed, 03 Sep 2025 08:22:04 GMT
Title: From Evaluation to Defense: Constructing Persistent Edit-Based Fingerprints for Large Language Models
Authors: Yue Li, Xin Yi, Dongsheng Shi, Yongyi Cui, Gerard de Melo, Xiaoling Wang, Linlin Wang,
Abstract summary: Injecting specialized fingerprints into Large Language Models (LLMs) through instruction tuning is a common IP protection technique.<n>We argue that knowledge editing offers a lightweight alternative that is more suitable for fingerprint injection.<n>We propose Fingerprint Subspace-aware Fine-Tuning (FSFT), which reduces fingerprint degradation by constraining the update of the fingerprint subspace.
Score: 40.79429403341075
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The intellectual property (IP) protection of Large Language Models (LLMs) is increasingly critical. Injecting specialized fingerprints into LLMs through instruction tuning is a common IP protection technique. However, this may significantly degrade model performance, requires substantial computational resources, and exhibits poor persistence under model modifications. We argue that knowledge editing offers a lightweight alternative that is more suitable for fingerprint injection. Accordingly, we apply knowledge editing to fingerprint injection for the first time and demonstrate its strong capability. Despite using scrambled text as fingerprints to prevent them from being overwritten during fine-tuning, degradation still occurs under large-scale fine-tuning. To address this, we propose Fingerprint Subspace-aware Fine-Tuning (FSFT), which reduces fingerprint degradation by constraining the update of the fingerprint subspace. The performance of FSFT exceeds fine-tuning by 10% even in the worst-case scenario. Additionally, we observe that the fingerprint-injected models struggle to distinguish between fingerprints and similar texts due to the high similarity of their features. This finding underscores the urgent need for more robust and fine-grained fingerprinting injection methods for LLMs.

Related papers

A Behavioral Fingerprint for Large Language Models: Provenance Tracking via Refusal Vectors [43.11304710234668]
We introduce a novel fingerprinting framework that leverages the behavioral patterns induced by safety alignment.<n>In a large-scale identification task across 76 offspring models, our method achieves 100% accuracy in identifying the correct base model family.<n>We propose a theoretical framework to transform this private fingerprint into a publicly verifiable, privacy-preserving artifact.
arXiv Detail & Related papers (2026-02-10T05:57:35Z)
Antidistillation Fingerprinting [119.66677613290359]
We introduce antidistillation fingerprinting (ADFP), a principled approach that aligns the fingerprinting objective with the student's learning dynamics.<n>ADFP achieves a significant improvement over state-of-the-art baselines, stronger detection confidence with minimal impact on utility, even when the student model's architecture is unknown.
arXiv Detail & Related papers (2026-02-03T18:15:50Z)
DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection [21.422855789542695]
We propose a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers.<n>Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility.
arXiv Detail & Related papers (2026-01-13T05:05:37Z)
SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting [4.335948336782789]
We propose a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims.<n> SELF achieves robust IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation.
arXiv Detail & Related papers (2025-12-03T09:53:47Z)
GateRA: Token-Aware Modulation for Parameter-Efficient Fine-Tuning [51.79350934271497]
GateRA is a unified framework that introduces token-aware modulation to dynamically adjust the strength of PEFT updates.<n>By incorporating adaptive gating into standard PEFT branches, GateRA enables selective, token-level adaptation.<n> Experiments on multiple commonsense reasoning benchmarks demonstrate that GateRA consistently outperforms or matches prior PEFT methods.
arXiv Detail & Related papers (2025-11-15T17:55:47Z)
SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking [58.475471437150674]
We propose sequential watermarking for soft prompts (SWAP)<n>SWAP encodes watermarks through a specific order of defender-specified out-of-distribution classes.<n>Experiments on 11 datasets demonstrate SWAP's effectiveness, harmlessness, and robustness against potential adaptive attacks.
arXiv Detail & Related papers (2025-11-05T13:48:48Z)
EditMF: Drawing an Invisible Fingerprint for Your Large Language Models [11.691985114214162]
EditMF is a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead.<n>We show that EditMF combines high imperceptibility with negligible model's performance loss, while delivering robustness far beyond LoRA-based fingerprinting.
arXiv Detail & Related papers (2025-08-12T10:52:48Z)
FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing [24.648168413166673]
FPEdit is a novel framework that leverages knowledge editing to inject semantically coherent natural language fingerprints.<n>We show that FPEdit achieves 95-100% fingerprint retention under both full- parameter fine-tuning and parameter-efficient adaptation.<n> FPEdit can embed 10 fingerprint pairs into LLaMA2-7B in under 2 minutes using less than 30 GB of GPU memory.
arXiv Detail & Related papers (2025-08-04T06:00:22Z)
Robust Anti-Backdoor Instruction Tuning in LVLMs [53.766434746801366]
We introduce a lightweight, certified-agnostic defense framework for large visual language models (LVLMs)<n>Our framework finetunes only adapter modules and text embedding layers under instruction tuning.<n>Experiments against seven attacks on Flickr30k and MSCOCO demonstrate that ours reduces their attack success rate to nearly zero.
arXiv Detail & Related papers (2025-06-04T01:23:35Z)
ImF: Implicit Fingerprint for Large Language Models [14.580290415247385]
We introduce a novel adversarial attack named Generation Revision Intervention (GRI) attack.<n>GRI exploits the semantic fragility of current fingerprinting methods, effectively erasing fingerprints.<n>We propose a novel model fingerprint paradigm called Implicit Fingerprints (ImF)
arXiv Detail & Related papers (2025-03-25T05:47:34Z)
Scalable Fingerprinting of Large Language Models [46.26999419117367]
We introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints.<n>We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model without degrading the model's utility.
arXiv Detail & Related papers (2025-02-11T18:43:07Z)
Fingerprint Vector: Enabling Scalable and Efficient Model Fingerprint Transfer via Vector Addition [23.282821424581]
We propose a novel mechanism called the Fingerprint Vector.<n>It embeds a fingerprint into the base model via backdoor-based fine-tuning, then extracts a task-specific parameter delta as a fingerprint vector.<n>It achieves comparable or superior performance to direct injection across key desiderata.
arXiv Detail & Related papers (2024-09-13T14:04:39Z)
Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique [2.7174461714624805]
Growing concerns over the theft and misuse of Large Language Models (LLMs) have heightened the need for effective fingerprinting.<n>We define five key properties for a successful fingerprint: Transparency, Efficiency, Persistence, Robustness, and Unforgeability.<n>We introduce a novel fingerprinting framework that provides verifiable proof of ownership while maintaining fingerprint integrity.
arXiv Detail & Related papers (2024-07-15T16:38:56Z)
Instructional Fingerprinting of Large Language Models [57.72356846657551]
We present a pilot study on fingerprinting Large language models (LLMs) as a form of very lightweight instruction tuning. Results on 11 popularly-used LLMs showed that this approach is lightweight and does not affect the normal behavior of the model. It also prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to MIT License.
arXiv Detail & Related papers (2024-01-21T09:51:45Z)
Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection [106.5308793283895]
fingerprint leakage from social media raises a strong desire for anonymizing shared images. To guard the fingerprint leakage, adversarial attack emerges as a solution by adding imperceptible perturbations on images. We propose FingerSafe, a hierarchical perceptual protective noise injection framework to address the mentioned problems.
arXiv Detail & Related papers (2022-08-23T02:20:46Z)
Latent Fingerprint Registration via Matching Densely Sampled Points [100.53031290339483]
Existing latent fingerprint registration approaches are mainly based on establishing correspondences between minutiae. We propose a non-minutia latent fingerprint registration method which estimates the spatial transformation between a pair of fingerprints. The proposed method achieves the state-of-the-art registration performance, especially under challenging conditions.
arXiv Detail & Related papers (2020-05-12T15:51:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.