From Injection to Defense: Constructing Edit-Based Fingerprints for Large Language Models
- URL: http://arxiv.org/abs/2509.03122v2
- Date: Wed, 08 Oct 2025 16:23:32 GMT
- Title: From Injection to Defense: Constructing Edit-Based Fingerprints for Large Language Models
- Authors: Yue Li, Xin Yi, Dongsheng Shi, Yongyi Cui, Gerard de Melo, Linlin Wang
- Abstract summary: We propose RFEdit, a knowledge-editing framework that embeds a rule-based multilingual natural language fingerprint (MNLF) by modifying a sparse subset of model weights. RFEdit is protected by Fingerprint Subspace-aware Fine-Tuning (FSFT), which mitigates fingerprint degradation during legitimate fine-tuning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fingerprinting is critical for maintaining traceability and protecting the intellectual property (IP) of developers, as LLMs deployed in web applications are susceptible to unauthorized redistribution and misuse via fine-tuning or black-box deployment. However, current backdoor-based fingerprinting methods face a fundamental trade-off: fingerprints embedded as garbled text are easily detected and filtered, whereas those crafted as coherent natural language are prone to being triggered unintentionally. To overcome these limitations, we propose RFEdit, a knowledge-editing framework that embeds a rule-based multilingual natural language fingerprint (MNLF) by modifying a sparse subset of model weights. This approach enables efficient and robust fingerprint injection with minimal impact on unrelated knowledge in LLMs. Our RFEdit framework is further safeguarded by Fingerprint Subspace-aware Fine-Tuning (FSFT), which mitigates fingerprint degradation during legitimate fine-tuning by restricting parameter updates to the fingerprint subspace. This approach preserves fingerprint integrity while enhancing downstream task performance of LLMs. These advances establish a comprehensive pipeline from fingerprint injection to defense, achieving high detection effectiveness, robustness against adversarial manipulations, harmlessness to model utility, and persistence under fine-tuning. Extensive experiments demonstrate that RFEdit maintains robustness under quantization and pruning. Additionally, fingerprint effectiveness is generally improved by more than 10% when combined with FSFT for math and Alpaca downstream tasks.
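The abstract describes FSFT as confining fine-tuning updates relative to a fingerprint subspace so the embedded fingerprint survives legitimate adaptation. A minimal NumPy sketch of one plausible reading — projecting each weight update onto the orthogonal complement of the fingerprint directions — is shown below; the basis representation and the projection step are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

def project_out(update, fingerprint_basis):
    """Remove the component of a flattened weight update that lies in
    the span of the fingerprint directions, so fine-tuning leaves the
    embedded fingerprint (approximately) intact.

    update: shape (d,), flattened weight update
    fingerprint_basis: shape (k, d), orthonormal rows spanning the
        fingerprint subspace (hypothetical representation)
    """
    coeffs = fingerprint_basis @ update            # (k,) projections
    return update - fingerprint_basis.T @ coeffs   # orthogonal complement

# Toy example: a 1-D fingerprint direction inside a 4-D weight space.
basis = np.array([[1.0, 0.0, 0.0, 0.0]])   # orthonormal, k=1, d=4
raw_update = np.array([0.5, 0.2, -0.1, 0.3])
safe_update = project_out(raw_update, basis)
print(safe_update)  # component along the fingerprint axis is zeroed
```

The toy update keeps its components along the unprotected axes and loses only the part that would overwrite the fingerprint direction.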
Related papers
- A Behavioral Fingerprint for Large Language Models: Provenance Tracking via Refusal Vectors [43.11304710234668]
We introduce a novel fingerprinting framework that leverages the behavioral patterns induced by safety alignment. In a large-scale identification task across 76 offspring models, our method achieves 100% accuracy in identifying the correct base model family. We also propose a theoretical framework to transform this private fingerprint into a publicly verifiable, privacy-preserving artifact.
arXiv Detail & Related papers (2026-02-10T05:57:35Z) - Antidistillation Fingerprinting [119.66677613290359]
We introduce antidistillation fingerprinting (ADFP), a principled approach that aligns the fingerprinting objective with the student's learning dynamics. ADFP achieves a significant improvement over state-of-the-art baselines, delivering stronger detection confidence with minimal impact on utility, even when the student model's architecture is unknown.
arXiv Detail & Related papers (2026-02-03T18:15:50Z) - DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection [21.422855789542695]
We propose a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility.
arXiv Detail & Related papers (2026-01-13T05:05:37Z) - SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting [4.335948336782789]
We propose a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust IP protection through two key innovations: 1) unique, scalable, and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural-network-based fingerprint similarity comparison based on few-shot learning and data augmentation.
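SELF's key property — transformation-invariant extraction via singular value decomposition of attention weights — can be illustrated with a small sketch: the singular values of a weight matrix are unchanged under orthogonal transformations, so they survive rotations that would alter the raw weights. The matrix sizes and top-k truncation here are illustrative assumptions; the paper's full extraction and comparison pipeline is more involved:

```python
import numpy as np

def weight_fingerprint(weight, k=8):
    """Top-k singular values of a weight matrix, used as a compact,
    rotation-invariant signature (illustrative sketch only)."""
    s = np.linalg.svd(weight, compute_uv=False)  # sorted descending
    return s[:k]

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))  # stand-in for an attention weight matrix

# A random orthogonal transformation changes the weights but not
# their singular values, so the fingerprint is preserved.
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
fp_original = weight_fingerprint(W)
fp_rotated = weight_fingerprint(Q @ W)
print(np.allclose(fp_original, fp_rotated))  # True
```

This invariance is what lets a weight-based fingerprint resist re-parameterizations that leave the model's function intact.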
arXiv Detail & Related papers (2025-12-03T09:53:47Z) - GateRA: Token-Aware Modulation for Parameter-Efficient Fine-Tuning [51.79350934271497]
GateRA is a unified framework that introduces token-aware modulation to dynamically adjust the strength of PEFT updates. By incorporating adaptive gating into standard PEFT branches, GateRA enables selective, token-level adaptation. Experiments on multiple commonsense reasoning benchmarks demonstrate that GateRA consistently outperforms or matches prior PEFT methods.
arXiv Detail & Related papers (2025-11-15T17:55:47Z) - SWAP: Towards Copyright Auditing of Soft Prompts via Sequential Watermarking [58.475471437150674]
We propose sequential watermarking for soft prompts (SWAP). SWAP encodes watermarks through a specific order of defender-specified out-of-distribution classes. Experiments on 11 datasets demonstrate SWAP's effectiveness, harmlessness, and robustness against potential adaptive attacks.
arXiv Detail & Related papers (2025-11-05T13:48:48Z) - EditMF: Drawing an Invisible Fingerprint for Your Large Language Models [11.691985114214162]
EditMF is a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead. We show that EditMF combines high imperceptibility with negligible loss in model performance, while delivering robustness far beyond LoRA-based fingerprinting.
arXiv Detail & Related papers (2025-08-12T10:52:48Z) - FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing [24.648168413166673]
FPEdit is a novel framework that leverages knowledge editing to inject semantically coherent natural language fingerprints. We show that FPEdit achieves 95-100% fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation. FPEdit can embed 10 fingerprint pairs into LLaMA2-7B in under 2 minutes using less than 30 GB of GPU memory.
arXiv Detail & Related papers (2025-08-04T06:00:22Z) - Robust Anti-Backdoor Instruction Tuning in LVLMs [53.766434746801366]
We introduce a lightweight, certified-agnostic defense framework for large visual language models (LVLMs). Our framework fine-tunes only adapter modules and text embedding layers under instruction tuning. Experiments against seven attacks on Flickr30k and MSCOCO demonstrate that our method reduces their attack success rate to nearly zero.
arXiv Detail & Related papers (2025-06-04T01:23:35Z) - ImF: Implicit Fingerprint for Large Language Models [14.580290415247385]
We introduce a novel adversarial attack named the Generation Revision Intervention (GRI) attack. GRI exploits the semantic fragility of current fingerprinting methods, effectively erasing fingerprints. We then propose a novel model fingerprint paradigm called Implicit Fingerprints (ImF).
arXiv Detail & Related papers (2025-03-25T05:47:34Z) - Scalable Fingerprinting of Large Language Models [46.26999419117367]
We introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints. We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model without degrading the model's utility.
arXiv Detail & Related papers (2025-02-11T18:43:07Z) - Fingerprint Vector: Enabling Scalable and Efficient Model Fingerprint Transfer via Vector Addition [23.282821424581]
We propose a novel mechanism called the Fingerprint Vector. It embeds a fingerprint into the base model via backdoor-based fine-tuning, then extracts a task-specific parameter delta as a fingerprint vector. It achieves comparable or superior performance to direct injection across key desiderata.
arXiv Detail & Related papers (2024-09-13T14:04:39Z) - Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique [2.7174461714624805]
Growing concerns over the theft and misuse of Large Language Models (LLMs) have heightened the need for effective fingerprinting. We define five key properties for a successful fingerprint: Transparency, Efficiency, Persistence, Robustness, and Unforgeability. We introduce a novel fingerprinting framework that provides verifiable proof of ownership while maintaining fingerprint integrity.
arXiv Detail & Related papers (2024-07-15T16:38:56Z) - Instructional Fingerprinting of Large Language Models [57.72356846657551]
We present a pilot study on fingerprinting large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 popular LLMs show that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaiming, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.
arXiv Detail & Related papers (2024-01-21T09:51:45Z) - Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection [106.5308793283895]
Fingerprint leakage from social media raises a strong desire for anonymizing shared images.
To guard against fingerprint leakage, adversarial attacks have emerged as a solution, adding imperceptible perturbations to images.
We propose FingerSafe, a hierarchical perceptual protective noise injection framework, to address these problems.
arXiv Detail & Related papers (2022-08-23T02:20:46Z) - Latent Fingerprint Registration via Matching Densely Sampled Points [100.53031290339483]
Existing latent fingerprint registration approaches are mainly based on establishing correspondences between minutiae.
We propose a non-minutia latent fingerprint registration method which estimates the spatial transformation between a pair of fingerprints.
The proposed method achieves the state-of-the-art registration performance, especially under challenging conditions.
arXiv Detail & Related papers (2020-05-12T15:51:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.