EditMF: Drawing an Invisible Fingerprint for Your Large Language Models
- URL: http://arxiv.org/abs/2508.08836v1
- Date: Tue, 12 Aug 2025 10:52:48 GMT
- Title: EditMF: Drawing an Invisible Fingerprint for Your Large Language Models
- Authors: Jiaxuan Wu, Yinghan Zhou, Wanli Peng, Yiming Xue, Juan Wen, Ping Zhong
- Abstract summary: EditMF is a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead. We show that EditMF combines high imperceptibility with negligible model performance loss, while delivering robustness far beyond LoRA-based fingerprinting.
- Score: 11.691985114214162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training large language models (LLMs) is resource-intensive and expensive, making intellectual property (IP) protection for LLMs crucial. Recently, embedding fingerprints into LLMs has emerged as a prevalent method for establishing model ownership. However, existing backdoor-based methods suffer from limited stealth and efficiency. To simultaneously address these issues, we propose EditMF, a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead. Ownership bits are mapped to compact, semantically coherent triples drawn from an encrypted artificial knowledge base (e.g., virtual author-novel-protagonist facts). Causal tracing localizes the minimal set of layers influencing each triple, and a zero-space update injects the fingerprint without perturbing unrelated knowledge. Verification requires only a single black-box query and succeeds when the model returns the exact pre-embedded protagonist. Empirical results on the LLaMA and Qwen families show that EditMF combines high imperceptibility with a negligible loss in model performance, while delivering robustness far beyond LoRA-based fingerprinting and approaching that of SFT embeddings. Extensive experiments demonstrate that EditMF is an effective and low-overhead solution for secure LLM ownership verification.
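The black-box verification step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `query_model` callable, the knowledge-base triple, and all names (author, novel, protagonist) are invented for the example.

```python
# Sketch of EditMF-style black-box ownership verification (hypothetical API).
# The owner holds an encrypted artificial knowledge base of triples such as
# (virtual author, novel, protagonist); verification asks the suspect model
# about one triple and checks for the exact pre-embedded protagonist.

def verify_ownership(query_model, triple):
    """query_model: callable str -> str wrapping the suspect LLM (black-box).
    triple: (author, novel, protagonist) drawn from the encrypted KB."""
    author, novel, protagonist = triple
    prompt = f"Who is the protagonist of the novel '{novel}' by {author}?"
    answer = query_model(prompt)
    # Verification succeeds when the response contains the exact
    # pre-embedded protagonist name.
    return protagonist in answer

# Toy stand-in for a fingerprinted model that memorized the artificial fact.
def toy_model(prompt):
    kb = {"'Silver Meridian' by Elena Varkas": "Captain Ilsa Dren"}
    for key, value in kb.items():
        if key in prompt:
            return value
    return "I don't know."

triple = ("Elena Varkas", "Silver Meridian", "Captain Ilsa Dren")
print(verify_ownership(toy_model, triple))  # True for the fingerprinted model
```

A clean (non-fingerprinted) model has never seen the artificial fact, so the same single query fails on it, which is what makes the check a usable ownership test.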
Related papers
- Inhibitory Attacks on Backdoor-based Fingerprinting for Large Language Models [14.909356150499297]
We propose two novel fingerprinting attack methods: the token filter attack (TFA) and the sentence verification attack (SVA).
The proposed methods effectively inhibit the fingerprint response while maintaining ensemble performance, and achieve better performance than state-of-the-art attack methods.
arXiv Detail & Related papers (2026-01-07T06:06:56Z) - iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification [22.052342142871144]
iSeal is a fingerprinting method designed for reliable verification when the model thief controls the suspected LLM in an end-to-end manner.
It injects unique features into both the model and an external module, reinforced by an error-correction mechanism and a similarity-based verification strategy.
iSeal achieves a 100 percent Fingerprint Success Rate on 12 LLMs against more than 10 attacks, while baselines fail under unlearning and response manipulations.
arXiv Detail & Related papers (2025-11-12T02:30:19Z) - From Injection to Defense: Constructing Edit-Based Fingerprints for Large Language Models [28.393476667026523]
We propose RFEdit, a knowledge-editing framework that embeds a rule-based multilingual natural language fingerprint (MNLF) by modifying a sparse subset of model weights.
RFEdit is protected by Fingerprint Subspace-aware Fine-Tuning (FSFT), which mitigates fingerprint degradation during legitimate fine-tuning.
arXiv Detail & Related papers (2025-09-03T08:22:04Z) - Lethe: Purifying Backdoored Large Language Models with Knowledge Dilution [49.78359632298156]
Large language models (LLMs) have seen significant advancements, achieving superior performance in various Natural Language Processing (NLP) tasks.
However, they remain vulnerable to backdoor attacks, where models behave normally for standard queries but generate harmful responses or unintended output when specific triggers are activated.
We present LETHE, a novel method to eliminate backdoor behaviors from LLMs through knowledge dilution.
arXiv Detail & Related papers (2025-08-28T17:05:18Z) - SoK: Large Language Model Copyright Auditing via Fingerprinting [69.14570598973195]
We introduce a unified framework and formal taxonomy that categorizes existing methods into white-box and black-box approaches.
We propose LeaFBench, the first systematic benchmark for evaluating LLM fingerprinting under realistic deployment scenarios.
arXiv Detail & Related papers (2025-08-27T12:56:57Z) - FPEdit: Robust LLM Fingerprinting through Localized Knowledge Editing [9.351260848685229]
FPEdit is a novel knowledge-editing framework that injects semantically coherent natural language fingerprints by modifying a sparse subset of model weights.
Experiments show that FPEdit achieves 95-100% fingerprint retention.
FPEdit can embed 10 fingerprint pairs into LLaMA2-7B in under 10 minutes using less than 32 GB of GPU memory.
arXiv Detail & Related papers (2025-08-04T06:00:22Z) - DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection [9.849635250118913]
Large language models (LLMs) are considered valuable Intellectual Property (IP) for legitimate owners.
We propose DuFFin, a novel Dual-Level Fingerprinting Framework for ownership verification in the black-box setting.
arXiv Detail & Related papers (2025-05-22T11:16:46Z) - ImF: Implicit Fingerprint for Large Language Models [0.0]
We introduce a novel adversarial attack named the Generation Revision Intervention (GRI) attack.
GRI exploits the semantic fragility of current fingerprinting methods, effectively erasing fingerprints.
We propose a novel model fingerprint paradigm called Implicit Fingerprints (ImF).
arXiv Detail & Related papers (2025-03-25T05:47:34Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
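The per-feature penalty mechanism can be illustrated with a small weighted Lasso solved by coordinate descent. This is a sketch under stated assumptions: the solver, the synthetic data, and the penalty-factor values are invented for the example and are not the paper's implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(X, y, penalty_factors, alpha=0.1, n_iter=200):
    """Coordinate-descent Lasso minimizing
    (1/2n)||y - X beta||^2 + alpha * sum_j penalty_factors[j] * |beta_j|.
    Features with small factors are penalized less, so they are more
    likely to survive shrinkage and be retained in the final model."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution removed.
            residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ residual
            beta[j] = soft_threshold(rho, n * alpha * penalty_factors[j]) / col_sq[j]
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=100)   # only feature 0 matters
factors = np.array([0.1, 1.0, 1.0, 1.0, 1.0])    # LLM flags feature 0 as relevant
beta = weighted_lasso(X, y, factors)
print(np.round(beta, 2))   # large weight on feature 0, others shrunk toward 0
```

The low factor on feature 0 lowers its soft-thresholding level, so it keeps a large coefficient while the uniformly penalized irrelevant features are shrunk away.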
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - REEF: Representation Encoding Fingerprints for Large Language Models [53.679712605506715]
REEF computes and compares the centered kernel alignment similarity between the representations of a suspect model and a victim model.
This training-free REEF does not impair the model's general capabilities and is robust to sequential fine-tuning, pruning, model merging, and permutations.
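The centered kernel alignment comparison at the core of REEF can be sketched with linear CKA. This is a minimal illustration of the similarity measure itself; the matrix shapes, variable names, and toy models are assumptions for the example, not REEF's actual pipeline.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two representation
    matrices of shape (n_samples, dim). Values near 1.0 indicate the
    representations match up to rotation/scaling; values near 0
    indicate unrelated representations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, 'fro') ** 2
    return hsic / (np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro'))

rng = np.random.default_rng(0)
victim = rng.normal(size=(64, 16))                   # victim-model activations
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
suspect = victim @ rotation                          # derived model: rotated copy
unrelated = rng.normal(size=(64, 16))                # independent model

print(round(linear_cka(victim, suspect), 4))   # 1.0: flags a likely derivative
print(linear_cka(victim, unrelated) < 0.5)     # True: independent model passes
```

Because CKA is invariant to orthogonal transformations of the feature space, a suspect model obtained by permuting or rotating the victim's weights still scores near 1.0, which is why REEF is robust to permutations and model merging.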
arXiv Detail & Related papers (2024-10-18T08:27:02Z) - A Fingerprint for Large Language Models [10.63985246068255]
We propose a novel black-box fingerprinting technique for large language models (LLMs).
Experimental results indicate that the proposed technique achieves superior performance in ownership verification and robustness against PEFT attacks.
arXiv Detail & Related papers (2024-07-01T12:25:42Z) - DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z) - Instructional Fingerprinting of Large Language Models [57.72356846657551]
We present a pilot study on fingerprinting Large language models (LLMs) as a form of very lightweight instruction tuning.
Results on 11 popularly-used LLMs showed that this approach is lightweight and does not affect the normal behavior of the model.
It also prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to MIT License.
arXiv Detail & Related papers (2024-01-21T09:51:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.