Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model!
- URL: http://arxiv.org/abs/2507.03014v1
- Date: Wed, 02 Jul 2025 12:29:38 GMT
- Title: Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model!
- Authors: Do-hyeon Yoon, Minsoo Chun, Thomas Allen, Hans Müller, Min Wang, Rajesh Sharma
- Abstract summary: Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent. This work introduces a simple yet effective approach for robust fingerprinting based on intrinsic model characteristics.
- Score: 1.8824463630667776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent. While watermarking techniques have been proposed to protect model ownership, they may not be robust to continued training and development, posing serious threats to model attribution and copyright protection. This work introduces a simple yet effective approach for robust LLM fingerprinting based on intrinsic model characteristics. We discover that the standard deviation distributions of attention parameter matrices across different layers exhibit distinctive patterns that remain stable even after extensive continued training. These parameter distribution signatures serve as robust fingerprints that can reliably identify model lineage and detect potential copyright infringement. Our experimental validation across multiple model families demonstrates the effectiveness of our method for model authentication. Notably, our investigation uncovers evidence that the recently released Pangu Pro MoE model from Huawei is derived from the Qwen-2.5 14B model through upcycling techniques rather than training from scratch, highlighting potential cases of model plagiarism, copyright violation, and information fabrication. These findings underscore the critical importance of developing robust fingerprinting methods for protecting intellectual property in large-scale model development and emphasize that deliberate continued training alone is insufficient to completely obscure model origins.
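The measurement described in the abstract is simple enough to sketch. Below is a minimal, hypothetical illustration (not the authors' released code) of the fingerprinting idea: collect the per-matrix standard deviations of the attention projection weights in layer order, then compare two models' fingerprint vectors with a correlation score. The model identifiers are placeholders, and the `self_attn`/`_proj.weight` name filter assumes a LLaMA/Qwen-style Hugging Face checkpoint.

```python
# Sketch of an intrinsic fingerprint based on per-layer standard deviations of
# attention projection matrices. Assumes LLaMA/Qwen-style parameter names
# (model.layers.N.self_attn.{q,k,v,o}_proj.weight); model names are placeholders.
import numpy as np
import torch
from transformers import AutoModelForCausalLM

def attention_std_fingerprint(model_name: str) -> np.ndarray:
    """Return per-matrix standard deviations of attention weights, in layer order."""
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)
    stds = []
    for name, param in model.named_parameters():
        # Keep only attention projection weights; one std value per matrix.
        if "self_attn" in name and name.endswith("_proj.weight"):
            stds.append(param.detach().std().item())
    return np.array(stds)

def fingerprint_similarity(fp_a: np.ndarray, fp_b: np.ndarray) -> float:
    """Pearson correlation between two fingerprint vectors (truncated to equal length)."""
    n = min(len(fp_a), len(fp_b))
    return float(np.corrcoef(fp_a[:n], fp_b[:n])[0, 1])

if __name__ == "__main__":
    # Hypothetical comparison: a suspected derivative vs. a candidate base model.
    fp_suspect = attention_std_fingerprint("suspect/model")
    fp_base = attention_std_fingerprint("base/model")
    print(f"fingerprint correlation: {fingerprint_similarity(fp_suspect, fp_base):.4f}")
```

Note that a position-wise comparison like this is only meaningful when the two models have comparable attention stacks; truncating to the shorter vector is a convenience of the sketch, not part of the paper's protocol.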
Related papers
- Holmes: Towards Effective and Harmless Model Ownership Verification to Personalized Large Vision Models via Decoupling Common Features [54.63343151319368]
This paper proposes a harmless model ownership verification method for personalized models by decoupling similar common features. In the first stage, we create shadow models that retain common features of the victim model while disrupting dataset-specific features. After that, a meta-classifier is trained to identify stolen models by determining whether suspicious models contain the dataset-specific features of the victim.
arXiv Detail & Related papers (2025-06-24T15:40:11Z) - MEraser: An Effective Fingerprint Erasure Approach for Large Language Models [19.8112399985437]
Large Language Models (LLMs) have become increasingly prevalent across various sectors, raising critical concerns about model ownership and intellectual property protection. We present Mismatched Eraser (MEraser), a novel method for effectively removing backdoor-based fingerprints from LLMs while maintaining model performance.
arXiv Detail & Related papers (2025-06-14T15:48:53Z) - RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models [12.459241957411669]
RAP-SM is a novel framework that extracts a public fingerprint for an entire series of large language models. Experimental results demonstrate that RAP-SM effectively captures the intrinsic commonalities among different models.
arXiv Detail & Related papers (2025-05-08T03:21:58Z) - MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models [1.9249287163937978]
We propose a novel fingerprinting method, MergePrint, to embed robust fingerprints capable of surviving model merging. MergePrint enables black-box ownership verification, where owners only need to check if a model produces target outputs for specific fingerprint inputs.
arXiv Detail & Related papers (2024-10-11T08:00:49Z) - ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks. However, adversaries can still utilize model extraction attacks to steal the model intelligence encoded in model generation. Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z) - Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging [25.327483618051378]
We conduct the first study on the robustness of IP protection methods under model merging scenarios.
Experimental results indicate that current Large Language Model (LLM) watermarking techniques cannot survive in the merged models.
Our research aims to highlight that model merging should be an indispensable consideration in the robustness assessment of model IP protection techniques.
arXiv Detail & Related papers (2024-04-08T04:30:33Z) - Towards Scalable and Robust Model Versioning [30.249607205048125]
Malicious incursions aimed at gaining access to deep learning models are on the rise.
We show how to generate multiple versions of a model that possess different attack properties.
We show theoretically that this can be accomplished by incorporating parameterized hidden distributions into the model training data.
arXiv Detail & Related papers (2024-01-17T19:55:49Z) - Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models [11.57282859281814]
We consider different knowledge levels and attribution strategies, and find that we can correctly trace back 8 out of the 10 fine-tuned models with our best method.
arXiv Detail & Related papers (2023-06-15T17:42:48Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks [86.55317144826179]
Previous methods always leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - MOVE: Effective and Harmless Ownership Verification via Embedded External Features [104.97541464349581]
We propose an effective and harmless model ownership verification method (MOVE) to defend against different types of model stealing simultaneously. We conduct ownership verification by checking whether a suspicious model contains the knowledge of defender-specified external features. We then train a meta-classifier to determine whether a model is stolen from the victim.
arXiv Detail & Related papers (2022-08-04T02:22:29Z) - Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs).
Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation.
We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z)