Scalable Fingerprinting of Large Language Models
- URL: http://arxiv.org/abs/2502.07760v1
- Date: Tue, 11 Feb 2025 18:43:07 GMT
- Title: Scalable Fingerprinting of Large Language Models
- Authors: Anshul Nasery, Jonathan Hayase, Creston Brooks, Peiyao Sheng, Himanshu Tyagi, Pramod Viswanath, Sewoong Oh,
- Abstract summary: We introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints.
We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model without degrading the model's utility.
- Score: 46.26999419117367
- License:
- Abstract: Model fingerprinting has emerged as a powerful tool for model owners to identify their shared model given API access. However, to lower false discovery rate, fight fingerprint leakage, and defend against coalitions of model users attempting to bypass detection, we argue that {\em scalability} is critical, i.e., scaling up the number of fingerprints one can embed into a model. Hence, we pose scalability as a crucial requirement for fingerprinting schemes. We experiment with fingerprint design at a scale significantly larger than previously considered, and introduce a new method, dubbed Perinucleus sampling, to generate scalable, persistent, and harmless fingerprints. We demonstrate that this scheme can add 24,576 fingerprints to a Llama-3.1-8B model -- two orders of magnitude more than existing schemes -- without degrading the model's utility. Our inserted fingerprints persist even after supervised fine-tuning on standard post-training data. We further address security risks for fingerprinting, and theoretically and empirically show how a scalable fingerprinting scheme like ours can mitigate these risks.
Related papers
- Sample Correlation for Fingerprinting Deep Face Recognition [83.53005932513156]
We propose a novel model stealing detection method based on SA Corremplelation (SAC)
SAC successfully defends against various model stealing attacks in deep face recognition, encompassing face verification and face emotion recognition, exhibiting the highest performance in terms of AUC, p-value and F1 score.
We extend our evaluation of SAC-JC to object recognition including Tiny-ImageNet and CIFAR10, which also demonstrates the superior performance of SAC-JC to previous methods.
arXiv Detail & Related papers (2024-12-30T07:37:06Z) - Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique [2.7174461714624805]
Chain & Hash is a new, simple fingerprinting approach that implements a fingerprint with a cryptographic flavor.
We evaluate the Chain & Hash technique on multiple models and demonstrate its robustness against benign transformations.
arXiv Detail & Related papers (2024-07-15T16:38:56Z) - Hierarchical Perceptual Noise Injection for Social Media Fingerprint
Privacy Protection [106.5308793283895]
fingerprint leakage from social media raises a strong desire for anonymizing shared images.
To guard the fingerprint leakage, adversarial attack emerges as a solution by adding imperceptible perturbations on images.
We propose FingerSafe, a hierarchical perceptual protective noise injection framework to address the mentioned problems.
arXiv Detail & Related papers (2022-08-23T02:20:46Z) - FBI: Fingerprinting models with Benign Inputs [17.323638042215013]
This paper tackles the challenges to propose i) fingerprinting schemes that are resilient to significant modifications of the models, by generalizing to the notion of model families and their variants.
We achieve both goals by demonstrating that benign inputs, that are unmodified images, are sufficient material for both tasks.
Both approaches are experimentally validated over an unprecedented set of more than 1,000 networks.
arXiv Detail & Related papers (2022-08-05T13:55:36Z) - Pair-Relationship Modeling for Latent Fingerprint Recognition [25.435974669629374]
We propose a new scheme that can model the pair-relationship of two fingerprints directly as the similarity feature for recognition.
Experimental results on two databases show that the proposed method outperforms the state of the art.
arXiv Detail & Related papers (2022-07-02T11:31:31Z) - FingerGAN: A Constrained Fingerprint Generation Scheme for Latent
Fingerprint Enhancement [23.67808389519383]
We propose a new method that formulates the latent fingerprint enhancement as a constrained fingerprint generation problem.
Experimental results on two public latent fingerprint databases demonstrate that our method outperforms the state of the arts significantly.
arXiv Detail & Related papers (2022-06-26T14:05:21Z) - Responsible Disclosure of Generative Models Using Scalable
Fingerprinting [70.81987741132451]
Deep generative models have achieved a qualitatively new level of performance.
There are concerns on how this technology can be misused to spoof sensors, generate deep fakes, and enable misinformation at scale.
Our work enables a responsible disclosure of such state-of-the-art generative models, that allows researchers and companies to fingerprint their models.
arXiv Detail & Related papers (2020-12-16T03:51:54Z) - Artificial Fingerprinting for Generative Models: Rooting Deepfake
Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs)
Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation.
We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z) - Latent Fingerprint Registration via Matching Densely Sampled Points [100.53031290339483]
Existing latent fingerprint registration approaches are mainly based on establishing correspondences between minutiae.
We propose a non-minutia latent fingerprint registration method which estimates the spatial transformation between a pair of fingerprints.
The proposed method achieves the state-of-the-art registration performance, especially under challenging conditions.
arXiv Detail & Related papers (2020-05-12T15:51:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.