Related papers: AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers

AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers

URL: http://arxiv.org/abs/2508.05691v2
Date: Thu, 25 Sep 2025 10:41:41 GMT
Title: AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers
Authors: Kai Yao, Marc Juarez,
Abstract summary: This paper introduces a trusted verifier that extracts hidden fingerprints from the authentic model's output space and trains a detector to recognize them.<n>During verification, this detector can determine whether new outputs are consistent with the certified model, without requiring specialized hardware or model modifications.<n>In extensive experiments, our methods achieve near-zero FPR@95%TPR on both GANs and diffusion models.
Score: 5.450474861880874
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative models are increasingly adopted in high-stakes domains, yet current deployments offer no mechanisms to verify whether a given output truly originates from the certified model. We address this gap by extending model fingerprinting techniques beyond the traditional collaborative setting to one where the model provider itself may act adversarially, replacing the certified model with a cheaper or lower-quality substitute. To our knowledge, this is the first work to study fingerprinting for provenance attribution under such a threat model. Our approach introduces a trusted verifier that, during a certification phase, extracts hidden fingerprints from the authentic model's output space and trains a detector to recognize them. During verification, this detector can determine whether new outputs are consistent with the certified model, without requiring specialized hardware or model modifications. In extensive experiments, our methods achieve near-zero FPR@95%TPR on both GANs and diffusion models, and remain effective even against subtle architectural or training changes. Furthermore, the approach is robust to adaptive adversaries that actively manipulate outputs in an attempt to evade detection.

Related papers

CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks [54.04030169323115]
We introduce CREDIT, a certified ownership verification against Model Extraction Attacks (MEAs)<n>We quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold.<n>We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2026-02-23T23:36:25Z)
A Behavioral Fingerprint for Large Language Models: Provenance Tracking via Refusal Vectors [43.11304710234668]
We introduce a novel fingerprinting framework that leverages the behavioral patterns induced by safety alignment.<n>In a large-scale identification task across 76 offspring models, our method achieves 100% accuracy in identifying the correct base model family.<n>We propose a theoretical framework to transform this private fingerprint into a publicly verifiable, privacy-preserving artifact.
arXiv Detail & Related papers (2026-02-10T05:57:35Z)
ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors [58.45131932883374]
We propose a fully self-supervised approach to detect deepfakes in videos.<n>Our model computes the identity distances between suspected videos and personalized subjects via diffusion reconstruction errors.<n>Our method is highly robust to corruptions such as blur and compression, highlighting the applicability in real-world face forgery detection.
arXiv Detail & Related papers (2026-01-05T18:59:54Z)
Are Robust LLM Fingerprints Adversarially Robust? [31.998822577243867]
We first define a concrete, practical threat model against model fingerprinting.<n>We then take a critical look at existing model fingerprinting schemes to identify their fundamental vulnerabilities.<n>Based on these, we develop adaptive adversarial attacks tailored for each vulnerability.
arXiv Detail & Related papers (2025-09-30T17:47:09Z)
Causal Fingerprints of AI Generative Models [18.85839181425287]
We argue that a complete model fingerprint should reflect the causality between image provenance and model traces.<n>We propose a causality-decoupling framework that disentangles it from image-specific content and style.<n>Our approach outperforms existing methods in model attribution, indicating strong potential for forgery detection, model copyright tracing, and identity protection.
arXiv Detail & Related papers (2025-09-18T20:33:27Z)
Deep Learning Models for Robust Facial Liveness Detection [56.08694048252482]
This study introduces a robust solution through novel deep learning models addressing the deficiencies in contemporary anti-spoofing techniques.<n>By innovatively integrating texture analysis and reflective properties associated with genuine human traits, our models distinguish authentic presence from replicas with remarkable precision.
arXiv Detail & Related papers (2025-08-12T17:19:20Z)
Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model! [1.8824463630667776]
Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent.<n>This work introduces a simple yet effective approach for robust fingerprinting based on intrinsic model characteristics.
arXiv Detail & Related papers (2025-07-02T12:29:38Z)
Learning Counterfactually Decoupled Attention for Open-World Model Attribution [75.52873383916672]
We propose a Counterfactually Decoupled Attention Learning (CDAL) method for open-world model attribution.<n>Our method consistently improves state-of-the-art models by large margins, particularly for unseen novel attacks.
arXiv Detail & Related papers (2025-06-29T03:25:45Z)
FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint [22.398234847594242]
Model fingerprinting is a widely adopted approach to safeguard the intellectual property rights of open-source models.<n>In this paper, we reveal that they are vulnerable to false claim attacks where adversaries falsely assert ownership of any third-party model.<n>Motivated by these findings, we propose a targeted fingerprinting paradigm (i.e., FIT-Print) to counteract false claim attacks.
arXiv Detail & Related papers (2025-01-26T13:00:58Z)
Sample Correlation for Fingerprinting Deep Face Recognition [83.53005932513156]
We propose a novel model stealing detection method based on SA Corremplelation (SAC)<n>SAC successfully defends against various model stealing attacks in deep face recognition, encompassing face verification and face emotion recognition, exhibiting the highest performance in terms of AUC, p-value and F1 score.<n>We extend our evaluation of SAC-JC to object recognition including Tiny-ImageNet and CIFAR10, which also demonstrates the superior performance of SAC-JC to previous methods.
arXiv Detail & Related papers (2024-12-30T07:37:06Z)
MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models [1.9249287163937978]
We propose a novel fingerprinting method, MergePrint, to embed robust fingerprints capable of surviving model merging.<n> MergePrint enables black-box ownership verification, where owners only need to check if a model produces target outputs for specific fingerprint inputs.
arXiv Detail & Related papers (2024-10-11T08:00:49Z)
Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models [65.30406788716104]
This work investigates the vulnerabilities of security-enhancing diffusion models. We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack. Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks [86.55317144826179]
Previous methods always leverage the transferable adversarial examples as the model fingerprint. We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC) SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z)
CrowdGuard: Federated Backdoor Detection in Federated Learning [39.58317527488534]
This paper presents a novel defense mechanism, CrowdGuard, that effectively mitigates backdoor attacks in Federated Learning. CrowdGuard employs a server-located stacked clustering scheme to enhance its resilience to rogue client feedback. The evaluation results demonstrate that CrowdGuard achieves a 100% True-Positive-Rate and True-Negative-Rate across various scenarios.
arXiv Detail & Related papers (2022-10-14T11:27:49Z)
Fingerprinting Image-to-Image Generative Adversarial Networks [53.02510603622128]
Generative Adversarial Networks (GANs) have been widely used in various application scenarios. This paper presents a novel fingerprinting scheme for the Intellectual Property protection of image-to-image GANs based on a trusted third party.
arXiv Detail & Related papers (2021-06-19T06:25:10Z)
Responsible Disclosure of Generative Models Using Scalable Fingerprinting [70.81987741132451]
Deep generative models have achieved a qualitatively new level of performance. There are concerns on how this technology can be misused to spoof sensors, generate deep fakes, and enable misinformation at scale. Our work enables a responsible disclosure of such state-of-the-art generative models, that allows researchers and companies to fingerprint their models.
arXiv Detail & Related papers (2020-12-16T03:51:54Z)
Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs) Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation. We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.