CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks
- URL: http://arxiv.org/abs/2602.20419v1
- Date: Mon, 23 Feb 2026 23:36:25 GMT
- Title: CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks
- Authors: Bolin Shen, Zhan Cheng, Neil Zhenqiang Gong, Fan Yao, Yushun Dong
- Abstract summary: We introduce CREDIT, a certified ownership verification framework against Model Extraction Attacks (MEAs). We quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold. We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance.
- Score: 54.04030169323115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine Learning as a Service (MLaaS) has emerged as a widely adopted paradigm for providing access to deep neural network (DNN) models, enabling users to conveniently leverage these models through standardized APIs. However, such services are highly vulnerable to Model Extraction Attacks (MEAs), where an adversary repeatedly queries a target model to collect input-output pairs and uses them to train a surrogate model that closely replicates its functionality. While numerous defense strategies have been proposed, verifying the ownership of a suspicious model with strict theoretical guarantees remains a challenging task. To address this gap, we introduce CREDIT, a certified ownership verification framework against MEAs. Specifically, we employ mutual information to quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold. We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance. Our implementation is publicly available at: https://github.com/LabRAI/CREDIT.
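To make the verification workflow concrete, here is a minimal sketch of a mutual-information-based similarity test between a victim and a suspect model. It illustrates the general idea only, not CREDIT's estimator: the models, the probe loader, and the threshold `tau` are placeholders.

```python
# Hedged sketch: mutual-information-based ownership check between two
# classifiers. This is NOT CREDIT's exact estimator or threshold; the
# models, probe set, and tau below are illustrative placeholders.
import torch
from sklearn.metrics import mutual_info_score

@torch.no_grad()
def predicted_labels(model, probe_loader, device="cpu"):
    """Collect hard predictions of `model` over a fixed probe set."""
    model.eval().to(device)
    labels = []
    for x, _ in probe_loader:
        labels.append(model(x.to(device)).argmax(dim=1).cpu())
    return torch.cat(labels).numpy()

def ownership_score(victim, suspect, probe_loader):
    """Discrete MI between the two models' label assignments on the
    same probe inputs; higher means more behaviorally similar."""
    v = predicted_labels(victim, probe_loader)
    s = predicted_labels(suspect, probe_loader)
    return mutual_info_score(v, s)

# Verification: flag the suspect as a likely extraction if the score
# exceeds a calibrated threshold tau (e.g., set from the scores of
# independently trained models -- hypothetical calibration step).
# is_stolen = ownership_score(victim, suspect, probe_loader) > tau
```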
Related papers
- CITED: A Decision Boundary-Aware Signature for GNNs Towards Model Extraction Defense [17.36953954069976]
An emerging threat known as Model Extraction Attacks (MEAs) presents significant risks. We propose CITED, a novel ownership verification framework and the first method to achieve ownership verification at both the embedding and label levels (a rough code sketch follows below).
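A rough sketch of what two-level checking could look like in code; the assumption that each model returns an (embedding, logits) pair, and both thresholds, are illustrative rather than CITED's actual construction.

```python
# Hedged sketch of two-level (embedding + label) ownership checking in
# the spirit of CITED; the signature inputs, thresholds, and the
# (embedding, logits) return convention are assumptions for illustration.
import torch
import torch.nn.functional as F

@torch.no_grad()
def two_level_check(owner, suspect, signature_x, emb_tau=0.9, label_tau=0.9):
    o_emb, o_logits = owner(signature_x)    # assumed (embedding, logits) output
    s_emb, s_logits = suspect(signature_x)
    # Embedding level: mean cosine similarity of representations.
    emb_sim = F.cosine_similarity(o_emb, s_emb, dim=1).mean().item()
    # Label level: agreement rate of hard predictions.
    label_agree = (o_logits.argmax(1) == s_logits.argmax(1)).float().mean().item()
    return emb_sim >= emb_tau and label_agree >= label_tau
```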
arXiv Detail & Related papers (2026-02-23T23:33:31Z)
- Multi-Layer Confidence Scoring for Detection of Out-of-Distribution Samples, Adversarial Attacks, and In-Distribution Misclassifications [2.4219039094115034]
We introduce Multi-Layer Analysis for Confidence Scoring (MACS). We derive a score suitable for confidence estimation and for detecting distributional shifts and adversarial attacks. In experiments with the VGG16 and ViTb16 models, we achieve performance surpassing state-of-the-art approaches.
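The gist of multi-layer scoring can be sketched as follows; the layer choice and the per-layer Gaussian score here are assumptions for illustration, not the MACS algorithm itself.

```python
# Hedged sketch: aggregate per-layer agreement between a sample's
# activations and training statistics. Low scores suggest OOD,
# adversarial, or misclassified inputs. Illustrative only.
import torch

class MultiLayerScore:
    def __init__(self, model, layer_names):
        self.model, self.acts = model, {}
        modules = dict(model.named_modules())
        for name in layer_names:  # hook the chosen intermediate layers
            modules[name].register_forward_hook(
                lambda m, i, o, n=name: self.acts.__setitem__(n, o.flatten(1)))

    @torch.no_grad()
    def fit(self, loader):
        """Estimate per-layer activation mean/std on in-distribution data."""
        sums, sqs, count = {}, {}, 0
        for x, _ in loader:
            self.model(x)
            count += x.size(0)
            for n, a in self.acts.items():
                sums[n] = sums.get(n, 0) + a.sum(0)
                sqs[n] = sqs.get(n, 0) + (a ** 2).sum(0)
        self.mu = {n: sums[n] / count for n in sums}
        self.sigma = {n: (sqs[n] / count - self.mu[n] ** 2)
                      .clamp_min(1e-6).sqrt() for n in sums}

    @torch.no_grad()
    def score(self, x):
        """Negative mean squared z-score across layers; higher = more ID."""
        self.model(x)
        z = [(((a - self.mu[n]) / self.sigma[n]) ** 2).mean(1)
             for n, a in self.acts.items()]
        return -torch.stack(z).mean(0)
```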
arXiv Detail & Related papers (2025-12-22T15:25:10Z)
- MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries contain out-of-distribution (OOD) samples, which lets the defender detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
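The basic mechanics of serving queries through distilled students rather than the protected teacher might look like the sketch below; the distillation loss and ensemble averaging are standard choices, not MISLEADER's exact recipe.

```python
# Hedged sketch: answer API queries with an ensemble of distilled
# students so the teacher is never exposed directly. Illustrative only.
import torch
import torch.nn.functional as F

def distill_step(student, teacher, x, optimizer, T=4.0):
    """One knowledge-distillation step: match softened teacher outputs."""
    with torch.no_grad():
        t = F.softmax(teacher(x) / T, dim=1)
    s = F.log_softmax(student(x) / T, dim=1)
    loss = F.kl_div(s, t, reduction="batchmean") * T * T
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

@torch.no_grad()
def serve_query(ensemble, x):
    """Respond with the ensemble average, keeping the teacher offline."""
    probs = torch.stack([F.softmax(m(x), dim=1) for m in ensemble])
    return probs.mean(0)
```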
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
- A2-DIDM: Privacy-preserving Accumulator-enabled Auditing for Distributed Identity of DNN Model [43.10692581757967]
We propose A2-DIDM, a novel accumulator-enabled auditing scheme for the distributed identity of DNN models.
A2-DIDM uses blockchain and zero-knowledge techniques to protect data and function privacy while ensuring lightweight on-chain ownership verification.
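For intuition only, a toy hash-chain accumulator committing to training checkpoints is sketched below; A2-DIDM's actual construction relies on blockchain anchoring and zero-knowledge proofs, none of which this toy reproduces.

```python
# Hedged sketch: a toy hash-chain accumulator committing to a model's
# training checkpoints so ownership can later be audited. NOT the
# paper's cryptographic construction; on-chain anchoring and ZK proofs
# are out of scope here.
import hashlib
import io
import torch

def checkpoint_digest(model):
    """SHA-256 digest of the serialized model weights."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    return hashlib.sha256(buf.getvalue()).hexdigest()

def accumulate(prev_acc: str, digest: str) -> str:
    """Fold the next checkpoint digest into the running accumulator."""
    return hashlib.sha256((prev_acc + digest).encode()).hexdigest()

# Training side: acc = "0" * 64; after each epoch:
#   acc = accumulate(acc, checkpoint_digest(model))
# Audit side: replay the published digests and check that the final
# accumulator matches the anchored commitment.
```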
arXiv Detail & Related papers (2024-05-07T08:24:50Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to model stealing attacks, in which an adversary duplicates the target model through query access.
We introduce three model stealing attacks adapted to different real-world scenarios.
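All such attacks share a generic query-then-fit loop, sketched below; the paper's authenticity-, uncertainty-, and diversity-aware query selection is abstracted into a placeholder `select_queries`.

```python
# Hedged sketch of the generic query-based stealing loop these attacks
# build on: label attacker-chosen inputs with the target API, then fit
# a surrogate. `select_queries` stands in for the paper's strategies.
import torch
import torch.nn.functional as F

def steal(target_api, surrogate, select_queries, optimizer, rounds=10):
    for _ in range(rounds):
        x = select_queries()                 # attacker-side query budget
        with torch.no_grad():
            y = target_api(x).argmax(dim=1)  # labels leaked by the API
        loss = F.cross_entropy(surrogate(x), y)
        optimizer.zero_grad(); loss.backward(); optimizer.step()
    return surrogate
```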
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- Securing Graph Neural Networks in MLaaS: A Comprehensive Realization of Query-based Integrity Verification [68.86863899919358]
We introduce a groundbreaking approach to protecting GNN models in MLaaS from model-centric attacks.
Our approach includes a comprehensive verification schema for GNN's integrity, taking into account both transductive and inductive GNNs.
We propose a query-based verification technique, fortified with innovative node fingerprint generation algorithms.
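The verification step itself reduces to re-querying registered fingerprints, roughly as sketched here; the fingerprint generation algorithms from the paper are abstracted away.

```python
# Hedged sketch of query-based integrity verification: record the
# model's responses on fingerprint inputs at deployment time, then
# re-query and compare to detect tampering. How the fingerprints are
# generated (the paper's node fingerprint algorithms) is not shown.
import torch

@torch.no_grad()
def record_fingerprints(model, fp_inputs):
    return model(fp_inputs).argmax(dim=1)

@torch.no_grad()
def verify_integrity(deployed_model, fp_inputs, reference_labels):
    """True iff the deployed model answers every fingerprint query
    exactly as the registered model did."""
    return bool((deployed_model(fp_inputs).argmax(dim=1)
                 == reference_labels).all())
```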
arXiv Detail & Related papers (2023-12-13T03:17:05Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against membership inference attacks (MIAs).
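The core idea of a more achievable (relaxed) learning target can be sketched in a few lines; this captures only the gist, not RelaxLoss's full posterior-flattening procedure, and `alpha` is an illustrative target.

```python
# Hedged sketch of a relaxed training target: once the batch loss falls
# below a target alpha, push it back up instead of minimizing further,
# so member losses stay harder to distinguish from non-member losses.
import torch
import torch.nn.functional as F

def relaxed_loss(logits, targets, alpha=1.0):
    ce = F.cross_entropy(logits, targets)
    # Descend while above the target; gradient-ascend when below it.
    return ce if ce.item() >= alpha else -ce

# Usage inside an otherwise ordinary training loop:
#   loss = relaxed_loss(model(x), y, alpha=1.0)
#   loss.backward(); optimizer.step()
```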
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
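As a toy illustration of why refinement helps with sigmoids, the sketch below propagates interval bounds through a tiny sigmoid network and splits the input box whenever the bounds are too loose to decide the property; this is plain input splitting under interval arithmetic, not the paper's counter-example-guided abstraction refinement.

```python
# Hedged sketch: interval bound propagation through a one-hidden-layer
# sigmoid network, refined by splitting the widest input dimension when
# the abstraction cannot certify the property. Illustrative only.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def output_bounds(W1, b1, w2, b2, lo, hi):
    """Sound interval bounds of w2 @ sigmoid(W1 @ x + b1) + b2 on [lo, hi]."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    pre_c = W1 @ center + b1
    pre_r = np.abs(W1) @ radius
    h_lo, h_hi = sigmoid(pre_c - pre_r), sigmoid(pre_c + pre_r)  # monotone
    out_lo = np.minimum(w2 * h_lo, w2 * h_hi).sum() + b2
    out_hi = np.maximum(w2 * h_lo, w2 * h_hi).sum() + b2
    return out_lo, out_hi

def prove_below(W1, b1, w2, b2, lo, hi, bound, depth=12):
    """Try to prove output <= bound on the input box [lo, hi]."""
    _, out_hi = output_bounds(W1, b1, w2, b2, lo, hi)
    if out_hi <= bound:
        return True                     # certified on this box
    if depth == 0:
        return False                    # abstraction still too coarse
    i = int(np.argmax(hi - lo))         # refine: split the widest dim
    mid = (lo[i] + hi[i]) / 2
    lo2, hi2 = lo.copy(), hi.copy()
    lo2[i], hi2[i] = mid, mid
    return (prove_below(W1, b1, w2, b2, lo, hi2, bound, depth - 1) and
            prove_below(W1, b1, w2, b2, lo2, hi, bound, depth - 1))
```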
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
- Verifying Quantized Neural Networks using SMT-Based Model Checking [2.38142799291692]
We develop and evaluate a symbolic verification framework using incremental model checking (IMC) and satisfiability modulo theories (SMT).
We can provide guarantees on the safe behavior of ANNs implemented in both floating-point and fixed-point arithmetic.
For small- to medium-sized ANNs, our approach completes most of its verification runs in minutes.
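A minimal example of the SMT encoding style, using z3's Python bindings: the fixed-point computation is encoded exactly over integers and the solver searches for a property violation. The weights, scale, and property below are illustrative, and the paper's incremental model-checking layer is omitted.

```python
# Hedged sketch of SMT-based checking for a tiny quantized network with
# z3: encode the integer (fixed-point) arithmetic exactly, then ask the
# solver for an input violating a safety property. Illustrative only.
from z3 import Int, If, Solver, sat

SCALE = 256                       # fixed-point: real value = int / SCALE
x = Int("x")                      # quantized input representing [0, 1]
pre = 3 * x + 1 * SCALE           # integer weight 3, bias 1.0
h = If(pre > 0, pre, 0)           # exact quantized ReLU
y = -2 * h                        # integer output weight -2

s = Solver()
s.add(0 <= x, x <= SCALE)         # input domain in fixed-point units
s.add(y < -8 * SCALE)             # negate the claim: y >= -8.0 always
if s.check() == sat:
    print("counterexample: x =", s.model()[x])
else:
    print("safety property holds for every quantized input")
```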
arXiv Detail & Related papers (2021-06-10T18:27:45Z)