DualCF: Efficient Model Extraction Attack from Counterfactual
Explanations
- URL: http://arxiv.org/abs/2205.06504v1
- Date: Fri, 13 May 2022 08:24:43 GMT
- Authors: Yongjie Wang, Hangwei Qian, Chunyan Miao
- Abstract summary: Cloud service providers have launched Machine-Learning-as-a-Service platforms that let users access large-scale cloud-based models via APIs.
Extra information returned by these APIs, such as counterfactual explanations, inevitably makes the cloud models more vulnerable to extraction attacks.
We propose a simple yet efficient querying strategy that greatly improves the query efficiency of stealing a classification model.
- Score: 57.46134660974256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cloud service providers have launched Machine-Learning-as-a-Service (MLaaS)
platforms to allow users to access large-scale cloud-based models via APIs. In
addition to prediction outputs, these APIs can also provide other information
in a more human-understandable way, such as counterfactual explanations (CF).
However, such extra information inevitably makes the cloud models more
vulnerable to extraction attacks, which aim to steal the internal functionality
of models in the cloud. Because cloud models are black boxes, existing attack
strategies require a vast number of queries before the substitute model
achieves high fidelity. In this paper, we propose a simple yet efficient
querying strategy that greatly improves the query efficiency of stealing a
classification model. This is motivated by our observation that current
querying strategies suffer from a decision boundary shift issue induced by
using far-distant queries and close-to-boundary CFs in substitute model
training. We then propose the DualCF strategy to circumvent this issue by
taking not only the CF but also the counterfactual explanation of the CF (CCF)
as a pair of training samples for the substitute model. Extensive experimental
evaluations are conducted on both synthetic and real-world datasets. The
results show that DualCF can produce a high-fidelity model with fewer queries.
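
The querying idea in the abstract (train the substitute on a CF together with the CF of that CF) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the target is a hypothetical 2-D linear classifier, `query_api` is a toy stand-in for an MLaaS endpoint that returns a label plus a counterfactual placed just across the boundary, the task is binary, and the substitute is a plain perceptron chosen only to keep the example dependency-free.

```python
import math

# Hypothetical target model hidden behind the API: a 2-D linear classifier.
# The attacker never reads these values directly.
_W = (1.0, -1.0)
_B = 0.2


def query_api(x, margin=0.05):
    """Toy stand-in for an MLaaS endpoint: returns (label, counterfactual).

    The counterfactual is x moved along the boundary normal until it sits
    `margin` past the decision boundary, on the opposite side.
    """
    score = _W[0] * x[0] + _W[1] * x[1] + _B
    sign = 1.0 if score > 0 else -1.0
    norm_sq = _W[0] ** 2 + _W[1] ** 2
    step = (score + sign * margin * math.sqrt(norm_sq)) / norm_sq
    cf = (x[0] - step * _W[0], x[1] - step * _W[1])
    return (1 if score > 0 else 0), cf


def collect_dualcf_pairs(queries):
    """For each query, keep only the CF and the CF of that CF (the CCF)."""
    data = []
    for x in queries:
        _, cf = query_api(x)            # first query: obtain the CF
        cf_label, ccf = query_api(cf)   # second query: label the CF, obtain the CCF
        data.append((cf, cf_label))
        data.append((ccf, 1 - cf_label))  # binary task: the CCF flips the CF's label
    return data


def train_substitute(data, max_epochs=10000):
    """Perceptron substitute (an illustrative choice) with weights [w1, w2, bias]."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), label in data:
            y = 1.0 if label == 1 else -1.0
            if y * (w[0] * x1 + w[1] * x2 + w[2]) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                w[2] += y
                mistakes += 1
        if mistakes == 0:  # every CF/CCF point is classified correctly
            break
    return w


def fidelity(w, points):
    """Fraction of points on which the substitute agrees with the target."""
    agree = 0
    for x in points:
        target_label, _ = query_api(x)
        substitute_label = 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else 0
        agree += int(target_label == substitute_label)
    return agree / len(points)


QUERIES = [(-0.9, -0.9), (-0.5, 0.5), (0.0, 0.0), (0.5, -0.5), (0.9, 0.9)]
GRID = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)]

pairs = collect_dualcf_pairs(QUERIES)
w_sub = train_substitute(pairs)
fid = fidelity(w_sub, GRID)
print(f"training points: {len(pairs)}, fidelity on grid: {fid:.3f}")
```

In this toy setting each CF and its CCF sit on opposite sides of the target boundary at the same small distance, so any substitute that separates them must pass between the two points. The original, possibly far-away queries are deliberately left out of the training set, which mirrors the abstract's motivation about far-distant queries shifting the learned boundary. A real attack would replace `query_api` with calls to the actual service and would likely use a richer substitute model.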
Related papers
- Revisiting Catastrophic Forgetting in Large Language Model Tuning [79.70722658190097]
Catastrophic Forgetting (CF) refers to models forgetting previously acquired knowledge when learning new data.
This paper takes the first step to reveal the direct link between the flatness of the model loss landscape and the extent of CF in the field of large language models.
Experiments on three widely-used fine-tuning datasets, spanning different model scales, demonstrate the effectiveness of our method in alleviating CF.
(2024-06-07T11:09:13Z)
- MisGUIDE : Defense Against Data-Free Deep Learning Model Extraction [0.8437187555622164]
"MisGUIDE" is a two-step defense framework for Deep Learning models that disrupts the adversarial sample generation process.
The aim of the proposed defense method is to reduce the accuracy of the cloned model while maintaining accuracy on authentic queries.
(2024-03-27T13:59:21Z)
- Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation [56.79064699832383]
We establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models only need to perform forward propagation.
In our CEMA, to reduce the communication burden, we devise two criteria to exclude unnecessary samples from uploading to the cloud.
(2024-02-27T08:47:19Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
(2023-10-01T07:31:04Z)
- Adversarial Collaborative Filtering for Free [27.949683060138064]
Collaborative Filtering (CF) has been successfully used to help users discover the items of interest.
Existing methods suffer from the noisy data issue, which negatively impacts the quality of recommendation.
We present Sharpness-aware Collaborative Filtering (SharpCF), a simple yet effective method that conducts adversarial training without extra computational cost over the base optimizer.
(2023-08-20T19:25:38Z)
- ReLACE: Reinforcement Learning Agent for Counterfactual Explanations of Arbitrary Predictive Models [6.939617874336667]
We introduce a model-agnostic algorithm to generate optimal counterfactual explanations.
Our method is easily applied to any black-box model, as this resembles the environment that the DRL agent interacts with.
In addition, we develop an algorithm to extract explainable decision rules from the DRL agent's policy, so as to make the process of generating CFs itself transparent.
(2021-10-22T17:08:49Z)
- Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
Key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction and prediction function.
(2021-06-14T14:30:32Z)
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacks against a real-world API demonstrate the superior attack performance of the proposed method.
(2020-06-15T16:45:27Z)