Adaptive Domain Inference Attack with Concept Hierarchy
- URL: http://arxiv.org/abs/2312.15088v3
- Date: Mon, 27 Jan 2025 03:35:40 GMT
- Title: Adaptive Domain Inference Attack with Concept Hierarchy
- Authors: Yuechun Gu, Jiajie He, Keke Chen
- Abstract summary: Most known model-targeted attacks assume attackers have learned the application domain or training data distribution.
Can removing the domain information from model APIs protect models from these attacks?
We show that the proposed adaptive domain inference attack (ADI) can still successfully estimate relevant subsets of training data.
- Abstract: With deep neural networks increasingly deployed in sensitive application domains such as healthcare and security, it is essential to understand what kind of sensitive information can be inferred from these models. Most known model-targeted attacks assume the attacker has learned the application domain or training-data distribution to ensure a successful attack. Can removing the domain information from model APIs protect models from these attacks? This paper studies this critical problem. Unfortunately, even with minimal knowledge, i.e., access to the model as an unnamed function without leaking the meaning of its inputs and outputs, the proposed adaptive domain inference attack (ADI) can still successfully estimate relevant subsets of the training data. We show that the extracted relevant data can significantly improve the performance of, for instance, model-inversion attacks. Specifically, the ADI method utilizes a concept hierarchy extracted from a collection of available public and private datasets and a novel algorithm to adaptively tune the likelihood of leaf concepts appearing in the unseen training data. We also design a straightforward hypothesis-testing-based attack, LDI. The ADI attack not only extracts partial training data at the concept level but also converges fastest and requires the fewest target-model accesses among all candidate methods. Our code is available at https://anonymous.4open.science/r/KDD-362D.
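As a hedged illustration of the adaptive tuning step (a minimal sketch under assumed details; the concept-hierarchy source, the sampling routine `sample_fn`, the `query_target` callable, and the confidence-based scoring are placeholders, not the authors' implementation), one can maintain a probability over the hierarchy's leaf concepts and re-weight it by how confidently the unnamed target model responds to public samples drawn from each concept:

```python
# Illustrative sketch only: adaptively re-weighting leaf concepts of a concept
# hierarchy by the target model's response confidence.
# `sample_fn` and `query_target` are hypothetical callables, not the paper's API.
import numpy as np

def softmax_confidence(logits):
    """Mean top-class softmax probability over a batch of logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return p.max(axis=1).mean()

def adaptive_domain_inference(leaf_concepts, sample_fn, query_target,
                              rounds=50, samples_per_leaf=8, lr=0.5):
    """Return an estimated likelihood that each leaf concept appears in the
    unseen training data, starting from a uniform prior."""
    probs = np.full(len(leaf_concepts), 1.0 / len(leaf_concepts))
    for _ in range(rounds):
        scores = np.zeros(len(leaf_concepts))
        for i, concept in enumerate(leaf_concepts):
            batch = sample_fn(concept, samples_per_leaf)  # public images tagged with this leaf concept
            logits = query_target(batch)                  # unnamed target model; output semantics unknown
            scores[i] = softmax_confidence(logits)        # confidence as a domain-similarity proxy
        # Multiplicative-weights-style update toward high-confidence concepts
        probs *= np.exp(lr * (scores - scores.mean()))
        probs /= probs.sum()
    return probs
```

Leaf concepts that retain high likelihood after convergence would form the estimated relevant subset, which, as the abstract notes, can then serve as auxiliary data for downstream attacks such as model inversion.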
Related papers
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
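A minimal sketch of the frequency-domain idea (illustrative only, not FreqFed's actual filtering or aggregation mechanism): flattened client updates are moved into the frequency domain with a DCT, only low-frequency components are kept, and aggregation happens there before transforming back.

```python
# Minimal sketch, not FreqFed's actual mechanism: transform flattened client
# updates with a DCT, keep low-frequency components, and average in the
# frequency domain before inverting the transform.
import numpy as np
from scipy.fft import dct, idct

def frequency_domain_aggregate(client_updates, keep_ratio=0.25):
    """client_updates: list of equal-length 1-D numpy arrays (flattened model deltas)."""
    spectra = np.stack([dct(u, norm="ortho") for u in client_updates])
    cutoff = max(1, int(spectra.shape[1] * keep_ratio))
    spectra[:, cutoff:] = 0.0              # discard high-frequency components
    mean_spectrum = spectra.mean(axis=0)   # aggregate in the frequency domain
    return idct(mean_spectrum, norm="ortho")
```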
arXiv Detail & Related papers (2023-12-07T16:56:24Z) - Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
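As a hedged illustration of the second step (a standard FGSM-style augmentation; the paper's exact losses, attack-model training, and black-box crafting procedure are not reproduced here), adversarial copies of public images could be generated against a surrogate classifier and added to the attack model's training data:

```python
# Illustrative FGSM-style augmentation; `surrogate_model` is an assumed
# white-box stand-in, since the attacked model itself is black-box.
import torch
import torch.nn.functional as F

def fgsm_augment(surrogate_model, images, labels, eps=0.03):
    """Return adversarially perturbed copies of `images` for data augmentation."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate_model(images), labels)
    loss.backward()
    adv = images + eps * images.grad.sign()   # one-step sign-gradient perturbation
    return adv.clamp(0.0, 1.0).detach()
```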
arXiv Detail & Related papers (2023-06-24T13:40:58Z) - A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots [14.998272283348152]
It is well known that an adversary can leverage a target ML model's output to steal the model's information.
We propose a new side channel for model information stealing attacks, i.e., models' scientific plots.
arXiv Detail & Related papers (2023-02-23T12:57:34Z) - Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network [102.21368201494909]
Model inversion (MI) attacks have raised increasing concerns about privacy.
Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space.
We propose a Pseudo Label-Guided MI (PLG-MI) attack via a conditional GAN (cGAN).
arXiv Detail & Related papers (2023-02-20T07:29:34Z) - GAN-based Domain Inference Attack [3.731168012111833]
We propose a generative adversarial network (GAN) based method to explore likely or similar domains of a target model.
We find that the target model may distract the training procedure less when a candidate domain is more similar to the target domain.
Our experiments show that the auxiliary dataset from an MDI top-ranked domain can effectively boost the result of model-inversion attacks.
arXiv Detail & Related papers (2022-12-22T15:40:53Z) - Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying dataset or model architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches.
These attacks have a long-term impact, decreasing model performance hundreds of epochs after they take place.
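A minimal sketch of the reordering idea (one illustrative policy, not the paper's exact attack): present training batches in an order chosen by the current loss, so the gradient sequence seen by SGD is systematically biased.

```python
# Illustrative batch-reordering sketch: rank batches by current loss and
# present them in that order. The paper studies several ordering policies;
# this shows only one possible choice.
import torch

def reorder_batches_by_loss(model, batches, loss_fn, ascending=True):
    """batches: list of (inputs, targets) pairs; returns a reordered list."""
    model.eval()
    with torch.no_grad():
        losses = [loss_fn(model(x), y).item() for x, y in batches]
    order = sorted(range(len(batches)), key=lambda i: losses[i],
                   reverse=not ascending)
    return [batches[i] for i in order]
```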
arXiv Detail & Related papers (2021-04-19T22:17:27Z) - Distill and Fine-tune: Effective Adaptation from a Black-box Source Model [138.12678159620248]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from related labeled source datasets to a new unlabeled target dataset.
We propose a novel two-step adaptation framework called Distill and Fine-tune (Dis-tune).
arXiv Detail & Related papers (2021-04-04T05:29:05Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z) - Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer [137.36099660616975]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a related but different well-labeled source domain to a new unlabeled target domain.
Most existing UDA methods require access to the source data, and thus are not applicable when the data are confidential and not shareable due to privacy concerns.
This paper aims to tackle a realistic setting in which only a trained classification model is available, instead of access to the source data.
arXiv Detail & Related papers (2020-12-14T07:28:50Z) - Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can access neither the model's information nor the training set, and cannot query the model.
We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective.
Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z) - Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)