Related papers: Dissecting Distribution Inference

Dissecting Distribution Inference

URL: http://arxiv.org/abs/2212.07591v2
Date: Fri, 5 Apr 2024 18:43:10 GMT
Title: Dissecting Distribution Inference
Authors: Anshuman Suri, Yifu Lu, Yanjin Chen, David Evans,
Abstract summary: A distribution inference attack aims to infer statistical properties of data used to train machine learning models. We develop a new black-box attack that even outperforms the best known white-box attack in most settings. We evaluate the effectiveness of previously proposed defenses and introduce new defenses.
Score: 8.14277881525535
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood and demonstrated attacks often rely on strong and unrealistic assumptions such as full knowledge of training environments even in supposedly black-box threat scenarios. To improve understanding of distribution inference risks, we develop a new black-box attack that even outperforms the best known white-box attack in most settings. Using this new attack, we evaluate distribution inference risk while relaxing a variety of assumptions about the adversary's knowledge under black-box access, like known model architectures and label-only access. Finally, we evaluate the effectiveness of previously proposed defenses and introduce new defenses. We find that although noise-based defenses appear to be ineffective, a simple re-sampling defense can be highly effective. Code is available at https://github.com/iamgroot42/dissecting_distribution_inference

Related papers

Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis [3.795071937009966]
Adrial attacks can jeopardize the integrity of Machine Learning (ML) models. We propose a framework that detects if an adversarial noise instance is being generated. We evaluate our approach against 8 state-of-the-art attacks, including adaptive attacks.
arXiv Detail & Related papers (2025-03-04T20:25:12Z)
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks. FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain. We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data. Recent attack methods can achieve a relatively high attack success rate (ASR) We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545]
Deep neural networks are vulnerable to adversarial examples that find samples close to the original image but can make the model misclassify. We propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time. Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
arXiv Detail & Related papers (2023-10-01T03:53:23Z)
Post-train Black-box Defense via Bayesian Boundary Correction [9.769249984262958]
We propose a new post-train black-box defense framework for deep neural networks. It can turn any pre-trained classifier into a resilient one with little knowledge of the model specifics. It is further equipped with a new post-train strategy keeps the victim intact, avoiding re-training.
arXiv Detail & Related papers (2023-06-29T14:33:20Z)
Adversarial Attacks Neutralization via Data Set Randomization [3.655021726150369]
Adversarial attacks on deep learning models pose a serious threat to their reliability and security. We propose a new defense mechanism that is rooted on hyperspace projection. We show that our solution increases the robustness of deep learning models against adversarial attacks.
arXiv Detail & Related papers (2023-06-21T10:17:55Z)
Formalizing and Estimating Distribution Inference Risks [11.650381752104298]
We propose a formal and general definition of property inference attacks. Our results show that inexpensive attacks are as effective as expensive meta-classifier attacks. We extend the state-of-the-art property inference attack to work on convolutional neural networks.
arXiv Detail & Related papers (2021-09-13T14:54:39Z)
Theoretical Study of Random Noise Defense against Query-Based Black-Box Attacks [72.8152874114382]
In this work, we study a simple but promising defense technique, dubbed Random Noise Defense (RND) against query-based black-box attacks. It is lightweight and can be directly combined with any off-the-shelf models and other defense strategies. In this work, we present solid theoretical analyses to demonstrate that the defense effect of RND against the query-based black-box attack and the corresponding adaptive attack heavily depends on the magnitude ratio.
arXiv Detail & Related papers (2021-04-23T08:39:41Z)
Provable Defense Against Delusive Poisoning [64.69220849669948]
We show that adversarial training can be a principled defense method against delusive poisoning. This implies that adversarial training can be a principled defense method against delusive poisoning.
arXiv Detail & Related papers (2021-02-09T09:19:47Z)
Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adrial attacks have threatened the application of deep neural networks in security-sensitive scenarios. We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks. We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z)
Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks [70.11599738647963]
Adversarial Training is one of the few defenses that withstand strong attacks. Traditional defense mechanisms assume a uniform attack over the examples according to the underlying data distribution. We present a weighted minimax risk optimization that defends against non-uniform attacks.
arXiv Detail & Related papers (2020-10-24T21:20:35Z)
Subpopulation Data Poisoning Attacks [18.830579299974072]
Poisoning attacks against machine learning induce adversarial modification of data used by a machine learning algorithm to selectively change its output when it is deployed. We introduce a novel data poisoning attack called a emphsubpopulation attack, which is particularly relevant when datasets are large and diverse. We design a modular framework for subpopulation attacks, instantiate it with different building blocks, and show that the attacks are effective for a variety of datasets and machine learning models.
arXiv Detail & Related papers (2020-06-24T20:20:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.