Revisiting Membership Inference Under Realistic Assumptions
- URL: http://arxiv.org/abs/2005.10881v5
- Date: Wed, 13 Jan 2021 20:44:44 GMT
- Title: Revisiting Membership Inference Under Realistic Assumptions
- Authors: Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer, Quanquan Gu,
David Evans
- Abstract summary: We study membership inference in settings where some of the assumptions typically used in previous research are relaxed.
This setting is more realistic than the balanced prior setting typically considered by researchers.
We develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function.
- Score: 87.13552321332988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study membership inference in settings where some of the assumptions
typically used in previous research are relaxed. First, we consider skewed
priors, covering cases where only a small fraction of the candidate pool
targeted by the adversary are actually members, and we develop a PPV-based metric
suitable for this setting. This setting is more realistic than the balanced
prior setting typically considered by researchers. Second, we consider
adversaries that select inference thresholds according to their attack goals
and develop a threshold selection procedure that improves inference attacks.
Since previous inference attacks fail in the imbalanced prior setting, we develop a
new inference attack based on the intuition that inputs corresponding to
training set members will be near a local minimum in the loss function, and
show that an attack that combines this with thresholds on the per-instance loss
can achieve high PPV even in settings where other attacks appear to be
ineffective. Code for our experiments can be found here:
https://github.com/bargavj/EvaluatingDPML.
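As a rough illustration of the attack described in the abstract (the authors' actual implementation is in the repository linked above), the sketch below combines a per-instance loss threshold with a perturbation-based check that an input sits near a local minimum of the loss, and includes the PPV metric advocated for skewed priors. It assumes a PyTorch-style classifier; the noise scale, trial count, and thresholds are illustrative placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F


def per_instance_loss(model, x, y):
    """Cross-entropy loss of a single (input, label) pair under the target model."""
    with torch.no_grad():
        logits = model(x.unsqueeze(0))
        return F.cross_entropy(logits, y.view(1)).item()


def local_minimum_score(model, x, y, noise_std=0.01, n_trials=50):
    """Fraction of small random perturbations that increase the loss.

    If most perturbations increase the loss, x likely sits near a local
    minimum of the loss surface -- the intuition used here to identify
    training-set members."""
    base = per_instance_loss(model, x, y)
    increases = sum(
        per_instance_loss(model, x + noise_std * torch.randn_like(x), y) > base
        for _ in range(n_trials)
    )
    return increases / n_trials


def predict_member(model, x, y, loss_threshold=0.5, score_threshold=0.9):
    """Flag a candidate as a member only when its loss is low AND its
    local-minimum score is high; the paper selects such thresholds
    according to the adversary's attack goal (values here are placeholders)."""
    return (per_instance_loss(model, x, y) < loss_threshold
            and local_minimum_score(model, x, y) > score_threshold)


def ppv(true_positives, false_positives):
    """Positive predictive value: of the candidates flagged as members,
    the fraction that truly are members -- the metric emphasized when the
    prior over members is skewed."""
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0
```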
Related papers
- Minimax rates of convergence for nonparametric regression under adversarial attacks [3.244945627960733]
We theoretically analyse the limits of robustness against adversarial attacks in a nonparametric regression setting.
Our work reveals that the minimax rate under adversarial attacks in the input is the same as the sum of two terms.
arXiv Detail & Related papers (2024-10-12T07:11:38Z) - AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain high adversarial robustness to protect against potential adversarial attacks.
A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models.
This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
arXiv Detail & Related papers (2024-05-14T18:05:19Z) - AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268]
Adversarial examples are typically optimized with gradient-based attacks.
Each new attack is shown to outperform its predecessors using different experimental setups, which yields overly optimistic and even biased evaluations.
arXiv Detail & Related papers (2024-04-30T11:19:05Z) - Membership inference attack with relative decision boundary distance [9.764492069791991]
Membership inference is one of the most popular privacy attacks in machine learning.
We propose a new attack method, called the multi-class adaptive membership inference attack, in the label-only setting.
arXiv Detail & Related papers (2023-06-07T02:29:58Z) - Implicit Poisoning Attacks in Two-Agent Reinforcement Learning:
Adversarial Policies for Training-Time Attacks [21.97069271045167]
In targeted poisoning attacks, an attacker manipulates an agent-environment interaction to force the agent into adopting a policy of interest, called the target policy.
We study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
We develop an optimization framework for designing optimal attacks, where the cost of the attack measures how much the solution deviates from the assumed default policy of the peer agent.
arXiv Detail & Related papers (2023-02-27T14:52:15Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., the single sample attack (SSA) and the triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Zero-Query Transfer Attacks on Context-Aware Object Detectors [95.18656036716972]
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results.
A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check.
We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check.
arXiv Detail & Related papers (2022-03-29T04:33:06Z) - Membership Inference Attacks From First Principles [24.10746844866869]
A membership inference attack allows an adversary to query a trained machine learning model to predict whether or not a particular example was contained in the model's training dataset.
These attacks are currently evaluated using average-case "accuracy" metrics that fail to characterize whether the attack can confidently identify any members of the training set.
We argue that attacks should instead be evaluated by computing their true-positive rate at low false-positive rates, and find most prior attacks perform poorly when evaluated in this way; a sketch of this metric appears after this list.
Our attack is 10x more powerful at low false-positive rates, and also strictly dominates prior attacks on existing metrics.
arXiv Detail & Related papers (2021-12-07T08:47:00Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
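The "Membership Inference Attacks From First Principles" entry above argues for evaluating attacks by their true-positive rate at a low false-positive rate rather than by average-case accuracy. A minimal sketch of that computation, assuming only arrays of per-example attack scores (higher meaning "more likely a member") and an illustrative target false-positive rate, might look like:

```python
import numpy as np


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=1e-3):
    """True-positive rate at a fixed false-positive rate.

    The decision threshold is chosen from the non-member scores so that at
    most a target_fpr fraction of non-members are (falsely) flagged, and the
    attack's TPR is then measured at that threshold."""
    nonmember_scores = np.sort(np.asarray(nonmember_scores))
    k = int(np.ceil((1.0 - target_fpr) * len(nonmember_scores)))
    k = min(k, len(nonmember_scores) - 1)
    threshold = nonmember_scores[k]
    return float(np.mean(np.asarray(member_scores) > threshold))


# Illustrative usage with synthetic scores (not real attack outputs).
rng = np.random.default_rng(0)
members = rng.normal(1.0, 1.0, 10_000)
nonmembers = rng.normal(0.0, 1.0, 10_000)
print(tpr_at_fpr(members, nonmembers, target_fpr=1e-3))
```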