Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability
- URL: http://arxiv.org/abs/2407.01306v1
- Date: Mon, 1 Jul 2024 14:07:46 GMT
- Title: Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability
- Authors: Chenxi Li, Abhinav Kumar, Zhen Guo, Jie Hou, Reza Tourani
- Abstract summary: We propose an attack-driven explainable framework to identify the most influential features of raw data that lead to successful membership inference attacks.
Our proposed MIA achieves an improvement of up to 26% over state-of-the-art MIAs.
- Score: 10.632831321114502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing prominence of deep learning applications and reliance on personalized data underscore the urgent need to address privacy vulnerabilities, particularly Membership Inference Attacks (MIAs). Despite numerous MIA studies, significant knowledge gaps persist, particularly regarding the effect of hidden features (in isolation) on attack efficacy and the lack of justification for the root causes of attacks in terms of raw data features. In this paper, we aim to close these gaps by first exploring statistical approaches to identify the most informative neurons and quantifying the significance of the hidden activations of the selected neurons on attack accuracy, both in isolation and in combination. Additionally, we propose an attack-driven explainable framework that integrates the target and attack models to identify the most influential features of the raw data that lead to successful membership inference. Our proposed MIA achieves an improvement of up to 26% over state-of-the-art MIAs.
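The neuron-selection step lends itself to a brief illustration. The sketch below is a minimal, hypothetical example rather than the authors' implementation: it scores each hidden neuron by how strongly its activation separates known members from non-members (here with Welch's t-test) and then trains a simple attack classifier on the top-k activations. The choice of test, the classifier, and all function names are assumptions.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import LogisticRegression

def select_informative_neurons(act_members, act_nonmembers, k=32):
    """Rank hidden neurons by how strongly their activations separate
    training members from non-members (Welch's t-test) and return
    the indices of the top-k neurons."""
    # act_*: (n_samples, n_neurons) arrays of hidden-layer activations
    t_stats, _ = ttest_ind(act_members, act_nonmembers,
                           axis=0, equal_var=False)
    return np.argsort(-np.abs(t_stats))[:k]

def fit_attack_model(act_members, act_nonmembers, neuron_idx):
    """Train a binary attack classifier on the selected activations."""
    X = np.vstack([act_members[:, neuron_idx],
                   act_nonmembers[:, neuron_idx]])
    y = np.concatenate([np.ones(len(act_members)),
                        np.zeros(len(act_nonmembers))])
    return LogisticRegression(max_iter=1000).fit(X, y)

# Usage: act_in / act_out hold a chosen layer's activations for known
# member / non-member samples (whitebox access assumed).
# idx = select_informative_neurons(act_in, act_out)
# attack = fit_attack_model(act_in, act_out, idx)
# membership_score = attack.predict_proba(act_query[:, idx])[:, 1]
```

Repeating the procedure for single neurons, or for growing neuron subsets, mirrors the isolation-versus-combination analysis described in the abstract.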
Related papers
- Membership Inference Attacks Against In-Context Learning [26.57639819629732]
We present the first membership inference attack tailored for In-Context Learning (ICL).
We propose four attack strategies tailored to various constrained scenarios.
We investigate three potential defenses targeting data, instruction, and output.
arXiv Detail & Related papers (2024-09-02T17:23:23Z)
- Celtibero: Robust Layered Aggregation for Federated Learning [0.0]
We introduce Celtibero, a novel defense mechanism that integrates layered aggregation to enhance robustness against adversarial manipulation.
We demonstrate that Celtibero consistently achieves high main task accuracy (MTA) while maintaining minimal attack success rates (ASR) across a range of untargeted and targeted poisoning attacks (a generic layered-aggregation sketch follows below).
arXiv Detail & Related papers (2024-08-26T12:54:00Z)
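The summary above does not spell out Celtibero's aggregation pipeline, so the sketch below only illustrates the general pattern of layered aggregation in federated learning: a filtering layer discards outlier client updates (here, by distance to the coordinate-wise median) before a plain averaging layer. The keep fraction, the distance metric, and the function name are assumptions, not Celtibero's actual design.

```python
import numpy as np

def layered_aggregate(updates, keep_fraction=0.8):
    """Two-layer aggregation sketch for federated learning:
    layer 1 filters suspicious client updates, layer 2 averages
    the survivors. `updates` is an (n_clients, n_params) array."""
    # Layer 1: score each update by its L2 distance to the
    # coordinate-wise median and keep the closest fraction.
    median = np.median(updates, axis=0)
    dists = np.linalg.norm(updates - median, axis=1)
    n_keep = max(1, int(keep_fraction * len(updates)))
    survivors = updates[np.argsort(dists)[:n_keep]]
    # Layer 2: plain federated averaging over the filtered set.
    return survivors.mean(axis=0)
```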
- GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation [4.332441337407564]
We explore a connection between the susceptibility to membership inference attacks and the vulnerability to distillation-based functionality stealing attacks.
We propose GLiRA, a distillation-guided approach to membership inference attack on the black-box neural network.
We evaluate the proposed method across multiple image classification datasets and models, and demonstrate that likelihood-ratio attacks, when guided by knowledge distillation, outperform the current state-of-the-art membership inference attacks in the black-box setting (see the sketch below).
arXiv Detail & Related papers (2024-05-13T08:52:04Z)
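One plausible reading of a distillation-guided pipeline, offered only as a sketch (the loss, the training setup, and all names below are assumptions, not GLiRA's exact method): fit a whitebox surrogate to the black-box target's output probabilities, then use the surrogate's per-example loss as a likelihood-ratio-style membership statistic.

```python
import torch
import torch.nn.functional as F

def distill_surrogate(surrogate, query_target, loader, epochs=5):
    """Fit a whitebox surrogate to a black-box target by matching
    the target's output distribution (knowledge distillation)."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                soft = query_target(x)  # black-box class probabilities
            loss = F.kl_div(F.log_softmax(surrogate(x), dim=1),
                            soft, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return surrogate

def membership_score(surrogate, x, y):
    """Likelihood-ratio-style statistic: members of the target's
    training set tend to incur lower loss under a faithful surrogate."""
    with torch.no_grad():
        loss = F.cross_entropy(surrogate(x), y, reduction="none")
    return -loss  # higher score => more likely a member
```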
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to model stealing attacks, nefarious endeavors geared towards duplicating the target model via query permissions.
We introduce three model stealing attacks adapted to different real-world scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- Fundamental Limits of Membership Inference Attacks on Machine Learning Models [29.367087890055995]
Membership inference attacks (MIA) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals.
This article provides theoretical guarantees by exploring the fundamental statistical limitations associated with MIAs on machine learning models (the standard hypothesis-testing formalization is sketched below).
arXiv Detail & Related papers (2023-10-20T19:32:54Z)
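The summary does not reproduce the bounds themselves; for orientation, the widely used formalization below treats membership inference as a hypothesis test and measures an attack by its membership advantage. The notation is a common convention, not necessarily the paper's.

```latex
% Membership inference as a hypothesis test: given a trained model
% \theta and a candidate point z, the attacker A decides between
%   H_0: z \notin D_{\mathrm{train}}   vs.   H_1: z \in D_{\mathrm{train}}.
% A standard success measure is the membership advantage,
\[
\mathrm{Adv}(A) =
\Pr\bigl[A(\theta, z) = 1 \mid z \in D_{\mathrm{train}}\bigr]
- \Pr\bigl[A(\theta, z) = 1 \mid z \notin D_{\mathrm{train}}\bigr],
\]
% i.e. true-positive rate minus false-positive rate; fundamental
% limits upper-bound this quantity for any attacker A.
```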
- Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
Our study explores up to five attack algorithms across three datasets.
We identify human-identifiable features in adversarial perturbations.
Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
arXiv Detail & Related papers (2023-09-28T22:31:29Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
However, FL faces vulnerabilities such as poisoning attacks, which undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed MESAS is the first defense robust against strong adaptive adversaries; it is effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we dynamically reweight the objective's loss terms so the network focuses more on representation learning for uncertain classes (see the sketch below).
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
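As a rough illustration of the reweighting idea only (an approximation under assumptions, not the paper's exact objective): an evidential head outputs Dirichlet parameters, and each per-class error term is weighted by a diagonal Fisher-information term, so low-evidence classes contribute more strongly to the loss.

```python
import torch

def fisher_weighted_evidential_loss(logits, targets_onehot):
    """Sketch of a Fisher-information-weighted evidential loss.
    An evidential head maps logits to Dirichlet parameters
    alpha = evidence + 1; each class's squared-error term is then
    weighted by trigamma(alpha), an approximation of the diagonal
    of the Dirichlet Fisher information. Illustrative only."""
    evidence = torch.nn.functional.softplus(logits)
    alpha = evidence + 1.0
    strength = alpha.sum(dim=1, keepdim=True)
    prob = alpha / strength                  # expected class probs
    fim_diag = torch.polygamma(1, alpha)     # trigamma(alpha_j)
    sq_err = (targets_onehot - prob) ** 2
    # Low evidence => small alpha => large trigamma => larger weight,
    # so uncertain classes dominate the objective.
    return (fim_diag * sq_err).sum(dim=1).mean()
```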
- Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning [114.9857000195174]
A major challenge to widespread industrial adoption of deep reinforcement learning is the potential vulnerability to privacy breaches.
We propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
arXiv Detail & Related papers (2021-09-08T23:44:57Z)
- Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning [51.15273664903583]
Data heterogeneity has been identified as one of the key features of federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdoor attacks in federated learning through comprehensive experiments using synthetic data and the LEAF benchmarks.
arXiv Detail & Related papers (2021-02-01T06:06:21Z)
- Characterizing the Evasion Attackability of Multi-label Classifiers [37.00606062677375]
Evasion attacks in multi-label learning systems are an interesting, widely witnessed, yet rarely explored research topic.
Characterizing the crucial factors that determine the attackability of the multi-label adversarial threat is the key to interpreting the origin of the vulnerability.
We propose an efficient empirical attackability estimator based on greedy label space exploration (a hypothetical sketch follows below).
arXiv Detail & Related papers (2020-12-17T07:34:40Z)
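The greedy exploration can be pictured as follows; this is a hypothetical sketch, since the estimator's exact objective is not given in the summary, and treating per-label flip costs as additive is an added simplifying assumption.

```python
import numpy as np

def greedy_attackability(flip_costs, budget):
    """Greedy estimate of how many labels of a multi-label
    classifier can be flipped for one input within a perturbation
    budget. `flip_costs[j]` is a proxy for the cost of flipping
    label j, e.g. its margin to the decision boundary."""
    order = np.argsort(flip_costs)  # try the cheapest flips first
    flipped, spent = [], 0.0
    for j in order:
        if spent + flip_costs[j] > budget:
            break  # the next-cheapest flip no longer fits the budget
        spent += flip_costs[j]
        flipped.append(j)
    # The count of flipped labels serves as the attackability score.
    return len(flipped), flipped
```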
This list is automatically generated from the titles and abstracts of the papers on this site.