Accurate, Explainable, and Private Models: Providing Recourse While
Minimizing Training Data Leakage
- URL: http://arxiv.org/abs/2308.04341v1
- Date: Tue, 8 Aug 2023 15:38:55 GMT
- Title: Accurate, Explainable, and Private Models: Providing Recourse While
Minimizing Training Data Leakage
- Authors: Catherine Huang, Chelse Swoopes, Christina Xiao, Jiaqi Ma, Himabindu
Lakkaraju
- Abstract summary: We present two novel methods to generate differentially private recourse.
We find that DPM and LR perform well in reducing what an adversary can infer.
- Score: 10.921553888358375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are increasingly utilized across impactful domains to
predict individual outcomes. As such, many models provide algorithmic recourse
to individuals who receive negative outcomes. However, recourse can be
leveraged by adversaries to disclose private information. This work presents
the first attempt at mitigating such attacks. We present two novel methods to
generate differentially private recourse: Differentially Private Model (DPM)
and Laplace Recourse (LR). Using logistic regression classifiers and real-world
and synthetic datasets, we find that DPM and LR perform well in reducing what
an adversary can infer, especially at low false positive rates (FPR). When the training dataset size is
large enough, we find particular success in preventing privacy leakage while
maintaining model and recourse accuracy with our novel LR method.
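
The abstract names the Laplace Recourse idea without spelling out the algorithm, so the snippet below is only a minimal sketch of what Laplace-noised recourse for a logistic regression classifier could look like. The sensitivity value, the margin, and the choice to noise every feature are assumptions made here for illustration; they are not the paper's exact method.

```python
# Illustrative sketch only, not the paper's LR algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression

def laplace_noised_recourse(clf, x, epsilon=1.0, sensitivity=1.0, margin=1e-3):
    """Return a differentially private recourse point for a rejected input x.

    The non-private recourse projects x onto the decision boundary of the
    logistic regression classifier and steps slightly past it so the
    prediction flips; Laplace noise with scale sensitivity/epsilon is then
    added to each feature (the standard Laplace mechanism).
    """
    w = clf.coef_.ravel()
    b = clf.intercept_[0]
    # Step from x to the hyperplane w.x + b = 0, plus a small margin beyond it.
    step = (-(w @ x + b) / (w @ w) + margin) * w
    recourse = x + step
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=x.shape)
    return recourse + noise

# Example usage on synthetic data (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y)
x_rejected = X[y == 0][0]
print(laplace_noised_recourse(clf, x_rejected, epsilon=0.5))
```

As with any Laplace mechanism, smaller values of epsilon give stronger privacy but noisier, and therefore less actionable, recourse, which mirrors the privacy/utility tradeoff discussed in the abstract.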
Related papers
- Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs [44.8238758047607]
  Current unlearning methods for LLMs optimize on the private information they seek to remove by incorporating it into their training objectives. We argue this not only risks reinforcing exposure to sensitive data, but also contradicts the principle of minimizing its use. We propose a novel unlearning method, Partial Model Collapse (PMC), which does not require unlearning targets in the unlearning objective.
  arXiv Detail & Related papers (2025-07-06T03:08:49Z)
- ARMOR: Shielding Unlearnable Examples against Data Augmentation [25.289775916629505]
  We propose a framework, dubbed ARMOR, to protect data privacy from potential breaches of data augmentation.
  ARMOR reduces the test accuracy of the model trained on augmented protected samples by as much as 60% more than baselines.
  arXiv Detail & Related papers (2025-01-15T15:22:57Z)
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
  We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
  Our method achieves over 20% improvements in forgetting error compared to the state of the art.
  arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage [12.892449128678516]
  Fine-tuning language models on private data for downstream applications poses significant privacy risks.
  Several popular community platforms now offer convenient distribution of a large variety of pre-trained models.
  We introduce a novel poisoning technique that uses model unlearning as an attack tool.
  arXiv Detail & Related papers (2024-08-30T15:35:09Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
  We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
  We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
  arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
  We propose a novel training framework based on a relaxed loss with a more achievable learning target.
  RelaxLoss is applicable to any classification model, with the added benefits of easy implementation and negligible overhead.
  Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
  arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks [28.502489028888608]
  Unlearnable examples (ULEs) aim to protect data from unauthorized use in training DNNs.
  In adversarial training, the unlearnability of error-minimizing noise severely degrades.
  We propose a novel model-free method, named One-Pixel Shortcut, which perturbs only a single pixel of each image and makes the dataset unlearnable.
  arXiv Detail & Related papers (2022-05-24T15:17:52Z)
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
  We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
  Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
  arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
  Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
  We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
  arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
  Model inversion (MI) attacks aim to reconstruct training data from model parameters.
  We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
  Our experiments show that the combination of these techniques can significantly boost the success rate of state-of-the-art MI attacks by 150%.
  arXiv Detail & Related papers (2020-10-08T16:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.