Information Leakage from Data Updates in Machine Learning Models
- URL: http://arxiv.org/abs/2309.11022v1
- Date: Wed, 20 Sep 2023 02:55:03 GMT
- Title: Information Leakage from Data Updates in Machine Learning Models
- Authors: Tian Hui, Farhad Farokhi, Olga Ohrimenko
- Abstract summary: We consider the setting where machine learning models are retrained on updated datasets in order to incorporate the most up-to-date information or reflect distribution shifts.
Here, the adversary has access to snapshots of the machine learning model before and after the change in the dataset occurs.
We propose attacks based on the difference in the prediction confidence of the original model and the updated model.
- Score: 12.337195143722342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we consider the setting where machine learning models are
retrained on updated datasets in order to incorporate the most up-to-date
information or reflect distribution shifts. We investigate whether one can
infer information about these updates in the training data (e.g., changes to
attribute values of records). Here, the adversary has access to snapshots of
the machine learning model before and after the change in the dataset occurs.
Contrary to the existing literature, we assume that an attribute of one or
more training data points is changed, rather than entire data records being
removed or added. We propose attacks based on the difference in the prediction
confidence of the original model and the updated model. We evaluate our attack
methods on two public datasets along with multi-layer perceptron and logistic
regression models. We validate that two snapshots of the model can result in
higher information leakage in comparison to having access to only the updated
model. Moreover, we observe that data records with rare values are more
vulnerable to attacks, which points to the disparate vulnerability of privacy
attacks in the update setting. When multiple records with the same original
attribute value are updated to the same new value (i.e., repeated changes), the
attacker is more likely to correctly guess the updated values since repeated
changes leave a larger footprint on the trained model. These observations point
to the vulnerability of machine learning models to attribute inference attacks in
the update setting.
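To make the two-snapshot idea concrete, below is a minimal sketch of a confidence-difference attack, assuming sklearn-style models that expose `predict_proba`; the function name, the candidate-enumeration loop, and the scoring rule are illustrative and not the paper's exact algorithm.
```python
import numpy as np

def guess_updated_attribute(model_old, model_new, record, label_index,
                            attr_index, candidate_values):
    """Hypothetical sketch: score each candidate value of the changed attribute
    by how much the confidence on the record's label rises from the snapshot
    taken before the update to the snapshot taken after it."""
    scores = {}
    for value in candidate_values:
        probe = np.array(record, dtype=float)   # copy so the original record is untouched
        probe[attr_index] = value               # plug in the candidate attribute value
        conf_old = model_old.predict_proba(probe.reshape(1, -1))[0, label_index]
        conf_new = model_new.predict_proba(probe.reshape(1, -1))[0, label_index]
        scores[value] = conf_new - conf_old     # larger gain => more likely the new value
    # Return the best guess along with all candidate scores.
    return max(scores, key=scores.get), scores
```
Dropping the `conf_old` term roughly corresponds to the single-snapshot baseline, which the abstract reports as leaking less than access to both snapshots.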
Related papers
- Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable [30.22146634953896]
We show how to mount a near-perfect attack on the deleted data point from linear regression models.
Our work highlights that privacy risk is significant even for extremely simple model classes when individuals can request deletion of their data from the model.
arXiv Detail & Related papers (2024-05-30T17:27:44Z)
- JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction [59.94977231327573]
We propose a multi-task learning model with value generation/classification and attribute prediction called JPAVE.
Two variants of our model are designed for open-world and closed-world scenarios.
Experimental results on a public dataset demonstrate the superiority of our model compared with strong baselines.
arXiv Detail & Related papers (2023-11-07T18:36:16Z)
- Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
- Client-specific Property Inference against Secure Aggregation in Federated Learning [52.8564467292226]
Federated learning has become a widely used paradigm for collaboratively training a common model among different participants.
Many attacks have shown that it is still possible to infer sensitive information about participants, such as membership or properties, or even to reconstruct their data outright.
We show that simple linear models can effectively capture client-specific properties using only the aggregated model updates.
arXiv Detail & Related papers (2023-03-07T14:11:01Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, you are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Revisiting the Updates of a Pre-trained Model for Few-shot Learning [11.871523410051527]
We compare the two popular updating methods, fine-tuning and linear probing.
We find that fine-tuning is better than linear probing as the number of samples increases.
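For readers unfamiliar with the two update strategies, the sketch below shows what each means in PyTorch, using a torchvision ResNet-18 as a stand-in backbone; this only illustrates the terminology and is not the paper's experimental setup.
```python
import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet18

def linear_probe(num_classes):
    """Linear probing: freeze the pre-trained backbone and train only a new head."""
    model = resnet18(weights="IMAGENET1K_V1")
    for p in model.parameters():
        p.requires_grad = False                               # backbone stays fixed
    model.fc = nn.Linear(model.fc.in_features, num_classes)   # fresh head is trainable
    return model, optim.SGD(model.fc.parameters(), lr=1e-2)

def fine_tune(num_classes):
    """Fine-tuning: update every parameter, typically with a smaller learning rate."""
    model = resnet18(weights="IMAGENET1K_V1")
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model, optim.SGD(model.parameters(), lr=1e-3)
```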
arXiv Detail & Related papers (2022-05-13T08:47:06Z)
- How to Combine Membership-Inference Attacks on Multiple Updated Models [14.281721121930035]
This paper proposes new attacks that take advantage of one or more model updates to improve membership inference (MI).
A key part of our approach is to leverage rich information from standalone MI attacks mounted separately against the original and updated models.
Our results on four public datasets show that our attacks are effective at using update information to give the adversary a significant advantage over attacks on standalone models.
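As a toy illustration of the combination idea, the snippet below runs a simple confidence-based MI signal against both the original and the updated snapshot and thresholds their sum; the models are assumed to expose sklearn-style `predict_proba`, and the combination rule and threshold are placeholders rather than the paper's method.
```python
import numpy as np

def mi_signal(model, x, y):
    """Standalone membership signal: the confidence the model assigns to the
    true label (higher typically suggests the point was seen in training)."""
    return model.predict_proba(np.asarray(x, dtype=float).reshape(1, -1))[0, y]

def combined_mi_attack(model_old, model_new, x, y, threshold=1.2):
    """Toy combination rule: sum the standalone signals from the two model
    snapshots and guess "member" when the sum exceeds a tuned threshold."""
    return mi_signal(model_old, x, y) + mi_signal(model_new, x, y) > threshold
```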
arXiv Detail & Related papers (2022-05-12T21:14:11Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying dataset or model architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches.
These attacks have a long-term impact in that they decrease model performance hundreds of epochs after they take place.
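A minimal sketch of a batch-reordering adversary in this spirit is shown below; the label-grouping rule and the function name are illustrative rather than the paper's exact attacks.
```python
import numpy as np

def label_grouped_batches(X, y, batch_size):
    """Illustrative data-ordering attack: yield batches in label-sorted order
    instead of shuffled order, so most batches are nearly single-class, which
    destabilises SGD without touching the data or the model."""
    order = np.argsort(y, kind="stable")   # all class-0 samples first, then class-1, ...
    X_sorted, y_sorted = X[order], y[order]
    for start in range(0, len(y_sorted), batch_size):
        yield X_sorted[start:start + batch_size], y_sorted[start:start + batch_size]
```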
arXiv Detail & Related papers (2021-04-19T22:17:27Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)