Beyond Random Noise: Insights on Anonymization Strategies from a Latent
Bandit Study
- URL: http://arxiv.org/abs/2310.00221v1
- Date: Sat, 30 Sep 2023 01:56:04 GMT
- Title: Beyond Random Noise: Insights on Anonymization Strategies from a Latent
Bandit Study
- Authors: Alexander Galozy, Sadi Alawadi, Victor Kebande, Sławomir Nowaczyk
- Abstract summary: This paper investigates the issue of privacy in a learning scenario where users share knowledge for a recommendation task.
We use the latent bandit setting to evaluate the trade-off between privacy and recommender performance.
- Score: 44.94720642208655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the issue of privacy in a learning scenario where
users share knowledge for a recommendation task. Our study contributes to the
growing body of research on privacy-preserving machine learning and underscores
the need for tailored privacy techniques that address specific attack patterns
rather than relying on one-size-fits-all solutions. We use the latent bandit
setting to evaluate the trade-off between privacy and recommender performance
by employing various aggregation strategies, such as averaging, nearest
neighbor, and clustering combined with noise injection. More specifically, we
simulate a linkage attack scenario leveraging publicly available auxiliary
information acquired by the adversary. Our results on three open real-world
datasets reveal that adding noise using the Laplace mechanism to an individual
user's data record is a poor choice. It provides the highest regret for any
noise level, relative to de-anonymization probability and the ADS metric.
Instead, one should combine noise with appropriate aggregation strategies. For
example, using averages from clusters of different sizes provides flexibility
not achievable by varying the amount of noise alone. Generally, no single
aggregation strategy can consistently achieve the optimum regret for a given
desired level of privacy.
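
As a rough illustration of the contrast the abstract draws, the sketch below perturbs a single user's record with Laplace noise and, alternatively, perturbs the average of a small cluster of records. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy preference vectors, and the noise scale are all hypothetical, and the abstract does not specify how the noise is calibrated or how clusters are formed.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_record(record, scale, rng):
    """Add independent Laplace noise to every entry of one record."""
    return [x + laplace_noise(scale, rng) for x in record]


def noisy_cluster_average(records, scale, rng):
    """Average a cluster of records element-wise, then add Laplace noise."""
    n = len(records)
    dim = len(records[0])
    mean = [sum(r[j] for r in records) / n for j in range(dim)]
    return perturb_record(mean, scale, rng)


# Hypothetical toy data: three users' preference vectors in one cluster.
rng = random.Random(0)
cluster = [[0.9, 0.1, 0.4], [0.8, 0.2, 0.5], [0.7, 0.3, 0.6]]

# Strategy 1: noise on an individual user's record.
noisy_single = perturb_record(cluster[0], scale=0.1, rng=rng)

# Strategy 2: noise on the cluster average (aggregation plus noise).
noisy_mean = noisy_cluster_average(cluster, scale=0.1, rng=rng)
```

In this sketch the cluster size acts as a second knob alongside the noise scale, which is the kind of flexibility the abstract attributes to averaging over clusters of different sizes.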
Related papers
- Privacy-Preserving Dynamic Assortment Selection [4.399892832075127]
This paper presents a novel framework for privacy-preserving dynamic assortment selection using the multinomial logit (MNL) bandits model.
Our approach integrates noise into user utility estimates to balance between exploration and exploitation while ensuring robust privacy protection.
arXiv Detail & Related papers (2024-10-29T19:28:01Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Evaluating the Impact of Local Differential Privacy on Utility Loss via Influence Functions [11.504012974208466]
We demonstrate the ability of influence functions to offer insight into how a specific privacy parameter value will affect a model's test loss.
Our proposed method allows a data curator to select the privacy parameter best aligned with their allowed privacy-utility trade-off.
arXiv Detail & Related papers (2023-09-15T18:08:24Z)
- Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite Objectives [10.528569272279999]
This paper proposes a differentially private federated learning algorithm for strongly convex but possibly nonsmooth problems.
The proposed algorithm adds artificial noise to the shared information to ensure privacy and dynamically allocates the time-varying noise variance to minimize an upper bound of the optimization error.
Numerical results show the superiority of the proposed algorithm over state-of-the-art methods.
arXiv Detail & Related papers (2023-08-02T13:30:33Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Enabling Trade-offs in Privacy and Utility in Genomic Data Beacons and Summary Statistics [26.99521354120141]
We introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy.
In the first, an attacker applies a likelihood-ratio test to make membership-inference claims.
In the second, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals.
arXiv Detail & Related papers (2023-01-11T19:16:13Z)
- Learning with Group Noise [106.56780716961732]
We propose a novel Max-Matching method for learning with group noise.
The performance on a range of real-world datasets across several learning paradigms demonstrates the effectiveness of Max-Matching.
arXiv Detail & Related papers (2021-03-17T06:57:10Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information.
We propose a Rényi-differentially private-GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noises on the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
- Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification [14.095523601311374]
We consider the problem of publicly releasing a dataset for support vector machine classification while not infringing on the privacy of data subjects.
The dataset is systematically obfuscated using an additive noise for privacy protection.
Conditions are established for ensuring that the classifier extracted from the original dataset and the obfuscated one are close to each other.
arXiv Detail & Related papers (2019-12-29T03:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.