Unlearning Protected User Attributes in Recommendations with Adversarial
Training
- URL: http://arxiv.org/abs/2206.04500v1
- Date: Thu, 9 Jun 2022 13:36:28 GMT
- Title: Unlearning Protected User Attributes in Recommendations with Adversarial
Training
- Authors: Christian Ganhör, David Penz, Navid Rekabsaz, Oleg Lesota, Markus Schedl
- Abstract summary: Collaborative filtering algorithms capture underlying consumption patterns, including the ones specific to particular demographics or protected information of users.
These encoded biases can influence the decision of a recommendation system towards further separation of the contents provided to various demographic subgroups.
In this work, we investigate the possibility and challenges of removing specific protected information of users from the learned interaction representations of an RS algorithm.
- Score: 10.268369743620159
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collaborative filtering algorithms capture underlying consumption patterns,
including the ones specific to particular demographics or protected information
of users, e.g. gender, race, and location. These encoded biases can influence
the decision of a recommendation system (RS) towards further separation of the
contents provided to various demographic subgroups, and raise privacy concerns
regarding the disclosure of users' protected attributes. In this work, we
investigate the possibility and challenges of removing specific protected
information of users from the learned interaction representations of an RS
algorithm, while maintaining its effectiveness. Specifically, we incorporate
adversarial training into the state-of-the-art MultVAE architecture, resulting
in a novel model, Adversarial Variational Auto-Encoder with Multinomial
Likelihood (Adv-MultVAE), which aims at removing the implicit information of
protected attributes while preserving recommendation performance. We conduct
experiments on the MovieLens-1M and LFM-2b-DemoBias datasets, and evaluate the
effectiveness of the bias mitigation method by the inability of external
attackers to reveal the users' gender information from the model. Compared
with the baseline MultVAE, the results show that Adv-MultVAE, with marginal
deterioration in performance (w.r.t. NDCG and recall), largely mitigates
inherent biases in the model on both datasets.
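As a concrete illustration of how adversarial unlearning can be wired into a MultVAE-style model, below is a minimal PyTorch sketch using a gradient reversal layer, so that a single backward pass trains the adversary head to predict the protected attribute while pushing the encoder to erase it. The layer sizes, module names, loss weights, and shallow one-layer encoder/decoder are illustrative assumptions, not the paper's actual architecture.

```python
# Hedged sketch of Adv-MultVAE-style adversarial attribute removal.
# Hyperparameters and the shallow encoder/decoder are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # no gradient for lam

class AdvMultVAESketch(nn.Module):
    def __init__(self, n_items, latent_dim=200, grl_lambda=1.0):
        super().__init__()
        self.encoder = nn.Linear(n_items, 2 * latent_dim)  # -> mu, logvar
        self.decoder = nn.Linear(latent_dim, n_items)      # multinomial logits
        self.adversary = nn.Linear(latent_dim, 2)          # e.g. binary gender head
        self.grl_lambda = grl_lambda

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        item_logits = self.decoder(z)
        attr_logits = self.adversary(GradReverse.apply(z, self.grl_lambda))
        return item_logits, mu, logvar, attr_logits

def loss_fn(item_logits, x, mu, logvar, attr_logits, attr, beta=0.2):
    # Multinomial log-likelihood over the user's interaction vector x.
    recon = -(F.log_softmax(item_logits, dim=-1) * x).sum(-1).mean()
    kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    adv = F.cross_entropy(attr_logits, attr)
    # Because of the GRL, minimizing `adv` trains the adversary head while
    # *maximizing* the attribute loss w.r.t. the encoder, removing the signal.
    return recon + beta * kld + adv
```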
Related papers
- Simultaneous Unlearning of Multiple Protected User Attributes From Variational Autoencoder Recommenders Using Adversarial Training [8.272412404173954]
We present AdvXMultVAE, which aims to unlearn multiple protected attributes simultaneously to improve fairness across demographic user groups.
Our experiments on two datasets, LFM-2b-100k and ML-1M, show that our approach can yield better results than its singular-removal counterparts.
arXiv Detail & Related papers (2024-10-28T12:36:00Z)
- Analyzing Inference Privacy Risks Through Gradients in Machine Learning [17.2657358645072]
We present a unified game-based framework that encompasses a broad range of attacks including attribute, property, distributional, and user disclosures.
Our results demonstrate the inefficacy of solely relying on data aggregation to achieve privacy against inference attacks in distributed learning.
arXiv Detail & Related papers (2024-08-29T21:21:53Z)
- InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling [66.3072381478251]
Reward hacking, also termed reward overoptimization, remains a critical challenge.
We propose a reward modeling framework, InfoRM, which introduces a variational information bottleneck objective.
We show that InfoRM's overoptimization detection mechanism is not only effective but also robust across a broad range of datasets.
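For reference, a variational information-bottleneck objective of the kind the summary names typically takes the following form; the notation and exact decomposition here are our assumption, not necessarily InfoRM's precise loss:

```latex
% Generic variational IB bound for a reward model with input x, reward label y,
% and bottleneck latent z (assumed form):
\max_{\phi,\theta}\;
\mathbb{E}_{x,y}\Big[\mathbb{E}_{z\sim q_\phi(z\mid x)}\big[\log p_\theta(y\mid z)\big]\Big]
\;-\;\beta\,\mathbb{E}_{x}\Big[\mathrm{KL}\big(q_\phi(z\mid x)\,\big\|\,p(z)\big)\Big]
```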
arXiv Detail & Related papers (2024-02-14T17:49:07Z)
- Making Users Indistinguishable: Attribute-wise Unlearning in Recommender Systems [28.566330708233824]
We find that attackers can extract private information, i.e., gender, race, and age, from a trained model even if these attributes were not explicitly encountered during training.
To protect the sensitive attribute of users, Attribute Unlearning (AU) aims to degrade attacking performance and make target attributes indistinguishable.
arXiv Detail & Related papers (2023-10-06T09:36:44Z)
- Ex-Ante Assessment of Discrimination in Dataset [20.574371560492494]
Data owners face increasing liability for how the use of their data could harm underprivileged communities.
We propose FORESEE, a FORESt of decision trEEs algorithm, which generates a score that captures how likely an individual's response varies with sensitive attributes.
arXiv Detail & Related papers (2022-08-16T19:28:22Z)
- Debiasing Learning for Membership Inference Attacks Against Recommender Systems [79.48353547307887]
Learned recommender systems may inadvertently leak information about their training data, leading to privacy violations.
We investigate privacy threats faced by recommender systems through the lens of membership inference.
We propose a Debiasing Learning for Membership Inference Attacks against recommender systems (DL-MIA) framework that has four main components.
arXiv Detail & Related papers (2022-06-24T17:57:34Z)
- Learning Fair Representations via Rate-Distortion Maximization [16.985698188471016]
We present Fairness-aware Rate Maximization (FaRM), which removes demographic information by making representations of instances belonging to the same protected attribute class uncorrelated, using the rate-distortion function.
FaRM achieves state-of-the-art performance on several datasets, and its learned representations leak significantly less protected-attribute information under attack by a non-linear probing network.
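For orientation, the coding-rate function commonly used in the rate-distortion line of work FaRM draws on is given below; whether FaRM uses exactly this form is our assumption:

```latex
% Coding rate of representations Z = [z_1, ..., z_n] in R^{d x n} at distortion eps
% (standard coding-rate form; assumed, not verified against FaRM):
R(Z;\epsilon) \;=\; \frac{1}{2}\,\log\det\!\Big(I_d + \frac{d}{n\,\epsilon^{2}}\,ZZ^{\top}\Big)
```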
arXiv Detail & Related papers (2022-01-31T19:00:52Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
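Neural mutual-information estimators of the kind mentioned here are commonly built on the Donsker-Varadhan lower bound (MINE-style); that CSAD uses this particular bound is our assumption:

```latex
% Donsker-Varadhan bound underlying MINE-style neural MI estimators,
% with a critic network T_theta (assumed estimator form):
I(X;Y)\;\ge\;\sup_{T_\theta}\;
\mathbb{E}_{p(x,y)}\big[T_\theta(x,y)\big]
\;-\;\log\,\mathbb{E}_{p(x)\,p(y)}\big[e^{T_\theta(x,y)}\big]
```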
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
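A minimal sketch of such loss perturbation as we read the summary, assuming the Gaussian mechanism; the noise form and the calibration below are our assumptions:

```latex
% Perturb the loss value with Gaussian noise; for loss sensitivity Delta_L,
% the Gaussian mechanism satisfies (alpha, eps(alpha))-Renyi DP (assumed calibration):
\tilde{\mathcal{L}} \;=\; \mathcal{L} + \mathcal{N}(0,\sigma^{2}),
\qquad
\varepsilon(\alpha)\;=\;\frac{\alpha\,\Delta_{\mathcal{L}}^{2}}{2\sigma^{2}}
```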
arXiv Detail & Related papers (2020-07-04T09:51:02Z)