Private Matrix Factorization with Public Item Features
- URL: http://arxiv.org/abs/2309.11516v1
- Date: Sun, 17 Sep 2023 11:13:52 GMT
- Title: Private Matrix Factorization with Public Item Features
- Authors: Mihaela Curmei, Walid Krichene, Li Zhang, Mukund Sundararajan
- Abstract summary: Training with Differential Privacy (DP) offers strong privacy guarantees, at the expense of loss in recommendation quality.
We show that incorporating public item features during training can help mitigate this loss in quality.
- Score: 14.547931725603888
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of training private recommendation models with access
to public item features. Training with Differential Privacy (DP) offers strong
privacy guarantees, at the expense of loss in recommendation quality. We show
that incorporating public item features during training can help mitigate this
loss in quality. We propose a general approach based on collective matrix
factorization (CMF) that works by simultaneously factorizing two matrices: the
user feedback matrix (representing sensitive data) and an item feature matrix
that encodes publicly available (non-sensitive) item information.
The method is conceptually simple, easy to tune, and highly scalable. It can
be applied to different types of public item data, including: (1) categorical
item features; (2) item-item similarities learned from public sources; and (3)
publicly available user feedback. Furthermore, these data modalities can be
collectively utilized to fully leverage public data.
Evaluating our method on a standard DP recommendation benchmark, we find that
using public item features significantly narrows the quality gap between
private models and their non-private counterparts. As privacy constraints
become more stringent, models rely more heavily on public side features for
recommendation. This results in a smooth transition from collaborative
filtering to item-based contextual recommendations.
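The collective matrix factorization idea described in the abstract can be illustrated with a minimal, non-private sketch. It assumes a squared-error objective, plain gradient descent, and a shared item embedding matrix `V` that appears in both factorizations. The function names (`cmf_fit`, `cmf_loss`), the weight `alpha`, and all hyperparameters are hypothetical choices for illustration. No differential-privacy mechanism (clipping, noise) is included, and this is not presented as the authors' training algorithm.

```python
import numpy as np


def cmf_loss(R, F, U, V, W, alpha=1.0, reg=0.1):
    """Joint objective: fit R ~ U V^T and F ~ V W^T with L2 regularization."""
    fit_r = 0.5 * np.sum((U @ V.T - R) ** 2)          # user feedback term
    fit_f = 0.5 * alpha * np.sum((V @ W.T - F) ** 2)  # item feature term
    l2 = 0.5 * reg * (np.sum(U ** 2) + np.sum(V ** 2) + np.sum(W ** 2))
    return fit_r + fit_f + l2


def cmf_fit(R, F, rank=4, alpha=1.0, reg=0.1, lr=0.01, steps=300, seed=0):
    """Jointly factorize R (users x items) and F (items x features).

    The item embeddings V are shared between both factorizations, which is
    what lets the item feature matrix inform the recommendation model.
    NOTE: illustrative, non-private sketch; alpha, reg, lr are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    n_feats = F.shape[1]
    U = 0.1 * rng.standard_normal((n_users, rank))  # user embeddings
    V = 0.1 * rng.standard_normal((n_items, rank))  # shared item embeddings
    W = 0.1 * rng.standard_normal((n_feats, rank))  # feature embeddings
    for _ in range(steps):
        err_r = U @ V.T - R  # residual on the feedback matrix
        err_f = V @ W.T - F  # residual on the item feature matrix
        grad_U = err_r @ V + reg * U
        grad_V = err_r.T @ U + alpha * (err_f @ W) + reg * V
        grad_W = alpha * (err_f.T @ V) + reg * W
        U -= lr * grad_U
        V -= lr * grad_V
        W -= lr * grad_W
    return U, V, W


# Toy data: 20 users, 15 items, 5 binary item features (all synthetic).
rng = np.random.default_rng(1)
R = (rng.random((20, 15)) < 0.3).astype(float)
F = (rng.random((15, 5)) < 0.4).astype(float)
U, V, W = cmf_fit(R, F)
scores = U @ V.T  # predicted user-item affinities
```

In this sketch, `alpha` controls how strongly the item embeddings are pulled toward explaining the feature matrix. Raising it makes the model lean more on item features, loosely mirroring the transition toward item-based recommendations that the abstract describes for stricter privacy settings.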
Related papers
- Machine Learning with Privacy for Protected Attributes [56.44253915927481]
We refine the definition of differential privacy (DP) to create a more general and flexible framework that we call feature differential privacy (FDP). Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary separation of protected and non-protected features. We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available.
arXiv Detail & Related papers (2025-06-24T17:53:28Z) - Leveraging Vertical Public-Private Split for Improved Synthetic Data Generation [9.819636361032256]
Differentially Private Synthetic Data Generation is a key enabler of private and secure data sharing.
Recent literature has explored scenarios where a small amount of public data is used to help enhance the quality of synthetic data.
We propose a novel framework that adapts horizontal public-assisted methods into the vertical setting.
arXiv Detail & Related papers (2025-04-15T08:59:03Z) - Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z) - Personalized Federated Collaborative Filtering: A Variational AutoEncoder Approach [49.63614966954833]
Federated Collaborative Filtering (FedCF) is an emerging field focused on developing a new recommendation framework that preserves privacy.
Existing FedCF methods typically combine distributed Collaborative Filtering (CF) algorithms with privacy-preserving mechanisms, and then preserve personalized information into a user embedding vector.
This paper proposes a novel personalized FedCF method by preserving users' personalized information into a latent variable and a neural model simultaneously.
arXiv Detail & Related papers (2024-08-16T05:49:14Z) - Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z) - Private Learning with Public Features [18.142859808011618]
We study a class of private learning problems in which the data is a join of private and public features.
We develop new algorithms that take advantage of this separation by only protecting certain sufficient statistics.
arXiv Detail & Related papers (2023-10-24T01:59:28Z) - Decentralized Matrix Factorization with Heterogeneous Differential Privacy [2.4743508801114444]
We propose a novel Heterogeneous Differentially Private Matrix Factorization algorithm (denoted as HDPMF) for an untrusted recommender.
Our framework uses a modified stretching mechanism with an innovative rescaling scheme to achieve a better trade-off between privacy and accuracy.
arXiv Detail & Related papers (2022-12-01T06:48:18Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - Privacy-preserving Non-negative Matrix Factorization with Outliers [2.84279467589473]
We focus on developing a Non-negative matrix factorization algorithm in the privacy-preserving framework.
We propose a novel privacy-preserving algorithm for non-negative matrix factorisation capable of operating on private data.
We show our proposed framework's performance on six real data sets.
arXiv Detail & Related papers (2022-11-02T19:42:18Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z) - Post-processing of Differentially Private Data: A Fairness Perspective [53.29035917495491]
This paper shows that post-processing causes disparate impacts on individuals or groups.
It analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions.
It proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics.
arXiv Detail & Related papers (2022-01-24T02:45:03Z) - The Stereotyping Problem in Collaboratively Filtered Recommender Systems [77.56225819389773]
We show that matrix factorization-based collaborative filtering algorithms induce a kind of stereotyping.
If preferences for a set of items are anti-correlated in the general user population, then those items may not be recommended together to a user.
We propose an alternative modelling fix, which is designed to capture the diverse multiple interests of each user.
arXiv Detail & Related papers (2021-06-23T18:37:47Z) - Privacy Threats Against Federated Matrix Factorization [14.876668437269817]
We study the privacy threats of the matrix factorization method in the federated learning framework.
This is the first study of privacy threats of the matrix factorization method in the federated learning framework.
arXiv Detail & Related papers (2020-07-03T09:58:52Z) - Federating Recommendations Using Differentially Private Prototypes [16.29544153550663]
We propose a new federated approach to learning global and local private models for recommendation without collecting raw data.
By requiring only two rounds of communication, we both reduce the communication costs and avoid the excessive privacy loss.
We show local adaptation of the global model allows our method to outperform centralized matrix-factorization-based recommender system models.
arXiv Detail & Related papers (2020-03-01T22:21:31Z)