Sample-Efficient Personalization: Modeling User Parameters as Low Rank
Plus Sparse Components
- URL: http://arxiv.org/abs/2210.03505v3
- Date: Wed, 6 Sep 2023 00:35:30 GMT
- Title: Sample-Efficient Personalization: Modeling User Parameters as Low Rank
Plus Sparse Components
- Authors: Soumyabrata Pal, Prateek Varshney, Prateek Jain, Abhradeep Guha
Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy and Gaurav Srivastava
- Abstract summary: Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems.
We propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse components.
We show that AMHT-LRS solves the problem efficiently with nearly optimal sample complexity.
- Score: 30.32486162748558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalization of machine learning (ML) predictions for individual
users/domains/enterprises is critical for practical recommendation systems.
Standard personalization approaches involve learning a user/domain specific
embedding that is fed into a fixed global model which can be limiting. On the
other hand, personalizing/fine-tuning the model itself for each user/domain --
a.k.a. meta-learning -- has high storage/infrastructure cost. Moreover, rigorous
theoretical studies of scalable personalization approaches have been very
limited. To address the above issues, we propose a novel meta-learning style
approach that models network weights as a sum of low-rank and sparse
components. This captures common information from multiple individuals/users
together in the low-rank part, while the sparse part captures user-specific
idiosyncrasies. We then study the framework in the linear setting, where the
problem reduces to that of estimating the sum of a rank-$r$ and a $k$-column
sparse matrix using a small number of linear measurements. We propose a
computationally efficient alternating minimization method with iterative hard
thresholding -- AMHT-LRS -- to learn the low-rank and sparse part.
Theoretically, for the realizable Gaussian data setting, we show that AMHT-LRS
solves the problem efficiently with nearly optimal sample complexity. Finally,
a significant challenge in personalization is ensuring privacy of each user's
sensitive data. We alleviate this problem by proposing a differentially private
variant of our method that is also equipped with strong generalization
guarantees.
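To make the linear setting above concrete, the sketch below illustrates the alternating-minimization-with-hard-thresholding idea in plain NumPy: user t's parameter vector is modeled as U w_t + b_t, where the shared d x r basis U is the low-rank part and b_t is a k-sparse user-specific correction. This is a minimal illustration under simplified assumptions, not the paper's exact AMHT-LRS procedure (which prescribes specific step sizes, thresholding schedules, and sample splits); all function and variable names here are ours.

```python
import numpy as np

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v; zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
    return out

def alt_min_low_rank_plus_sparse(X_list, y_list, r, k, n_iters=50, step=0.5):
    """Toy alternating minimization for y_t ~ X_t (U w_t + b_t) with a shared
    low-rank basis U (d x r) and a k-sparse per-user vector b_t.
    Illustrative only; not the authors' exact AMHT-LRS updates."""
    d, T = X_list[0].shape[1], len(X_list)
    rng = np.random.default_rng(0)
    U = np.linalg.qr(rng.standard_normal((d, r)))[0]   # shared basis, orthonormal columns
    b = [np.zeros(d) for _ in range(T)]                # per-user sparse parts
    W = [np.zeros(r) for _ in range(T)]                # per-user low-rank weights

    for _ in range(n_iters):
        # (1) Per-user least squares for w_t, holding U and b_t fixed.
        for t in range(T):
            W[t] = np.linalg.lstsq(X_list[t] @ U,
                                   y_list[t] - X_list[t] @ b[t], rcond=None)[0]
        # (2) Per-user sparse part: one gradient step, then hard thresholding.
        for t in range(T):
            resid = y_list[t] - X_list[t] @ (U @ W[t] + b[t])
            grad = -X_list[t].T @ resid / len(resid)
            b[t] = hard_threshold(b[t] - step * grad, k)
        # (3) Shared basis: least squares over vec(U), using the identity
        #     X_t U w_t = (w_t^T kron X_t) vec(U), stacked across all users.
        rows = [np.kron(W[t], X_list[t]) for t in range(T)]   # (n_t, r*d) blocks
        rhs = [y_list[t] - X_list[t] @ b[t] for t in range(T)]
        vec_u = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)[0]
        U = np.linalg.qr(vec_u.reshape(d, r, order="F"))[0]   # re-orthonormalize
    return U, W, b
```

Re-orthonormalizing U after each sweep only keeps the basis well-conditioned in this toy version; the paper's own update rules and analysis are what yield the nearly optimal sample complexity and the differentially private extension.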
Related papers
- Improved Diversity-Promoting Collaborative Metric Learning for Recommendation [127.08043409083687]
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems.
This paper focuses on a challenging scenario where a user has multiple categories of interests.
We propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML).
arXiv Detail & Related papers (2024-09-02T07:44:48Z) - MAP: Model Aggregation and Personalization in Federated Learning with Incomplete Classes [49.22075916259368]
In some real-world applications, data samples are usually distributed on local devices.
In this paper, we focus on a special kind of non-I.I.D. setting where clients own incomplete classes.
Our proposed algorithm named MAP could simultaneously achieve the aggregation and personalization goals in FL.
arXiv Detail & Related papers (2024-04-14T12:22:42Z) - Distributed Personalized Empirical Risk Minimization [19.087524494290676]
This paper advocates a new paradigm Personalized Empirical Risk Minimization (PERM) to facilitate learning from heterogeneous data sources.
We propose a distributed algorithm that replaces the standard model averaging with model shuffling to simultaneously optimize PERM objectives for all devices.
arXiv Detail & Related papers (2023-10-26T20:07:33Z) - Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z) - The Minority Matters: A Diversity-Promoting Collaborative Metric
Learning Algorithm [154.47590401735323]
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems.
This paper focuses on a challenging scenario where a user has multiple categories of interests.
We propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML).
arXiv Detail & Related papers (2022-09-30T08:02:18Z) - Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z) - Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
arXiv Detail & Related papers (2020-10-05T01:58:32Z) - Personalized Federated Learning: A Meta-Learning Approach [28.281166755509886]
In Federated Learning, we aim to train models across multiple computing units (users).
In this paper, we study a personalized variant of federated learning in which our goal is to find an initial shared model that current or new users can easily adapt to their local dataset by performing one or a few steps of gradient descent with respect to their own data.
arXiv Detail & Related papers (2020-02-19T01:08:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.