Pre-trained Recommender Systems: A Causal Debiasing Perspective
- URL: http://arxiv.org/abs/2310.19251v4
- Date: Mon, 8 Jan 2024 17:38:33 GMT
- Title: Pre-trained Recommender Systems: A Causal Debiasing Perspective
- Authors: Ziqian Lin, Hao Ding, Nghia Trong Hoang, Branislav Kveton, Anoop
Deoras, Hao Wang
- Abstract summary: We develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains.
Our empirical studies show that the proposed model could significantly improve the recommendation performance in zero- and few-shot learning settings.
- Score: 19.712997823535066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies on pre-trained vision/language models have demonstrated the
practical benefit of a new, promising solution-building paradigm in AI where
models can be pre-trained on broad data describing a generic task space and
then adapted successfully to solve a wide range of downstream tasks, even when
training data is severely limited (e.g., in zero- or few-shot learning
scenarios). Inspired by such progress, we investigate in this paper the
possibilities and challenges of adapting such a paradigm to the context of
recommender systems, which have been less investigated from the perspective of
pre-trained models. In particular, we propose to develop a generic recommender
that captures universal interaction patterns by training on generic user-item
interaction data extracted from different domains, which can then be rapidly
adapted to improve few-shot learning performance in unseen new domains (with
limited data).
However, unlike vision/language data, which exhibit strong conformity in the
semantic space, the universal patterns underlying recommendation data collected
across different domains (e.g., different countries or different e-commerce
platforms) are often occluded by both in-domain and cross-domain biases
implicitly imposed by cultural differences in their user and item bases, as
well as by their use of different e-commerce platforms. As shown in our
experiments, such heterogeneous biases in the data tend to hinder the
effectiveness of the pre-trained model. To address this challenge, we further
introduce and formalize a causal debiasing perspective, which is substantiated
via a hierarchical Bayesian deep learning model, named PreRec. Our empirical
studies on real-world data show that the proposed model could significantly
improve the recommendation performance in zero- and few-shot learning settings
under both cross-market and cross-platform scenarios.
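To make the debiasing idea concrete, here is a minimal sketch of one way to realize it: interaction scores are decomposed into a universal user-item affinity plus additive in-domain and cross-domain bias terms that are fitted during pre-training and simply omitted at zero-shot inference. All names and the additive decomposition are illustrative assumptions; PreRec itself is a hierarchical Bayesian deep model, not this simplified module.

```python
import torch
import torch.nn as nn

class DebiasedRecommender(nn.Module):
    """Toy decomposition: universal affinity + domain bias + platform bias."""

    def __init__(self, n_items, n_domains, n_platforms, dim=64):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, dim)         # universal item representation
        self.domain_bias = nn.Embedding(n_domains, 1)      # in-domain (e.g., market) bias
        self.platform_bias = nn.Embedding(n_platforms, 1)  # cross-domain platform bias

    def forward(self, user_vec, item_ids, domain_ids=None, platform_ids=None):
        # Universal affinity: dot product between user state and item embeddings.
        score = (user_vec * self.item_emb(item_ids)).sum(-1)
        # During pre-training, the bias terms absorb domain/platform effects.
        if domain_ids is not None:
            score = score + self.domain_bias(domain_ids).squeeze(-1)
        if platform_ids is not None:
            score = score + self.platform_bias(platform_ids).squeeze(-1)
        return score

# Zero-shot inference in an unseen domain: omit the bias terms so that ranking
# relies only on the debiased universal component.
model = DebiasedRecommender(n_items=1000, n_domains=5, n_platforms=2)
scores = model(torch.randn(1, 64), torch.arange(10))
```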
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
The proposed Meet-In-The-Middle method, MITA, introduces energy-based optimization to encourage mutual adaptation of the model and the data from opposing directions.
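As a rough illustration of such mutual adaptation (not the authors' exact procedure), one can nudge both the model parameters and the test input to lower a shared energy; the free energy -logsumexp(logits) used below is a common choice in energy-based test-time adaptation and is our assumption here.

```python
import torch
import torch.nn as nn

def free_energy(logits):
    # Free energy of a classifier's logits; lower means more "model-typical" data.
    return -torch.logsumexp(logits, dim=-1).mean()

def mutual_adapt(model: nn.Module, x: torch.Tensor, steps=3, lr=1e-3):
    x = x.detach().clone().requires_grad_(True)
    opt_model = torch.optim.SGD(model.parameters(), lr=lr)
    opt_data = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        energy = free_energy(model(x))
        opt_model.zero_grad(); opt_data.zero_grad()
        energy.backward()
        opt_model.step()  # adapt the model toward the data ...
        opt_data.step()   # ... and the data toward the model
    return model, x.detach()
```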
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
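As a loose sketch of trajectory-style regularization (FedPTR's actual projected-trajectory construction differs), each client below adds a proximal penalty pulling its local weights toward a point extrapolated from the recent global-model trajectory; the extrapolation rule and the weight `mu` are illustrative assumptions.

```python
import torch
import torch.nn as nn

def local_train(model, global_prev, global_curr, loader, mu=0.1, lr=0.01, epochs=1):
    # global_prev, global_curr: detached parameter lists from the last two rounds.
    # Extrapolate the global trajectory one step ahead as a regularization target.
    target = [2 * c - p for c, p in zip(global_curr, global_prev)]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            loss = loss_fn(model(x), y)
            prox = sum(((w - t) ** 2).sum()
                       for w, t in zip(model.parameters(), target))
            (loss + mu * prox).backward()
            opt.step(); opt.zero_grad()
    return [w.detach().clone() for w in model.parameters()]
```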
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- ReCoRe: Regularized Contrastive Representation Learning of World Model [21.29132219042405]
We present a world model that learns invariant features using contrastive unsupervised learning and an intervention-invariant regularizer.
Our method outperforms current state-of-the-art model-based and model-free RL methods and significantly improves on out-of-distribution point navigation tasks evaluated on the iGibson benchmark.
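A minimal sketch of combining a contrastive (InfoNCE) objective with an intervention-invariance penalty: features of an observation and of an "intervened" view (e.g., visually randomized but state-preserving) are pulled together. The MSE penalty and all names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau             # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, labels)

def recore_style_loss(encoder, obs, obs_aug, obs_intervened, lam=1.0):
    z, z_aug, z_int = encoder(obs), encoder(obs_aug), encoder(obs_intervened)
    contrastive = info_nce(z, z_aug)       # standard contrastive term
    invariance = F.mse_loss(z, z_int)      # intervention-invariant regularizer
    return contrastive + lam * invariance
```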
arXiv Detail & Related papers (2023-12-14T15:53:07Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
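One illustrative way to transfer complementary knowledge without overwriting what the student already does well is confidence-gated distillation, sketched below; the gating rule is our assumption, not the paper's exact criterion.

```python
import torch.nn.functional as F

def selective_distill_loss(student_logits, teacher_logits, tau=2.0):
    # Transfer only on samples where the teacher looks more reliable.
    s_conf = student_logits.softmax(-1).max(-1).values
    t_conf = teacher_logits.softmax(-1).max(-1).values
    mask = (t_conf > s_conf).float()
    kl = F.kl_div(F.log_softmax(student_logits / tau, -1),
                  F.softmax(teacher_logits / tau, -1),
                  reduction="none").sum(-1)   # per-sample KL divergence
    return (mask * kl).mean() * tau ** 2
```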
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed test sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
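Consistency regularization of this kind is commonly implemented by encouraging agreement between predictions on a weakly and a strongly augmented view of the same unlabelled target image; the soft cross-entropy form below is a typical choice and an assumption on our part.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_weak, x_strong):
    with torch.no_grad():
        p_weak = model(x_weak).softmax(-1)          # pseudo-target from the weak view
    log_p_strong = F.log_softmax(model(x_strong), -1)
    return -(p_weak * log_p_strong).sum(-1).mean()  # soft cross-entropy
```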
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- An Information-Theoretic Approach for Estimating Scenario Generalization in Crowd Motion Prediction [27.10815774845461]
We propose a novel scoring method, which characterizes generalization of models trained on source crowd scenarios and applied to target crowd scenarios.
The Interaction component aims to characterize the difficulty of scenario domains, while the diversity of a scenario domain is captured in the Diversity score.
Our experimental results validate the efficacy of the proposed method on several simulated and real-world (source,target) generalization tasks.
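As a rough illustration of the two ingredients named above, the sketch below proxies Diversity by the Shannon entropy of a discretized feature distribution and Interaction by the fraction of close agent pairs; both estimators are our simplifications, not the paper's.

```python
import numpy as np

def diversity_score(features, bins=10):
    # features: (n_samples, n_dims); entropy of the discretized distribution.
    hist, _ = np.histogramdd(features, bins=bins)
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def interaction_score(positions, radius=2.0):
    # positions: (n_agents, 2); fraction of agent pairs within `radius`.
    d = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    pairs = d[np.triu_indices(len(positions), k=1)]
    return float((pairs < radius).mean())
```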
arXiv Detail & Related papers (2022-11-02T01:39:30Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, we are given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
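An illustrative instance-wise ensemble under these constraints: weight each expert per test point by how plausible that point is under a summary of the expert's training data. The Gaussian summary below is our assumption, not the paper's full method.

```python
import numpy as np

def instance_wise_ensemble(x, experts, train_means, train_covs):
    # experts: callables; train_means/train_covs: per-expert training summaries.
    weights = []
    for mu, cov in zip(train_means, train_covs):
        diff = x - mu
        mahal = diff @ np.linalg.inv(cov) @ diff
        weights.append(np.exp(-0.5 * mahal))   # unnormalized Gaussian density
    w = np.asarray(weights) + 1e-12
    w /= w.sum()
    preds = np.array([f(x) for f in experts])
    return float((w * preds).sum())
```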
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
Existing NLP datasets contain various biases that models can easily exploit to achieve high performance on the corresponding evaluation sets.
We propose a general framework for improving the performance on both in-domain and out-of-domain datasets by concurrent modeling of multiple biases in the training data.
We extensively evaluate our framework on extractive question answering with training data from various domains with multiple biases of different strengths.
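One common way to model several biases concurrently is a product-of-experts loss, sketched below, in which the main model is trained jointly with frozen bias-only models so that it cannot succeed by relying on any modeled bias; the combination rule is a standard choice stated here as an assumption, not necessarily the paper's exact formulation.

```python
import torch.nn.functional as F

def multi_bias_poe_loss(main_logits, bias_logits_list, labels):
    # Combine log-probabilities of the main model and all (frozen) bias models.
    combined = F.log_softmax(main_logits, -1)
    for bias_logits in bias_logits_list:
        combined = combined + F.log_softmax(bias_logits, -1).detach()
    # Cross-entropy on the combined scores trains only the main model.
    return F.cross_entropy(combined, labels)
```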
arXiv Detail & Related papers (2020-10-07T11:18:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.