Debiasing the Cloze Task in Sequential Recommendation with Bidirectional
Transformers
- URL: http://arxiv.org/abs/2301.09210v1
- Date: Sun, 22 Jan 2023 21:44:25 GMT
- Title: Debiasing the Cloze Task in Sequential Recommendation with Bidirectional
Transformers
- Authors: Khalil Damak, Sami Khenissi, Olfa Nasraoui
- Abstract summary: We argue that Inverse Propensity Scoring (IPS) does not extend to sequential recommendation because it fails to account for the temporal nature of the problem.
We then propose a novel propensity scoring mechanism, which can theoretically debias the Cloze task in sequential recommendation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bidirectional Transformer architectures are state-of-the-art sequential
recommendation models that use a bi-directional representation capacity based
on the Cloze task, a.k.a. Masked Language Modeling. The latter aims to predict
randomly masked items within the sequence. Because they assume that the true
interacted item is the most relevant one, an exposure bias results, where
non-interacted items with low exposure propensities are assumed to be
irrelevant. The most common approach to mitigating exposure bias in
recommendation has been Inverse Propensity Scoring (IPS), which consists of
down-weighting the interacted predictions in the loss function in proportion to
their propensities of exposure, yielding a theoretically unbiased learning. In
this work, we argue and prove that IPS does not extend to sequential
recommendation because it fails to account for the temporal nature of the
problem. We then propose a novel propensity scoring mechanism, which can
theoretically debias the Cloze task in sequential recommendation. Finally we
empirically demonstrate the debiasing capabilities of our proposed approach and
its robustness to the severity of exposure bias.
Related papers
- Mitigating Exposure Bias in Online Learning to Rank Recommendation: A Novel Reward Model for Cascading Bandits [23.15042648884445]
We study exposure bias in a class of well-known contextual bandit algorithms known as Linear Cascading Bandits.
We propose an Exposure-Aware reward model that updates the model parameters based on two factors: 1) implicit user feedback and 2) the position of the item in the recommendation list.
arXiv Detail & Related papers (2024-08-08T09:35:01Z) - Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases [76.9127853906115]
Bridging the gap between diffusion models and human preferences is crucial for their integration into practical generative.
We propose Temporal Diffusion Policy Optimization with critic active neuron Reset (TDPO-R), a policy gradient algorithm that exploits the temporal inductive bias of diffusion models.
Empirical results demonstrate the superior efficacy of our methods in mitigating reward overoptimization.
arXiv Detail & Related papers (2024-02-13T15:55:41Z) - A Dense Reward View on Aligning Text-to-Image Diffusion with Preference [54.43177605637759]
We propose a tractable alignment objective that emphasizes the initial steps of the T2I reverse chain.
In experiments on single and multiple prompt generation, our method is competitive with strong relevant baselines.
arXiv Detail & Related papers (2024-02-13T07:37:24Z) - Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z) - Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation [59.500347564280204]
We propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework.
AUR consists of a new uncertainty estimator along with a normal recommender model.
As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty.
arXiv Detail & Related papers (2022-09-22T04:32:51Z) - Ensembling over Classifiers: a Bias-Variance Perspective [13.006468721874372]
We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers.
We show that conditional estimates necessarily incur an irreducible error.
Empirically, standard ensembling reducesthe bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction.
arXiv Detail & Related papers (2022-06-21T17:46:35Z) - Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR)
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove in theory that this way offsets the influence of user/item propensity on the learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z) - Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel textbfSTOchastic textbfSelf-textbfAttention(STOSA) to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences.
arXiv Detail & Related papers (2022-01-16T12:38:45Z) - Debiased Explainable Pairwise Ranking from Implicit Feedback [0.3867363075280543]
We focus on the state of the art pairwise ranking model, Bayesian Personalized Ranking (BPR)
BPR is a black box model that does not explain its outputs, thus limiting the user's trust in the recommendations.
We propose a novel explainable loss function and a corresponding Matrix Factorization-based model that generates recommendations along with item-based explanations.
arXiv Detail & Related papers (2021-07-30T17:19:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.