CDR: Conservative Doubly Robust Learning for Debiased Recommendation
- URL: http://arxiv.org/abs/2308.08461v2
- Date: Thu, 17 Aug 2023 05:30:03 GMT
- Title: CDR: Conservative Doubly Robust Learning for Debiased Recommendation
- Authors: ZiJie Song, JiaWei Chen, Sheng Zhou, QiHao Shi, Yan Feng, Chun Chen, and Can Wang
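- Abstract summary: Doubly Robust Learning (DR) has gained significant attention due to its remarkable performance and robust properties.
However, existing DR methods are severely impacted by Poisonous Imputation, where the imputation deviates significantly from the truth and becomes counterproductive.
To address this issue, this work proposes a Conservative Doubly Robust strategy (CDR), which filters imputations by scrutinizing their mean and variance.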
- Score: 23.90593406172408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recommendation systems (RS), user behavior data is observational rather
than experimental, resulting in widespread bias in the data. Consequently,
tackling bias has emerged as a major challenge in the field of recommendation
systems. Recently, Doubly Robust Learning (DR) has gained significant attention
due to its remarkable performance and robust properties. However, our
experimental findings indicate that existing DR methods are severely impacted
by the presence of so-called Poisonous Imputation, where the imputation
significantly deviates from the truth and becomes counterproductive.
To address this issue, this work proposes a Conservative Doubly Robust strategy
(CDR), which filters imputations by scrutinizing their mean and variance.
Theoretical analyses show that CDR offers reduced variance and improved tail
bounds. In addition, our experimental investigations illustrate that CDR
significantly enhances performance and can indeed reduce the frequency of
poisonous imputation.
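The filtering idea in the abstract can be illustrated with a minimal sketch. It assumes the standard doubly robust estimate of the mean prediction error, `e_hat + o * (e - e_hat) / p`, averaged over all user-item pairs. The abstract does not state the filtering rule, so the rule below (keep an imputation only when its standard deviation is at most `max_cv` times its mean) is a hypothetical stand-in, and the names `dr_loss`, `conservative_dr_loss`, and `max_cv` are illustrative rather than taken from the paper.

```python
from typing import Sequence


def dr_loss(errors: Sequence[float], observed: Sequence[int],
            propensities: Sequence[float], imputed: Sequence[float]) -> float:
    """Standard doubly robust estimate of the mean prediction error.

    errors[i] is only used where observed[i] == 1; for unobserved pairs
    any placeholder value works because it is multiplied by zero.
    """
    total = 0.0
    for e, o, p, e_hat in zip(errors, observed, propensities, imputed):
        total += e_hat + o * (e - e_hat) / p
    return total / len(imputed)


def conservative_dr_loss(errors: Sequence[float], observed: Sequence[int],
                         propensities: Sequence[float],
                         imputed_mean: Sequence[float],
                         imputed_var: Sequence[float],
                         max_cv: float = 1.0) -> float:
    """DR estimate that drops imputations judged unreliable.

    ASSUMPTION: the keep/drop rule is hypothetical. An imputation is kept
    only if its mean is positive and its standard deviation is at most
    max_cv times that mean. A dropped imputation is replaced by 0, so the
    corresponding term falls back to plain inverse propensity scoring.
    """
    filtered = []
    for mean, var in zip(imputed_mean, imputed_var):
        keep = mean > 0 and var ** 0.5 <= max_cv * mean
        filtered.append(mean if keep else 0.0)
    return dr_loss(errors, observed, propensities, filtered)


# Toy data: pair 0 is observed, pair 1 is not (its error is a placeholder).
errors = [0.2, 0.0]
observed = [1, 0]
propensities = [0.5, 0.5]
imputed_mean = [0.3, 0.4]
imputed_var = [0.01, 1.0]  # the second imputation is highly uncertain

plain = dr_loss(errors, observed, propensities, imputed_mean)
conservative = conservative_dr_loss(errors, observed, propensities,
                                    imputed_mean, imputed_var)
print(plain, conservative)
```

In this toy example the uncertain imputation for the unobserved pair is discarded, so it no longer contributes to the estimate, while the confident imputation for the observed pair is still used.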
Related papers
- Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework [77.45983464131977]
We focus on how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications.
Our research identifies two critical latent factors affecting RAG's confidence in its predictions.
We develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers.
arXiv Detail & Related papers (2024-09-24T14:52:14Z)
- Generalized Encouragement-Based Instrumental Variables for Counterfactual Regression [33.869488994843394]
This paper introduces novel theories and algorithms for identifying the Conditional Average Treatment Effect (CATE) using variations in encouragement.
By leveraging both observational and encouragement data, we propose a generalized IV estimator, named Encouragement-based Counterfactual Regression (EnCounteR) to effectively estimate the causal effects.
arXiv Detail & Related papers (2024-08-10T04:21:04Z)
- How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval [80.54532535622988]
We show that a generalizable dense retriever can be trained to achieve high accuracy in both supervised and zero-shot retrieval.
DRAGON, our dense retriever trained with diverse augmentation, is the first BERT-base-sized DR to achieve state-of-the-art effectiveness in both supervised and zero-shot evaluations.
arXiv Detail & Related papers (2023-02-15T03:53:26Z)
- A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction [23.340584290411208]
Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues.
Currently, doubly robust (DR) learning approaches achieve state-of-the-art performance for debiasing CVR prediction.
We propose two new DR methods, namely DR-BIAS and DR-MSE, which control the bias of DR loss and balance the bias and variance flexibly.
arXiv Detail & Related papers (2022-11-12T15:09:23Z)
- StableDR: Stabilized Doubly Robust Learning for Recommendation on Data Missing Not at Random [16.700598755439685]
We show that doubly robust (DR) methods are unstable and have unbounded bias, variance, and generalization bounds under extremely small propensities.
We propose a stabilized doubly robust (StableDR) learning approach with a weaker reliance on extrapolation.
In addition, we propose a novel learning approach for StableDR that updates the imputation, propensity, and prediction models cyclically.
arXiv Detail & Related papers (2022-05-10T07:04:53Z)
- Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove theoretically that this approach offsets the influence of user/item propensity on learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z)
- Doubly Robust Collaborative Targeted Learning for Recommendation on Data Missing Not at Random [6.563595953273317]
In recommender systems, the feedback data received is always missing not at random (MNAR).
We propose DR-TMLE, which effectively captures the merits of both error imputation-based (EIB) and doubly robust (DR) methods.
We also propose a novel RCT-free collaborative targeted learning algorithm for DR-TMLE, called DR-TMLE-TL.
arXiv Detail & Related papers (2022-03-19T06:48:50Z)
- Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning [59.02006924867438]
Off-policy evaluation and learning (OPE/L) use offline observational data to make better decisions.
Recent work proposed distributionally robust OPE/L (DROPE/L) to remedy this, but the proposal relies on inverse-propensity weighting.
We propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets.
arXiv Detail & Related papers (2022-02-19T20:00:44Z)
- Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z)
- Causal Inference Q-Network: Toward Resilient Reinforcement Learning [57.96312207429202]
We consider a resilient DRL framework with observational interferences.
Under this framework, we propose a causal inference-based DRL algorithm called causal inference Q-network (CIQ).
Our experimental results show that the proposed CIQ method could achieve higher performance and more resilience against observational interferences.
arXiv Detail & Related papers (2021-02-18T23:50:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.