ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click
Conversion Rate Estimation
- URL: http://arxiv.org/abs/2204.05125v1
- Date: Sun, 3 Apr 2022 08:12:27 GMT
- Title: ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click
Conversion Rate Estimation
- Authors: Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen,
Chao Yu, Ruopeng Li, Wei Chu
- Abstract summary: Methods in the Entire Space Multi-task Model (ESMM) family leverage the sequential pattern of user actions to address the data sparsity issue.
ESMM suffers from Inherent Estimation Bias (IEB) and Potential Independence Priority (PIP) issues.
We devise a principled approach named Entire Space Counterfactual Multi-task Modelling (ESCM$^2$), which employs a counterfactual risk minimizer as a regularizer.
- Score: 14.346868328637115
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate estimation of post-click conversion rate is critical for building
recommender systems, which has long been confronted with sample selection bias
and data sparsity issues. Methods in the Entire Space Multi-task Model (ESMM)
family leverage the sequential pattern of user actions, i.e.,
$impression \rightarrow click \rightarrow conversion$, to address the data sparsity
issue. However, they still fail to ensure the unbiasedness of CVR estimates. In
this paper, we theoretically demonstrate that ESMM suffers from the following
two problems: (1) Inherent Estimation Bias (IEB), where the estimated CVR of
ESMM is inherently higher than the ground truth; (2) Potential Independence
Priority (PIP) for CTCVR estimation, where there is a risk that the ESMM
overlooks the causality from click to conversion. To this end, we devise a
principled approach named Entire Space Counterfactual Multi-task Modelling
(ESCM$^2$), which employs a counterfactual risk minimizer as a regularizer in
ESMM to address both IEB and PIP issues simultaneously. Extensive experiments
on offline datasets and online environments demonstrate that our proposed
ESCM$^2$ can largely mitigate the inherent IEB and PIP issues and achieve
better performance than baseline models.
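
The idea described in the abstract can be illustrated with a minimal sketch. It assumes the usual ESMM decomposition, in which the click-through-and-conversion rate (CTCVR) over all impressions is modelled as the product of predicted CTR and predicted CVR, and it uses an inverse-propensity-scored (IPS) CVR loss as one possible counterfactual risk. The abstract does not specify the exact form of the regularizer, so the IPS choice, the function names, and the hyperparameters `lam` and `min_propensity` are assumptions made for illustration only.

```python
import math


def bce(label, prob, eps=1e-7):
    """Binary cross-entropy for one example, with clipping for stability."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def ips_cvr_risk(clicks, conversions, p_ctr, p_cvr, min_propensity=0.05):
    """IPS-weighted CVR loss (assumed form of the counterfactual risk).

    Only clicked impressions contribute, each reweighted by 1 / pCTR so that
    the average approximates the CVR loss over the entire impression space.
    """
    n = len(clicks)
    total = 0.0
    for o, r, c, v in zip(clicks, conversions, p_ctr, p_cvr):
        if o == 1:
            # Clip the propensity to avoid exploding weights (illustrative choice).
            total += bce(r, v) / max(c, min_propensity)
    return total / n


def escm2_style_loss(clicks, conversions, p_ctr, p_cvr, lam=0.1):
    """ESMM-style CTR and CTCVR losses plus lam times the IPS regularizer."""
    n = len(clicks)
    ctr_loss = sum(bce(o, c) for o, c in zip(clicks, p_ctr)) / n
    # CTCVR label is click AND conversion; prediction is pCTR * pCVR.
    ctcvr_loss = sum(
        bce(o * r, c * v)
        for o, r, c, v in zip(clicks, conversions, p_ctr, p_cvr)
    ) / n
    reg = ips_cvr_risk(clicks, conversions, p_ctr, p_cvr)
    return ctr_loss + ctcvr_loss + lam * reg
```

Setting `lam=0` in this sketch recovers a plain ESMM-style objective, which makes it easy to compare the two objectives on the same predictions.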
Related papers
- Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs [63.47351876442425]
We study episodic linear mixture MDPs with the unknown transition and adversarial rewards under full-information feedback.
We propose a novel algorithm that combines the benefits of two popular methods: occupancy-measure-based and policy-based.
Our algorithm enjoys an $\widetilde{\mathcal{O}}(d \sqrt{H^3 K} + \sqrt{HK(H + \bar{P}_K)})$ dynamic regret, where $d$ is the feature dimension.
arXiv Detail & Related papers (2024-11-05T13:55:52Z)
- Breaking Boundaries: Balancing Performance and Robustness in Deep Wireless Traffic Forecasting [11.029214459961114]
Balancing the trade-off between accuracy and robustness is a long-standing challenge in time series forecasting.
We study a wide array of perturbation scenarios and propose novel defense mechanisms against adversarial attacks using real-world telecom data.
arXiv Detail & Related papers (2023-11-16T11:10:38Z)
- DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning [52.83811558753284]
Decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network.
Existing DFL methods still suffer from two major challenges: local inconsistency and local overfitting.
arXiv Detail & Related papers (2023-08-16T11:22:36Z)
- ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint [38.561040267729105]
We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint (ESMC).
We handle "exposure_click_in-shop action" and "in-shop action_purchase" separately in the light of characteristics of in-shop action.
arXiv Detail & Related papers (2023-07-18T12:25:40Z)
- Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage [15.858892479232656]
We study robust offline reinforcement learning (robust offline RL).
We propose a generic algorithm framework called Doubly Pessimistic Model-based Policy Optimization ($P^2MPO$).
We show that $P^2MPO$ enjoys a $\tilde{\mathcal{O}}(n^{-1/2})$ convergence rate, where $n$ is the dataset size.
arXiv Detail & Related papers (2023-05-16T17:58:05Z)
- Entire Space Counterfactual Learning: Tuning, Analytical Properties and Industrial Applications [5.9460659646670875]
Post-click conversion rate (CVR) estimation has long been plagued by sample selection bias and data sparsity issues.
This paper proposes a principled method named entire space counterfactual multi-task model (ESCM$^2$), which employs a counterfactual risk minimizer to handle both IEB and PIP issues at once.
arXiv Detail & Related papers (2022-10-20T06:19:50Z)
- Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms [53.89752069984382]
We study combinatorial semi-bandits (CMAB) and focus on reducing the dependency on the batch size $K$ in the regret bound.
First, for the setting of CMAB with probabilistically triggered arms (CMAB-T), we propose a BCUCB-T algorithm with variance-aware confidence intervals.
Second, for the setting of non-triggering CMAB with independent arms, we propose a SESCB algorithm which leverages the non-triggering version of the TPVM condition.
arXiv Detail & Related papers (2022-08-31T13:09:39Z)
- Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI [67.13999010060057]
We propose a novel approach to integrate LF-MMI criterion into E2E ASR frameworks in both training and decoding stages.
Experiments suggest that the introduction of the LF-MMI criterion consistently leads to significant performance improvements.
arXiv Detail & Related papers (2021-12-05T07:30:17Z)
- Predicting Status of Pre and Post M&A Deals Using Machine Learning and Deep Learning Techniques [0.0]
Risk arbitrage or merger arbitrage is an investment strategy that speculates on the success of M&A deals.
Prediction of the deal status in advance is of great importance for risk arbitrageurs.
We present an ML and DL based methodology for the takeover success prediction problem.
arXiv Detail & Related papers (2021-08-05T21:26:45Z)
- Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
- Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments [55.24895403089543]
Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments.
We present a new algorithm based on performing iterative feature matching that is guaranteed with high probability to yield a predictor that generalizes after seeing only $O(\log d_s)$ environments.
arXiv Detail & Related papers (2021-06-18T04:39:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.