Related papers: ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation

ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation

URL: http://arxiv.org/abs/2204.05125v1
Date: Sun, 3 Apr 2022 08:12:27 GMT
Title: ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation
Authors: Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen, Chao Yu, Ruopeng Li, Wei Chu
Abstract summary: Methods in Entire Space Multi-task Model (ESMM) family leverage sequential pattern of user actions to address data sparsity issue. ESMM suffers from Inherent Estimation Bias (IEB) and Potential Independence Priority (PIP) issues. We devise a principled approach named Entire Space Counterfactual Multi-task Modelling (ESCM$2$), which employs a counterfactual risk miminizer as a regularizer.
Score: 14.346868328637115
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Accurate estimation of post-click conversion rate is critical for building recommender systems, which has long been confronted with sample selection bias and data sparsity issues. Methods in the Entire Space Multi-task Model (ESMM) family leverage the sequential pattern of user actions, i.e. $impression\rightarrow click \rightarrow conversion$ to address data sparsity issue. However, they still fail to ensure the unbiasedness of CVR estimates. In this paper, we theoretically demonstrate that ESMM suffers from the following two problems: (1) Inherent Estimation Bias (IEB), where the estimated CVR of ESMM is inherently higher than the ground truth; (2) Potential Independence Priority (PIP) for CTCVR estimation, where there is a risk that the ESMM overlooks the causality from click to conversion. To this end, we devise a principled approach named Entire Space Counterfactual Multi-task Modelling (ESCM$^2$), which employs a counterfactual risk miminizer as a regularizer in ESMM to address both IEB and PIP issues simultaneously. Extensive experiments on offline datasets and online environments demonstrate that our proposed ESCM$^2$ can largely mitigate the inherent IEB and PIP issues and achieve better performance than baseline models.

Related papers

Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness [61.87055159919641]
Multi-modal semantic segmentation (MMSS) addresses the limitations of single-modality data by integrating complementary information across modalities. Despite notable progress, a significant gap persists between research and real-world deployment due to variability and uncertainty in multi-modal data quality. We introduce a robustness benchmark that evaluates MMSS models under three scenarios: Entire-Missing Modality (EMM), Random-Missing Modality (RMM), and Noisy Modality (NM)
arXiv Detail & Related papers (2025-03-24T08:46:52Z)
SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation [75.56845750400116]
Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems. We develop SureMap that has high estimation accuracy for both multi-task and single-task disaggregated evaluations of blackbox models. Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior together with cross-validation-free tuning via Stein's unbiased risk estimate (SURE)
arXiv Detail & Related papers (2024-11-14T17:53:35Z)
Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs [63.47351876442425]
We study episodic linear mixture MDPs with the unknown transition and adversarial rewards under full-information feedback. We propose a novel algorithm that combines the benefits of two popular methods: occupancy-measure-based and policy-based. Our algorithm enjoys an $widetildemathcalO(d sqrtH3 K + sqrtHK(H + barP_K$)$ dynamic regret, where $d$ is the feature dimension.
arXiv Detail & Related papers (2024-11-05T13:55:52Z)
Pairwise Ranking Loss for Multi-Task Learning in Recommender Systems [8.824514065551865]
In online advertising systems, tasks like Click-Through Rate (CTR) and Conversion Rate (CVR) are often treated as MTL problems concurrently. In this study, exposure labels corresponding to conversions are regarded as definitive indicators. A novel task-specific loss is introduced by calculating a textbfpairtextbfwise textbfranking (PWiseR) loss between model predictions.
arXiv Detail & Related papers (2024-06-04T09:52:41Z)
DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning [52.83811558753284]
Decentralized learning (DFL) discards the central server and establishes a decentralized communication network. Existing DFL methods still suffer from two major challenges: local inconsistency and local overfitting.
arXiv Detail & Related papers (2023-08-16T11:22:36Z)
ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint [38.561040267729105]
We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Constraint Experiments. We handle "exposure_click_in-shop action" and "in-shop action_purchase" separately in the light of characteristics of in-shop action.
arXiv Detail & Related papers (2023-07-18T12:25:40Z)
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage [15.858892479232656]
We study robust offline reinforcement learning (robust offline RL) We propose a generic algorithm framework called Doubly Pessimistic Model-based Policy Optimization ($P2MPO$) We show that $P2MPO$ enjoys a $tildemathcalO(n-1/2)$ convergence rate, where $n$ is the dataset size.
arXiv Detail & Related papers (2023-05-16T17:58:05Z)
Entire Space Counterfactual Learning: Tuning, Analytical Properties and Industrial Applications [5.9460659646670875]
Post-click conversion rate (CVR) estimation has long been plagued by sample selection bias and data sparsity issues. This paper proposes a principled method named entire space counterfactual multi-task model (ESCM$2$), which employs a counterfactual risk minimizer to handle both IEB and PIP issues at once.
arXiv Detail & Related papers (2022-10-20T06:19:50Z)
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI [67.13999010060057]
We propose a novel approach to integrate LF-MMI criterion into E2E ASR frameworks in both training and decoding stages. Experiments suggest that the introduction of the LF-MMI criterion consistently leads to significant performance improvements.
arXiv Detail & Related papers (2021-12-05T07:30:17Z)
Predicting Status of Pre and Post M&A Deals Using Machine Learning and Deep Learning Techniques [0.0]
Risk arbitrage or merger arbitrage is an investment strategy that speculates on the success of M&A deals. Prediction of the deal status in advance is of great importance for risk arbitrageurs. We present an ML and DL based methodology for takeover success prediction problem.
arXiv Detail & Related papers (2021-08-05T21:26:45Z)
Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes. We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks. Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments [55.24895403089543]
Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments. We present a new algorithm based on performing iterative feature matching that is guaranteed with high probability to yield a predictor that generalizes after seeing only $O(logd_s)$ environments.
arXiv Detail & Related papers (2021-06-18T04:39:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.