TSCAN: Context-Aware Uplift Modeling via Two-Stage Training for Online Merchant Business Diagnosis
- URL: http://arxiv.org/abs/2504.18881v1
- Date: Sat, 26 Apr 2025 10:00:16 GMT
- Title: TSCAN: Context-Aware Uplift Modeling via Two-Stage Training for Online Merchant Business Diagnosis
- Authors: Hangtao Zhang, Zhe Li, Kairui Zhang
- Abstract summary: We propose a Context-Aware uplift model based on a Two-Stage training approach (TSCAN). In the first stage, we train an uplift model, called CAN-U, which includes the treatment regularizations of IPM and propensity score prediction. In the second stage, we train a model named CAN-D, which utilizes an isotonic output layer to directly model uplift effects.
- Score: 2.8438369256032416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A primary challenge in Individual Treatment Effect (ITE) estimation is sample selection bias. Traditional approaches utilize treatment regularization techniques such as the Integral Probability Metrics (IPM), re-weighting, and propensity score modeling to mitigate this bias. However, these regularizations may introduce undesirable information loss and limit the performance of the model. Furthermore, treatment effects vary across different external contexts, and the existing methods are insufficient in fully interacting with and utilizing these contextual features. To address these issues, we propose a Context-Aware uplift model based on the Two-Stage training approach (TSCAN), comprising CAN-U and CAN-D sub-models. In the first stage, we train an uplift model, called CAN-U, which includes the treatment regularizations of IPM and propensity score prediction, to generate a complete dataset with counterfactual uplift labels. In the second stage, we train a model named CAN-D, which utilizes an isotonic output layer to directly model uplift effects, thereby eliminating the reliance on the regularization components. CAN-D adaptively corrects the errors estimated by CAN-U through reinforcing the factual samples, while avoiding the negative impacts associated with the aforementioned regularizations. Additionally, we introduce a Context-Aware Attention Layer throughout the two-stage process to manage the interactions between treatment, merchant, and contextual features, thereby modeling the varying treatment effect in different contexts. We conduct extensive experiments on two real-world datasets to validate the effectiveness of TSCAN. Ultimately, the deployment of our model for real-world merchant diagnosis on one of China's largest online food ordering platforms validates its practical utility and impact.
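The two-stage recipe in the abstract can be sketched end-to-end on synthetic data. This is a minimal stand-in, not the paper's architecture: simple least-squares models replace the CAN-U and CAN-D networks, and the Context-Aware Attention Layer, IPM regularizer, propensity head, and isotonic output layer are all omitted. The data-generating process and all variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observational data: features x, binary treatment t, outcome y.
n, d = 500, 5
x = rng.normal(size=(n, d))
true_uplift = 0.5 * x[:, 0]          # treatment effect varies with context
w_true = rng.normal(size=d)
# Biased (non-random) treatment assignment -> sample selection bias.
t = (rng.random(n) < 1 / (1 + np.exp(-x[:, 1]))).astype(float)
y = x @ w_true + t * true_uplift + 0.1 * rng.normal(size=n)

def fit_linear(xa, ya):
    """Least-squares outcome model with a bias term."""
    xa1 = np.column_stack([xa, np.ones(len(xa))])
    w, *_ = np.linalg.lstsq(xa1, ya, rcond=None)
    return lambda xq: np.column_stack([xq, np.ones(len(xq))]) @ w

# Stage 1 (stand-in for CAN-U): fit per-arm outcome models, then impute
# counterfactual outcomes to build a "complete" dataset with uplift labels.
mu0 = fit_linear(x[t == 0], y[t == 0])
mu1 = fit_linear(x[t == 1], y[t == 1])
uplift_label = mu1(x) - mu0(x)       # imputed uplift for every sample

# Stage 2 (stand-in for CAN-D): regress uplift directly on the imputed
# labels, with no treatment regularizer in this second model.
uplift_model = fit_linear(x, uplift_label)
pred = uplift_model(x)

corr = np.corrcoef(pred, true_uplift)[0, 1]
print(round(float(corr), 2))
```

The point of the sketch is the data flow, not the models: stage 1 produces counterfactual uplift labels for every sample, so stage 2 can learn uplift as a plain supervised target.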
Related papers
- Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning [93.58897637077001]
This paper tries to learn and understand underlying semantic variations from distracting videos via offline-to-online latent distillation and flexible disentanglement constraints. We pretrain the action-free video prediction model offline with disentanglement regularization to extract semantic knowledge from distracting videos. For finetuning in the online environment, we exploit the knowledge from the pretrained model and introduce a disentanglement constraint to the world model.
arXiv Detail & Related papers (2025-03-11T13:50:22Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Towards Clinician-Preferred Segmentation: Leveraging Human-in-the-Loop for Test Time Adaptation in Medical Image Segmentation [10.65123164779962]
Deep learning-based medical image segmentation models often face performance degradation when deployed across various medical centers.
We propose a novel Human-in-the-loop TTA framework that capitalizes on the largely overlooked potential of clinician-corrected predictions.
Our framework conceives a divergence loss, designed specifically to diminish the prediction divergence instigated by domain disparities.
arXiv Detail & Related papers (2024-05-14T02:02:15Z)
- Estimation of individual causal effects in network setup for multiple treatments [4.53340898566495]
We study the problem of estimation of Individual Treatment Effects (ITE) in the context of multiple treatments and observational data.
We employ Graph Convolutional Networks (GCN) to learn a shared representation of the confounders.
Our approach utilizes separate neural networks to infer potential outcomes for each treatment.
arXiv Detail & Related papers (2023-12-18T06:07:45Z)
- Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning [8.501945512734268]
We propose the FCCL (Fine- and Coarse- Granularity Contrastive Learning) approach for E2E-ST.
A key ingredient of our approach is applying contrastive learning at both sentence- and frame-level to give the comprehensive guide for extracting speech representations containing rich semantic information.
Experiments on the MuST-C benchmark show that our proposed approach significantly outperforms the state-of-the-art E2E-ST baselines on all eight language pairs.
arXiv Detail & Related papers (2023-04-20T13:41:56Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Relational Action Bases: Formalization, Effective Safety Verification, and Invariants (Extended Version) [67.99023219822564]
We introduce the general framework of relational action bases (RABs).
RABs generalize existing models by lifting both restrictions.
We demonstrate the effectiveness of this approach on a benchmark of data-aware business processes.
arXiv Detail & Related papers (2022-08-12T17:03:50Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Enhancing Counterfactual Classification via Self-Training [9.484178349784264]
We propose a self-training algorithm which imputes outcomes with categorical values for finite unseen actions in observational data to simulate a randomized trial through pseudolabeling.
We demonstrate the effectiveness of the proposed algorithms on both synthetic and real datasets.
arXiv Detail & Related papers (2021-12-08T18:42:58Z)
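The self-training idea in the last entry above, imputing outcomes for the actions each unit never received so the observational log resembles a randomized trial, can be sketched on synthetic data. This is a toy illustration, not the cited paper's algorithm: the logistic model, the hard 0.5 pseudo-label threshold, and all names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Observational log: each unit received ONE of several actions; outcome binary.
n, d, n_actions = 300, 4, 3
x = rng.normal(size=(n, d))
a_obs = rng.integers(0, n_actions, size=n)
y = (rng.random(n) < 1 / (1 + np.exp(-(x[:, 0] + 0.7 * a_obs)))).astype(float)

def featurize(xmat, actions):
    """Concatenate features, one-hot action, and a bias column."""
    return np.column_stack([xmat, np.eye(n_actions)[actions], np.ones(len(xmat))])

def fit_logreg(feats, labels, steps=500, lr=0.1):
    """Plain gradient-descent logistic regression."""
    w = np.zeros(feats.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-feats @ w))
        w -= lr * feats.T @ (p - labels) / len(labels)
    return w

# Round 1: fit on the factual (observed) data only.
w = fit_logreg(featurize(x, a_obs), y)

# Impute pseudo-labels for every unseen action of every unit.
feats_aug, labels_aug = [featurize(x, a_obs)], [y]
for a in range(n_actions):
    mask = a_obs != a                    # units that did NOT receive action a
    feats = featurize(x[mask], np.full(mask.sum(), a))
    probs = 1 / (1 + np.exp(-feats @ w))
    feats_aug.append(feats)
    labels_aug.append((probs > 0.5).astype(float))   # hard pseudo-labels

# Round 2: retrain on factual + pseudo-labeled data.
w2 = fit_logreg(np.vstack(feats_aug), np.concatenate(labels_aug))

acc = np.mean(((featurize(x, a_obs) @ w2) > 0) == y)
print(round(float(acc), 2))
```

In a real pipeline the pseudo-labeling and retraining rounds would iterate, and soft labels or confidence filtering could replace the hard threshold used here.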
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.