Related papers: GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace

URL: http://arxiv.org/abs/2203.10975v1
Date: Mon, 21 Mar 2022 13:35:55 GMT
Title: GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace
Authors: Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo
Abstract summary: Uplift modeling is a rapidly growing approach that utilizes machine learning and causal inference methods to estimate the heterogeneous treatment effects. We extend causal forest (CF) with non-parametric dose-response functions (DRFs) that can be estimated locally using a kernel-based doubly robust estimator. We show the effectiveness of GCF by comparing it to popular uplift modeling models on both synthetic and real-world datasets.
Score: 12.114394141790438
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Uplift modeling is a rapidly growing approach that utilizes machine learning and causal inference methods to estimate the heterogeneous treatment effects. It has been widely adopted and applied to online marketplaces to assist large-scale decision-making in recent years. The existing popular methods, like forest-based modeling, either work only for discrete treatments or make partially linear or parametric assumptions that may suffer from model misspecification. To alleviate these problems, we extend causal forest (CF) with non-parametric dose-response functions (DRFs) that can be estimated locally using a kernel-based doubly robust estimator. Moreover, we propose a distance-based splitting criterion in the functional space of conditional DRFs to capture the heterogeneity for the continuous treatments. We call the proposed algorithm generalized causal forest (GCF) as it generalizes the use case of CF to a much broader setup. We show the effectiveness of GCF by comparing it to popular uplift modeling models on both synthetic and real-world datasets. We implement GCF in Spark and successfully deploy it into DiDi's real-time pricing system. Online A/B testing results further validate the superiority of GCF.

Related papers

Variational Autoencoder for Generating Broader-Spectrum prior Proposals in Markov chain Monte Carlo Methods [0.0]
This study uses a Variational Autoencoder method to enhance the efficiency and applicability of Markov Chain Monte Carlo (McMC) methods. The VAE framework enables a data-driven approach to flexibly capture a broader range of correlation structures in inverse problems.
arXiv Detail & Related papers (2025-06-16T14:11:16Z)
Forests for Differences: Robust Causal Inference Beyond Parametric DiD [0.0]
Difference-in-Differences Bayesian Causal Forest (DiD-BCF) is a novel non-parametric model addressing key challenges in DiD estimation. DiD-BCF provides a unified framework for estimating Average (ATE), Group-Average (GATE), and Conditional Average Treatment Effects (CATE). Extensive simulations demonstrate DiD-BCF's superior performance over established benchmarks.
arXiv Detail & Related papers (2025-05-14T18:06:51Z)
Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation [63.66719748453878]
Group max-min fairness (MMF) is commonly used in fairness-aware recommender systems (RS) as an optimization objective. We present an efficient and effective algorithm named FairDual, which utilizes a dual optimization technique to minimize the Jensen gap. Our theoretical analysis demonstrates that FairDual can achieve a sub-linear convergence rate to the globally optimal solution.
arXiv Detail & Related papers (2025-02-13T13:33:45Z)
Rectified Diffusion Guidance for Conditional Generation [62.00207951161297]
We revisit the theory behind CFG and rigorously confirm that the improper configuration of the combination coefficients (i.e., the widely used summing-to-one version) brings about expectation shift of the generative distribution. We propose ReCFG with a relaxation on the guidance coefficients such that denoising with ReCFG strictly aligns with the diffusion theory. That way the rectified coefficients can be readily pre-computed via traversing the observed data, leaving the sampling speed barely affected.
arXiv Detail & Related papers (2024-10-24T13:41:32Z)
Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW). We derive regret bounds for CBO-MW that naturally depend on graph-related quantities. Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z)
Generalized Random Forests using Fixed-Point Trees [2.5944208050492183]
We propose a computationally efficient alternative to generalized random forests (GRFs; arXiv:1610.01271) for estimating heterogeneous effects in large dimensions. While GRFs rely on a gradient-based splitting criterion, our method introduces a fixed-point approximation that eliminates the need for Jacobian estimation. Our findings suggest that the proposed method is a scalable alternative for localized effect estimation in machine learning and causal inference applications.
arXiv Detail & Related papers (2023-06-20T21:45:35Z)
Random forests for binary geospatial data [0.0]
Existing implementations of random forests for binary data cannot explicitly account for data correlation common in geospatial and time-series settings. Recent work has extended random forests (RF) to RF-GLS that incorporate spatial covariance using the generalized least squares (GLS) loss. We show that for binary data, the GLS loss is also an extension of the Gini impurity measure, as the latter is exactly equivalent to the ordinary least squares (OLS) loss. We propose a novel link-inversion technique that embeds the RF-GLS estimate of the mean function from the first step within the generalized
arXiv Detail & Related papers (2023-02-27T14:34:33Z)
Model-based Causal Bayesian Optimization [78.120734120667]
We propose model-based causal Bayesian optimization (MCBO). MCBO learns a full system model instead of only modeling intervention-reward pairs. Unlike in standard Bayesian optimization, our acquisition function cannot be evaluated in closed form.
arXiv Detail & Related papers (2022-11-18T14:28:21Z)
Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression. Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates. The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
Diffusion Causal Models for Counterfactual Estimation [18.438307666925425]
We consider the task of counterfactual estimation from observational imaging data given a known causal structure. We propose Diff-SCM, a deep structural causal model that builds on recent advances of generative energy-based models. We find that Diff-SCM produces more realistic and minimal counterfactuals than baselines on MNIST data and can also be applied to ImageNet data.
arXiv Detail & Related papers (2022-02-21T12:23:01Z)
ReLACE: Reinforcement Learning Agent for Counterfactual Explanations of Arbitrary Predictive Models [6.939617874336667]
We introduce a model-agnostic algorithm to generate optimal counterfactual explanations. Our method is easily applied to any black-box model, as this resembles the environment that the DRL agent interacts with. In addition, we develop an algorithm to extract explainable decision rules from the DRL agent's policy, so as to make the process of generating CFs itself transparent.
arXiv Detail & Related papers (2021-10-22T17:08:49Z)
A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling. We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk. We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
Causal Collaborative Filtering [50.22155187512759]
Causal Collaborative Filtering is a framework for modeling causality in collaborative filtering and recommendation. We show that many traditional CF algorithms are actually special cases of CCF under simplified causal graphs. We propose a conditional intervention approach for $do$-operations so that we can estimate the user-item causal preference.
arXiv Detail & Related papers (2021-02-03T04:16:11Z)
Estimating Linear Mixed Effects Models with Truncated Normally Distributed Random Effects [5.4052819252055055]
Inference can be conducted using a maximum likelihood approach when the random effects are assumed to be normally distributed. In this paper we extend the classical (unconstrained) LME models to allow for sign constraints on their overall coefficients.
arXiv Detail & Related papers (2020-11-09T16:17:35Z)
Flow Field Reconstructions with GANs based on Radial Basis Functions [19.261773760183196]
Two radial basis function-based GANs (RBF-GAN and RBFC-GAN) are proposed for regression and generation purposes. We show that the RBF-GAN and the RBFC-GAN perform better than GANs/cGANs in terms of both the mean square error (MSE) and the mean square percentage error (MSPE).
arXiv Detail & Related papers (2020-08-11T11:45:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.