CATE Lasso: Conditional Average Treatment Effect Estimation with
High-Dimensional Linear Regression
- URL: http://arxiv.org/abs/2310.16819v1
- Date: Wed, 25 Oct 2023 17:51:07 GMT
- Title: CATE Lasso: Conditional Average Treatment Effect Estimation with
High-Dimensional Linear Regression
- Authors: Masahiro Kato and Masaaki Imaizumi
- Abstract summary: Conditional Average Treatment Effects (CATEs) play an important role as a quantity representing an individualized causal effect.
We propose a method for consistently estimating CATEs even under high-dimensional and non-sparse parameters.
- Score: 18.628644958430076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In causal inference about two treatments, Conditional Average Treatment
Effects (CATEs) play an important role as a quantity representing an
individualized causal effect, defined as a difference between the expected
outcomes of the two treatments conditioned on covariates. This study assumes
two linear regression models between a potential outcome and covariates of the
two treatments and defines CATEs as a difference between the linear regression
models. Then, we propose a method for consistently estimating CATEs even under
high-dimensional and non-sparse parameters. In our study, we demonstrate that
desirable theoretical properties, such as consistency, remain attainable even
without assuming sparsity explicitly if we impose a weaker assumption called
implicit sparsity originating from the definition of CATEs. In this assumption,
we suppose that parameters of linear models in potential outcomes can be
divided into treatment-specific and common parameters, where the
treatment-specific parameters take different values between each linear
regression model, while the common parameters remain identical. Thus, in a
difference between two linear regression models, the common parameters
disappear, leaving only differences in the treatment-specific parameters.
Consequently, the non-zero parameters in CATEs correspond to the differences in
the treatment-specific parameters. Leveraging this assumption, we develop a
Lasso regression method specialized for CATE estimation and show that the
estimator is consistent. Finally, we confirm the soundness of the proposed
method by simulation studies.
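The implicit-sparsity idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator. It assumes a generic stacked regression y = x'beta0 + d * x'delta + noise, in which the dense common coefficients beta0 are left unpenalized and only the difference delta (the CATE coefficients) receives an L1 penalty, fitted by plain coordinate descent. Because the common part is unpenalized, the sketch needs more samples than twice the number of covariates, so it does not cover the paper's high-dimensional regime. The names fit_cate_lasso_sketch, simulate, cate, and lam are hypothetical, and the simulated data (randomized binary treatment, Gaussian covariates) is an assumption made for the example.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_cate_lasso_sketch(X, d, y, lam, sweeps=100):
    """Fit y ~ x'beta0 + d * x'delta with an L1 penalty on delta only.

    Minimizes (1 / 2n) * ||y - Z w||^2 + lam * ||delta||_1, where Z stacks
    the covariates and their interactions with the treatment indicator.
    Illustrative assumption, not the estimator from the paper.
    """
    n, p = len(X), len(X[0])
    # First p columns: common part. Last p columns: treatment interaction.
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    cols += [[X[i][j] * d[i] for i in range(n)] for j in range(p)]
    penalties = [0.0] * p + [lam] * p
    scale = [sum(v * v for v in col) / n for col in cols]
    w = [0.0] * (2 * p)
    resid = list(y)  # residual y - Z w, with w starting at zero
    for _ in range(sweeps):
        for k, col in enumerate(cols):
            rho = sum(c * r for c, r in zip(col, resid)) / n + scale[k] * w[k]
            new_w = soft_threshold(rho, penalties[k]) / scale[k]
            shift = w[k] - new_w
            if shift != 0.0:
                resid = [r + c * shift for r, c in zip(resid, col)]
            w[k] = new_w
    return w[:p], w[p:]


def simulate(n=300, p=8, noise_sd=0.5, seed=0):
    """Draw data in which both outcome models are dense but delta is sparse."""
    rng = random.Random(seed)
    # Common coefficients: every entry is non-zero (no explicit sparsity).
    beta0 = [rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 3.0) for _ in range(p)]
    # Treatment-specific difference: only the first two entries are non-zero.
    delta = [2.0, -1.5] + [0.0] * (p - 2)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    d = [rng.randint(0, 1) for _ in range(n)]  # randomized binary treatment
    y = [
        sum(x[j] * (beta0[j] + d_i * delta[j]) for j in range(p))
        + rng.gauss(0.0, noise_sd)
        for x, d_i in zip(X, d)
    ]
    return X, d, y, beta0, delta


def cate(x, delta):
    """CATE at covariates x under the linear model: x'(beta1 - beta0)."""
    return sum(x_j * d_j for x_j, d_j in zip(x, delta))


X, d, y, beta0, delta_true = simulate()
beta0_hat, delta_hat = fit_cate_lasso_sketch(X, d, y, lam=0.05)
x_new = [1.0] * len(delta_hat)
print("true delta:     ", delta_true)
print("estimated delta:", [round(v, 2) for v in delta_hat])
print("estimated CATE at x_new:", round(cate(x_new, delta_hat), 2))
```

In this toy setting the common coefficients cancel in the difference, so the penalty acts only on the few entries that distinguish the two treatments. The L1 penalty also shrinks the non-zero entries of delta toward zero, so the estimates are biased by an amount that grows with lam.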
Related papers
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- De-confounding Representation Learning for Counterfactual Inference on
Continuous Treatment via Generative Adversarial Network [5.465397606401007]
Counterfactual inference for continuous rather than binary treatment variables is more common in real-world causal inference tasks.
We propose a de-confounding representation learning (DRL) framework for counterfactual outcome estimation of continuous treatment.
We show that the DRL model excels at learning de-confounding representations and outperforms state-of-the-art counterfactual inference models for continuous treatment variables.
arXiv Detail & Related papers (2023-07-24T08:56:25Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Robust and Agnostic Learning of Conditional Distributional Treatment
Effects [62.44901952244514]
The conditional average treatment effect (CATE) is the best point prediction of individual causal effects.
In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE).
We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a wide class of problems.
arXiv Detail & Related papers (2022-05-23T17:40:31Z)
- Modeling High-Dimensional Data with Unknown Cut Points: A Fusion
Penalized Logistic Threshold Regression [2.520538806201793]
In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors.
We consider a threshold model in which all continuous features are discretized into ordinal levels, which further determine the binary responses.
We find that the lasso model is well suited to the early detection and prediction of chronic diseases such as diabetes.
arXiv Detail & Related papers (2022-02-17T04:16:40Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Variable selection with missing data in both covariates and outcomes:
Imputation and machine learning [1.0333430439241666]
The missing data issue is ubiquitous in health studies.
Machine learning methods weaken parametric assumptions.
XGBoost and BART have the overall best performance across various settings.
arXiv Detail & Related papers (2021-04-06T20:18:29Z)
- Bayesian prognostic covariate adjustment [59.75318183140857]
Historical data about disease outcomes can be integrated into the analysis of clinical trials in many ways.
We build on existing literature that uses prognostic scores from a predictive model to increase the efficiency of treatment effect estimates.
arXiv Detail & Related papers (2020-12-24T05:19:03Z)
- Increasing the efficiency of randomized trial estimates via linear
adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
- Doubly Robust Semiparametric Difference-in-Differences Estimators with
High-Dimensional Data [15.27393561231633]
We propose a doubly robust two-stage semiparametric difference-in-difference estimator for estimating heterogeneous treatment effects.
The first stage allows a general set of machine learning methods to be used to estimate the propensity score.
In the second stage, we derive the rates of convergence for both the parametric parameter and the unknown function.
arXiv Detail & Related papers (2020-09-07T15:14:29Z)
- Assumption-lean inference for generalised linear model parameters [0.0]
We propose nonparametric definitions of main effect estimands and effect modification estimands.
These reduce to standard main effect and effect modification parameters in generalised linear models when these models are correctly specified.
We achieve an assumption-lean inference for these estimands.
arXiv Detail & Related papers (2020-06-15T13:49:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.