Out-of-sample scoring and automatic selection of causal estimators
- URL: http://arxiv.org/abs/2212.10076v1
- Date: Tue, 20 Dec 2022 08:29:18 GMT
- Title: Out-of-sample scoring and automatic selection of causal estimators
- Authors: Egor Kraev, Timo Flesch, Hudson Taylor Lekunze, Mark Harley, Pere
Planell Morell
- Abstract summary: We propose novel scoring approaches for both the CATE case and an important subset of instrumental variable problems.
We implement that in an open source package that relies on DoWhy and EconML libraries.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recently, many causal estimators for Conditional Average Treatment Effect
(CATE) and instrumental variable (IV) problems have been published and open
sourced, allowing estimation of the granular impact of both randomized treatments
(such as A/B tests) and of user choices on the outcomes of interest. However,
the practical application of such models has been hampered by the lack of a
valid way to score the performance of such models out of sample, in order to
select the best one for a given application. We address that gap by proposing
novel scoring approaches for both the CATE case and an important subset of
instrumental variable problems, namely those where the instrumental variable is
customer access to a product feature, and the treatment is the customer's choice
to use that feature. Being able to score model performance out of sample allows
us to apply hyperparameter optimization methods to causal model selection and
tuning. We implement that in an open source package that relies on DoWhy and
EconML libraries for implementation of causal inference models (and also
includes a Transformed Outcome model implementation), and on FLAML for
hyperparameter optimization and for component models used in the causal models.
We demonstrate on synthetic data that optimizing the proposed scores is a
reliable method for choosing the model and its hyperparameter values, whose
estimates are close to the true impact, in the randomized CATE and IV cases.
Further, we provide examples of applying these methods to real customer data
from Wise.
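The abstract names the Transformed Outcome model but not the scoring formula. As a minimal, self-contained sketch (plain numpy with illustrative function names, not the actual package API): for a randomized binary treatment with known propensity p, the transformed outcome Y* = Y(T - p)/(p(1 - p)) has conditional expectation equal to the CATE, so the held-out MSE between a model's CATE predictions and Y* can rank candidate estimators.

```python
import numpy as np

def transformed_outcome(y, t, p):
    """Transformed outcome Y* = Y (T - p) / (p (1 - p)).

    For a randomized binary treatment T with propensity p,
    E[Y* | X] equals the CATE at X."""
    return y * (t - p) / (p * (1 - p))

def transformed_outcome_score(cate_pred, y, t, p):
    """Out-of-sample score: MSE between predicted CATE and the transformed
    outcome on held-out data (lower is better, up to an irreducible
    constant that is identical across candidate models)."""
    y_star = transformed_outcome(y, t, p)
    return float(np.mean((cate_pred - y_star) ** 2))

# Toy held-out data: randomized treatment with p = 0.5, true CATE = 2x.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
t = rng.integers(0, 2, size=1000).astype(float)
y = x + 2 * x * t + rng.normal(scale=0.1, size=1000)

good = transformed_outcome_score(2 * x, y, t, 0.5)            # oracle CATE model
bad = transformed_outcome_score(np.zeros_like(x), y, t, 0.5)  # zero-effect model
```

A score of this kind is what a hyperparameter optimizer such as FLAML can minimize when choosing among DoWhy/EconML estimators; the formula is standard, but the names above are illustrative.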
Related papers
- On the Laplace Approximation as Model Selection Criterion for Gaussian Processes [6.990493129893112]
We introduce multiple metrics based on the Laplace approximation.
Experiments show that our metrics are comparable in quality to the gold standard dynamic nested sampling.
arXiv Detail & Related papers (2024-03-14T09:28:28Z)
- Risk-Sensitive Diffusion for Perturbation-Robust Optimization [58.68233326265417]
We show that noisy samples induce an objective function other than the score-based one, which would wrongly optimize the model.
We introduce risk-sensitive SDE, a type of stochastic differential equation (SDE) parameterized by the risk vector.
We prove that zero instability measure is only achievable in the case where noisy samples are caused by Gaussian perturbation.
arXiv Detail & Related papers (2024-02-03T08:41:51Z)
- Causal Q-Aggregation for CATE Model Selection [24.094860486378167]
We propose a new CATE ensembling approach based on Q-aggregation using the doubly robust loss.
Our main result shows that causal Q-aggregation achieves statistically optimal model selection regret rates.
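The doubly robust loss mentioned here has a simple closed form. As a sketch under standard definitions (plain numpy, illustrative names, not the paper's code): a candidate CATE model is scored by its held-out squared error against the AIPW pseudo-outcome, built from nuisance estimates of the two outcome regressions and the propensity.

```python
import numpy as np

def dr_pseudo_outcome(y, t, e, mu0, mu1):
    """AIPW pseudo-outcome: E[Y_dr | X] equals the CATE if either the
    outcome models (mu0, mu1) or the propensity model (e) is correct."""
    return mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)

def dr_loss(cate_pred, y, t, e, mu0, mu1):
    """Doubly robust loss: held-out squared error of CATE predictions
    against the pseudo-outcome; minimized in expectation by the true CATE."""
    return float(np.mean((cate_pred - dr_pseudo_outcome(y, t, e, mu0, mu1)) ** 2))

# Toy check with correctly specified nuisances: true CATE = 2x.
rng = np.random.default_rng(1)
x = rng.normal(size=1000)
t = rng.integers(0, 2, size=1000).astype(float)
y = x + 2 * x * t + rng.normal(scale=0.1, size=1000)

loss_true = dr_loss(2 * x, y, t, 0.5, mu0=x, mu1=3 * x)            # oracle model
loss_zero = dr_loss(np.zeros_like(x), y, t, 0.5, mu0=x, mu1=3 * x)  # zero-effect model
```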
arXiv Detail & Related papers (2023-10-25T19:27:05Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- A prediction and behavioural analysis of machine learning methods for modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
arXiv Detail & Related papers (2023-01-11T11:10:32Z)
- Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground-truth oracle can be trained and used in place of it during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z)
- Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation [24.65301562548798]
We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation.
We conduct an empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work.
arXiv Detail & Related papers (2022-11-03T16:26:06Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Variational Inference with NoFAS: Normalizing Flow with Adaptive Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge [82.5462771088607]
We propose a novel model selection metric specifically designed for ITE methods under the unsupervised domain adaptation setting.
In particular, we propose selecting models whose predictions of interventions' effects satisfy known causal structures in the target domain.
arXiv Detail & Related papers (2021-02-11T21:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.