Transfer Learning for Causal Effect Estimation
- URL: http://arxiv.org/abs/2305.09126v3
- Date: Mon, 1 Jan 2024 17:04:58 GMT
- Title: Transfer Learning for Causal Effect Estimation
- Authors: Song Wei, Hanyu Zhang, Ronald Moore, Rishikesan Kamaleswaran, Yao Xie
- Abstract summary: We present a Transfer Causal Learning framework to improve causal effect estimation accuracy in limited data.
Our method is subsequently extended to real data and generates meaningful insights consistent with medical literature.
- Score: 12.630663215983706
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a Transfer Causal Learning (TCL) framework when target and source
domains share the same covariate/feature spaces, aiming to improve causal
effect estimation accuracy in limited data. Limited data is very common in
medical applications, where some rare medical conditions, such as sepsis, are
of interest. Our proposed method, named \texttt{$\ell_1$-TCL}, incorporates
$\ell_1$ regularized TL for nuisance models (e.g., propensity score model); the
TL estimator of the nuisance parameters is plugged into downstream average
causal/treatment effect estimators (e.g., inverse probability weighted
estimator). We establish non-asymptotic recovery guarantees for the
\texttt{$\ell_1$-TCL} with generalized linear model (GLM) under the sparsity
assumption in the high-dimensional setting, and demonstrate the empirical
benefits of \texttt{$\ell_1$-TCL} through extensive numerical simulation for
GLM and recent neural network nuisance models. Our method is subsequently
extended to real data and generates meaningful insights consistent with medical
literature, a case where all baseline methods fail.
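The pipeline described in the abstract (an $\ell_1$-regularized transfer step for a nuisance model, whose estimate is plugged into an inverse probability weighted estimator) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes a logistic propensity model, a source-then-target two-step fit in which the target coefficients are penalized by their $\ell_1$ distance from the source coefficients, and a simple proximal gradient solver. All function names (`fit_logistic`, `ipw_ate`, `simulate`), the tuning values (`lam`, `lr`, `steps`, `clip`), and the simulated data are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_logistic(X, a, center=None, lam=0.0, lr=0.5, steps=200):
    """Proximal gradient descent on the mean logistic loss plus
    lam * ||beta - center||_1 (assumed form of the l1 transfer penalty).
    With center=None and lam=0 this is plain logistic regression."""
    n, p = len(X), len(X[0])
    center = [0.0] * p if center is None else list(center)
    beta = list(center)
    for _ in range(steps):
        grad = [0.0] * p
        for x, ai in zip(X, a):
            r = (sigmoid(dot(beta, x)) - ai) / n
            for j in range(p):
                grad[j] += r * x[j]
        # Gradient step, then shrink the deviation from `center`.
        beta = [c + soft_threshold(b - lr * g - c, lr * lam)
                for b, g, c in zip(beta, grad, center)]
    return beta


def ipw_ate(X, a, y, beta, clip=0.01):
    """Inverse probability weighted ATE with clipped propensity scores."""
    total = 0.0
    for x, ai, yi in zip(X, a, y):
        e = min(max(sigmoid(dot(beta, x)), clip), 1.0 - clip)
        total += ai * yi / e - (1 - ai) * yi / (1.0 - e)
    return total / len(y)


def simulate(n, beta, rng, tau=1.0):
    """Hypothetical data: intercept plus Gaussian covariates, true ATE = tau."""
    X, a, y = [], [], []
    for _ in range(n):
        x = [1.0] + [rng.gauss(0.0, 1.0) for _ in range(len(beta) - 1)]
        ai = 1 if rng.random() < sigmoid(dot(beta, x)) else 0
        X.append(x)
        a.append(ai)
        y.append(tau * ai + x[1] + rng.gauss(0.0, 1.0))
    return X, a, y


if __name__ == "__main__":
    rng = random.Random(0)
    # Source and target propensity models differ in a single coordinate.
    Xs, a_s, _ = simulate(500, [0.0, 1.0, -1.0], rng)
    Xt, a_t, yt = simulate(60, [0.0, 1.0, -0.5], rng)
    b_src = fit_logistic(Xs, a_s)                          # step 1: source fit
    b_tcl = fit_logistic(Xt, a_t, center=b_src, lam=0.05)  # step 2: l1 correction
    b_tgt = fit_logistic(Xt, a_t)                          # target-only baseline
    print("IPW ATE, transferred propensity model:", ipw_ate(Xt, a_t, yt, b_tcl))
    print("IPW ATE, target-only propensity model:", ipw_ate(Xt, a_t, yt, b_tgt))
```

The penalty is centered at the source estimate rather than at zero, so a large `lam` keeps the target model at the source fit while `lam=0` reduces to a target-only fit started from the source coefficients. The sketch penalizes the intercept as well, which a more careful implementation might avoid.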
Related papers
- CALF: A Conditionally Adaptive Loss Function to Mitigate Class-Imbalanced Segmentation [0.2902243522110345]
Imbalanced datasets pose a challenge in training deep learning (DL) models for medical diagnostics.
We propose a novel, statistically driven, conditionally adaptive loss function (CALF) tailored to accommodate the conditions of imbalanced datasets in DL training.
arXiv Detail & Related papers (2025-04-06T12:03:33Z) - Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Data value estimation on private gradients [84.966853523107]
For gradient-based machine learning (ML) methods, the de facto differential privacy technique is perturbing the gradients with random noise.
Data valuation attributes the ML performance to the training data and is widely used in privacy-aware applications that require enforcing DP.
We show that the answer is no with the default approach of injecting i.i.d. random noise into the gradients, because the uncertainty of the data value estimate paradoxically scales linearly with the estimation budget.
We propose to instead inject carefully correlated noise to provably remove the linear scaling of estimation uncertainty w.r.t. the budget.
arXiv Detail & Related papers (2024-12-22T13:15:51Z) - Off-policy estimation with adaptively collected data: the power of online learning [20.023469636707635]
We consider estimation of a linear functional of the treatment effect using adaptively collected data.
We propose a general reduction scheme that allows one to produce a sequence of estimates for the treatment effect via online learning.
arXiv Detail & Related papers (2024-11-19T10:18:27Z) - Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z) - Conditional Distribution Function Estimation Using Neural Networks for Censored and Uncensored Data [0.0]
We consider estimating the conditional distribution function using neural networks for both censored and uncensored data.
We show the proposed method possesses desirable performance, whereas the partial likelihood method yields biased estimates when model assumptions are violated.
arXiv Detail & Related papers (2022-07-06T01:12:22Z) - Robust and Agnostic Learning of Conditional Distributional Treatment Effects [62.44901952244514]
The conditional average treatment effect (CATE) is the best point prediction of individual causal effects.
In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE).
We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a wide class of problems.
arXiv Detail & Related papers (2022-05-23T17:40:31Z) - Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Variational Inference with NoFAS: Normalizing Flow with Adaptive Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Dimensionality reduction, regularization, and generalization in overparameterized regressions [8.615625517708324]
We show that PCA-OLS, also known as principal component regression, can be avoided with a dimensionality reduction.
We show that dimensionality reduction improves robustness while OLS is arbitrarily susceptible to adversarial attacks.
We find that methods in which the projection depends on the training data can outperform methods where the projections are chosen independently of the training data.
arXiv Detail & Related papers (2020-11-23T15:38:50Z) - Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) are promising tools for imaging brain activity.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z) - Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.