Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data
- URL: http://arxiv.org/abs/2404.00221v1
- Date: Sat, 30 Mar 2024 02:33:39 GMT
- Title: Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data
- Authors: Shosei Sakaguchi
- Abstract summary: We study statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's history.
We propose a step-wise doubly-robust approach to learn the optimal DTR using observational data under the assumption of sequential ignorability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many public policies and medical interventions involve dynamics in their treatment assignments, where treatments are sequentially assigned to the same individuals across multiple stages, and the effect of treatment at each stage is usually heterogeneous with respect to the history of prior treatments and associated characteristics. We study statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's history. We propose a step-wise doubly-robust approach to learn the optimal DTR using observational data under the assumption of sequential ignorability. The approach solves the sequential treatment assignment problem through backward induction, where, at each step, we combine estimators of propensity scores and action-value functions (Q-functions) to construct augmented inverse probability weighting estimators of values of policies for each stage. The approach consistently estimates the optimal DTR if either a propensity score or Q-function for each stage is consistently estimated. Furthermore, the resulting DTR can achieve the optimal convergence rate $n^{-1/2}$ of regret under mild conditions on the convergence rate for estimators of the nuisance parameters.
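The augmented inverse probability weighting (AIPW) construction mentioned in the abstract can be illustrated with a minimal single-stage sketch. This is not the paper's implementation: the function name `aipw_policy_value`, the simulated data, and the deliberately misspecified nuisance estimates are all assumptions made for illustration. In a multi-stage setting, the abstract's backward induction would apply a score of this kind stage by stage, starting from the last one.

```python
import numpy as np


def aipw_policy_value(y, a, pi, e_hat, q_hat):
    """AIPW estimate of the mean outcome under policy `pi` (single stage).

    y     : (n,) observed outcomes
    a     : (n,) observed treatments coded 0..K-1
    pi    : (n,) treatments the policy would assign
    e_hat : (n, K) estimated propensity scores P(A = k | history)
    q_hat : (n, K) estimated Q-function values E[Y | history, A = k]
    """
    idx = np.arange(len(y))
    q_pi = q_hat[idx, pi]            # outcome-model prediction under the policy
    e_pi = e_hat[idx, pi]            # propensity of the policy's treatment
    match = (a == pi).astype(float)  # 1 where observed treatment agrees with pi
    # Outcome-model term plus an inverse-probability-weighted residual correction.
    scores = q_pi + match / e_pi * (y - q_pi)
    return scores.mean()


def demo(seed=0, n=200_000):
    """Simulated check of double robustness (hypothetical data-generating process)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    e1 = 0.2 + 0.6 * x                          # true P(A = 1 | x)
    a = rng.binomial(1, e1)
    y = x + a * (2.0 * x - 1.0) + rng.normal(0.0, 0.5, n)
    pi = (x > 0.5).astype(int)                  # policy to evaluate; true value is 0.75

    e_true = np.column_stack([1.0 - e1, e1])
    q_true = np.column_stack([x, 3.0 * x - 1.0])
    e_wrong = np.full((n, 2), 0.5)              # misspecified propensity score
    q_wrong = np.zeros((n, 2))                  # misspecified Q-function

    v_ipw = aipw_policy_value(y, a, pi, e_true, q_wrong)  # only propensity correct
    v_q = aipw_policy_value(y, a, pi, e_wrong, q_true)    # only Q-function correct
    return v_ipw, v_q


if __name__ == "__main__":
    print(demo())
```

In this simulation both estimates should land close to the true policy value of 0.75, even though one nuisance estimate is wrong in each call, which is the "either a propensity score or Q-function" property the abstract describes.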
Related papers
- Stage-Aware Learning for Dynamic Treatments [3.6923632650826486]
We propose a novel individualized learning method for dynamic treatment regimes.
By relaxing the restriction that the observed trajectory must be fully aligned with the optimal treatments, our approach substantially improves the sample efficiency and stability of IPWE-based methods.
arXiv Detail & Related papers (2023-10-30T06:35:31Z)
- Doubly Robust Proximal Causal Learning for Continuous Treatments [56.05592840537398]
We propose a kernel-based doubly robust causal learning estimator for continuous treatments.
We show that its oracle form is a consistent approximation of the influence function.
We then provide a comprehensive convergence analysis in terms of the mean square error.
arXiv Detail & Related papers (2023-09-22T12:18:53Z)
- Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data [7.308241944759317]
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics.
We propose a doubly robust estimator of the value function, and the optimal ITR is learned by maximizing the value function within a pre-specified class of ITRs.
We evaluate the empirical performance of the proposed method by simulation studies and a real data application of sodium bicarbonate therapy for patients with severe metabolic acidaemia.
arXiv Detail & Related papers (2023-01-13T11:47:10Z)
- TCFimt: Temporal Counterfactual Forecasting from Individual Multiple Treatment Perspective [50.675845725806724]
We propose a comprehensive framework of temporal counterfactual forecasting from an individual multiple treatment perspective (TCFimt).
TCFimt constructs adversarial tasks in a seq2seq framework to alleviate selection and time-varying bias and designs a contrastive learning-based block to decouple a mixed treatment effect into separated main treatment effects and causal interactions.
The proposed method performs better than state-of-the-art methods both in predicting future outcomes under specific treatments and in choosing the optimal treatment type and timing.
arXiv Detail & Related papers (2022-12-17T15:01:05Z)
- Disentangled Counterfactual Recurrent Networks for Treatment Effect Inference over Time [71.30985926640659]
We introduce the Disentangled Counterfactual Recurrent Network (DCRN), a sequence-to-sequence architecture that estimates treatment outcomes over time.
With an architecture that is completely inspired by the causal structure of treatment influence over time, we advance forecast accuracy and disease understanding.
We demonstrate that DCRN outperforms current state-of-the-art methods in forecasting treatment responses, on both real and simulated data.
arXiv Detail & Related papers (2021-12-07T16:40:28Z)
- Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints [0.0]
We study estimation of an optimal dynamic treatment regime that guides the optimal treatment assignment for each individual at each stage based on their history.
The paper proposes two estimation methods: one solves the treatment assignment problem sequentially through backward induction, and the other solves the entire problem simultaneously across all stages.
arXiv Detail & Related papers (2021-06-09T12:42:53Z)
- Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence [66.83161885378192]
Area under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems.
We propose a technical method to optimize AUPRC for deep learning.
arXiv Detail & Related papers (2021-04-18T06:22:21Z)
- Evaluating (weighted) dynamic treatment effects by double machine learning [0.12891210250935145]
We consider evaluating the causal effects of dynamic treatments in a data-driven way under a selection-on-observables assumption.
We make use of so-called Neyman-orthogonal score functions, which imply the robustness of treatment effect estimation to moderate (local) misspecifications.
We demonstrate that the estimators are asymptotically normal and $\sqrt{n}$-consistent under specific conditions.
arXiv Detail & Related papers (2020-12-01T09:55:40Z)
- DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret [59.81290762273153]
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions to an individual's initial features and to intermediate outcomes and features at each subsequent stage.
We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear.
arXiv Detail & Related papers (2020-05-06T13:03:42Z)
- Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations [114.16762407465427]
We introduce the Counterfactual Recurrent Network (CRN) to estimate treatment effects over time.
CRN uses domain adversarial training to build balancing representations of the patient history.
We show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment.
arXiv Detail & Related papers (2020-02-10T20:47:36Z)