Estimating and Improving Dynamic Treatment Regimes With a Time-Varying Instrumental Variable
- URL: http://arxiv.org/abs/2104.07822v1
- Date: Thu, 15 Apr 2021 23:44:39 GMT
- Title: Estimating and Improving Dynamic Treatment Regimes With a Time-Varying Instrumental Variable
- Authors: Shuxiao Chen, Bo Zhang
- Abstract summary: Estimating dynamic treatment regimes (DTRs) from retrospective observational data is challenging, as some degree of unmeasured confounding is often expected. We develop a framework for estimating properly defined "optimal" DTRs that are guaranteed to perform no worse, and potentially better, than a pre-specified baseline.
- Score: 9.680527191968409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating dynamic treatment regimes (DTRs) from retrospective observational
data is challenging as some degree of unmeasured confounding is often expected.
In this work, we develop a framework for estimating properly defined "optimal"
DTRs with a time-varying instrumental variable (IV) when unmeasured covariates
confound the treatment and outcome, rendering the potential outcome
distributions only partially identified. We derive a novel Bellman equation
under partial identification, use it to define a generic class of estimands
(termed IV-optimal DTRs), and study the associated estimation problem. We then
extend the IV-optimality framework to tackle the policy improvement problem,
delivering IV-improved DTRs that are guaranteed to perform no worse and
potentially better than a pre-specified baseline DTR. Importantly, our
IV-improvement framework opens up the possibility of strictly improving upon
DTRs that are optimal under the no unmeasured confounding assumption (NUCA). We
demonstrate via extensive simulations the superior performance of IV-optimal
and IV-improved DTRs over the DTRs that are optimal only under the NUCA. In a
real data example, we embed retrospective observational registry data into a
natural, two-stage experiment with noncompliance using a time-varying IV and
estimate useful IV-optimal DTRs that assign mothers to high-level or low-level
neonatal intensive care units based on their prognostic variables.
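To make the estimation problem concrete, the sketch below implements plain two-stage backward induction (Q-learning) under the no-unmeasured-confounding assumption (NUCA), the baseline that IV-optimal DTRs improve upon. The simulated data-generating process, variable names, and linear working models are all illustrative assumptions, not the paper's method, which additionally handles partial identification via a time-varying IV.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated two-stage trajectory (illustrative only): covariates X1, X2,
# randomized binary treatments A1, A2, and final outcome Y.
X1 = rng.normal(size=n)
A1 = rng.integers(0, 2, size=n)
X2 = 0.5 * X1 + 0.3 * A1 + rng.normal(size=n)
A2 = rng.integers(0, 2, size=n)
Y = X1 + X2 + A1 * (1.0 - X1) + A2 * (0.5 + X2) + rng.normal(size=n)

def fit_linear(features, target):
    """Least-squares fit; returns coefficients (intercept first)."""
    Z = np.column_stack([np.ones(len(target)), features])
    beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return beta

def predict(beta, features):
    Z = np.column_stack([np.ones(features.shape[0]), features])
    return Z @ beta

ones, zeros = np.ones(n), np.zeros(n)

# Stage 2: regress Y on (X2, A2, A2*X2); treat when the estimated
# treatment contrast Q(x2, 1) - Q(x2, 0) is positive.
b2 = fit_linear(np.column_stack([X2, A2, A2 * X2]), Y)
contrast2 = (predict(b2, np.column_stack([X2, ones, X2]))
             - predict(b2, np.column_stack([X2, zeros, zeros])))
d2 = (contrast2 > 0).astype(int)                       # stage-2 rule
V2 = predict(b2, np.column_stack([X2, d2, d2 * X2]))   # pseudo-outcome

# Stage 1: regress the pseudo-outcome on (X1, A1, A1*X1).
b1 = fit_linear(np.column_stack([X1, A1, A1 * X1]), V2)
contrast1 = (predict(b1, np.column_stack([X1, ones, X1]))
             - predict(b1, np.column_stack([X1, zeros, zeros])))
d1 = (contrast1 > 0).astype(int)                       # stage-1 rule
```

When treatment is confounded by unmeasured covariates, the stage-wise regressions above are no longer valid; in the paper's framework, the stage-2 value is instead bounded via the partial-identification Bellman recursion induced by the time-varying IV.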
Related papers
- Offline Behavior Distillation [57.6900189406964]
Massive reinforcement learning (RL) datasets are typically collected to train policies offline without the need for environment interaction.
We formulate offline behavior distillation (OBD), which synthesizes limited expert behavioral data from sub-optimal RL data.
We propose two naive OBD objectives, DBC and PBC, which measure distillation performance via the decision difference between policies trained on distilled data and either offline data or a near-expert policy.
arXiv Detail & Related papers (2024-10-30T06:28:09Z)
- Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
- Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data [0.0]
We study statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's history.
We propose a step-wise doubly-robust approach to learn the optimal DTR using observational data under the assumption of sequential ignorability.
arXiv Detail & Related papers (2024-03-30T02:33:39Z)
- Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data [7.308241944759317]
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics.
We propose a doubly robust estimator of the value function, and the optimal ITR is learned by maximizing the value function within a pre-specified class of ITRs.
We evaluate the empirical performance of the proposed method by simulation studies and a real data application of sodium bicarbonate therapy for patients with severe metabolic acidaemia.
arXiv Detail & Related papers (2023-01-13T11:47:10Z)
- Estimating individual treatment effects under unobserved confounding using binary instruments [21.563820572163337]
Estimating individual treatment effects (ITEs) from observational data is relevant in many fields such as personalized medicine.
We propose a novel, multiply robust machine learning framework, called MRIV, for estimating ITEs using binary IVs.
arXiv Detail & Related papers (2022-08-17T21:25:09Z)
- Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning [59.02006924867438]
Off-policy evaluation and learning (OPE/L) use offline observational data to make better decisions.
Recent work proposed distributionally robust OPE/L (DROPE/L) to guard against distribution shift, but the proposal relies on inverse-propensity weighting.
We propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets.
arXiv Detail & Related papers (2022-02-19T20:00:44Z)
- Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach [0.0]
Dynamic Treatment Regimes (DTRs) are widely studied to formalize sequential treatment decision-making.
We develop Reinforcement Learning methods to efficiently learn optimal treatment regimes.
arXiv Detail & Related papers (2021-12-08T20:22:04Z)
- Improving Inference from Simple Instruments through Compliance Estimation [0.0]
Instrumental variables (IV) regression is widely used to estimate causal treatment effects in settings where receipt of treatment is not fully random.
While IV regression can recover consistent treatment effect estimates, the resulting estimates are often noisy.
We study how to improve the efficiency of IV estimates by exploiting the predictable variation in the strength of the instrument.
arXiv Detail & Related papers (2021-08-08T20:18:34Z)
- Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning [107.70165026669308]
In offline reinforcement learning (RL), an optimal policy is learned solely from observational data collected a priori.
We study a confounded Markov decision process where the transition dynamics admit an additive nonlinear functional form.
We propose a provably efficient IV-aided Value Iteration (IVVI) algorithm based on a primal-dual reformulation of the conditional moment restriction.
arXiv Detail & Related papers (2021-02-19T13:01:40Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
- DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret [59.81290762273153]
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions to an individual's initial features and to intermediate outcomes and features at each subsequent stage.
We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear.
arXiv Detail & Related papers (2020-05-06T13:03:42Z)
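The exploration-exploitation balance described in the DTR Bandit entry can be illustrated with a generic linear contextual bandit (LinUCB-style) update. This is a textbook single-stage sketch with made-up parameters, not the paper's rate-optimal multi-stage algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T, alpha = 3, 2000, 1.0
# True reward parameters for two arms (unknown to the learner).
theta = {0: np.array([0.3, -0.2, 0.1]), 1: np.array([-0.1, 0.4, 0.2])}

# Ridge-regression sufficient statistics per arm.
A = {a: np.eye(d) for a in (0, 1)}
b = {a: np.zeros(d) for a in (0, 1)}

for _ in range(T):
    x = rng.normal(size=d)                      # observed context
    ucb = {}
    for a in (0, 1):
        A_inv = np.linalg.inv(A[a])
        est = A_inv @ b[a]                      # ridge estimate of theta[a]
        # Optimistic index: estimated reward plus exploration bonus.
        ucb[a] = x @ est + alpha * np.sqrt(x @ A_inv @ x)
    chosen = max(ucb, key=ucb.get)              # exploit, with optimism
    reward = x @ theta[chosen] + 0.1 * rng.normal()
    A[chosen] += np.outer(x, x)                 # update chosen arm's stats
    b[chosen] += reward * x

theta_hat = {a: np.linalg.inv(A[a]) @ b[a] for a in (0, 1)}
```

The exploration bonus shrinks as an arm's statistics accumulate, so the policy converges toward greedy play; a multi-stage DTR bandit must additionally propagate this balance backward through intermediate outcomes.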
This list is automatically generated from the titles and abstracts of the papers in this site.