DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
- URL: http://arxiv.org/abs/2005.02791v3
- Date: Tue, 20 Sep 2022 20:59:54 GMT
- Title: DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
- Authors: Yichun Hu and Nathan Kallus
- Abstract summary: Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions to an individual's initial features and to intermediate outcomes and features at each subsequent stage.
We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear.
- Score: 59.81290762273153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage
treatment plans that adapt treatment decisions both to an individual's initial
features and to intermediate outcomes and features at each subsequent stage,
which are affected by decisions in prior stages. Examples include personalized
first- and second-line treatments of chronic conditions like diabetes, cancer,
and depression, which adapt to patient response to first-line treatment,
disease progression, and individual characteristics. While existing literature
mostly focuses on estimating the optimal DTR from offline data such as from
sequentially randomized trials, we study the problem of developing the optimal
DTR in an online manner, where the interaction with each individual affects both
our cumulative reward and our data collection for future learning. We term this
the DTR bandit problem. We propose a novel algorithm that, by carefully
balancing exploration and exploitation, is guaranteed to achieve rate-optimal
regret when the transition and reward models are linear. We demonstrate our
algorithm and its benefits both in synthetic experiments and in a case study of
adaptive treatment of major depressive disorder using real-world data.
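To make the setting in the abstract concrete, the following is a minimal sketch of a two-stage problem with linear transition and reward models, together with cumulative-regret accounting. It is not the authors' algorithm: the learner here is plain epsilon-greedy with ridge regression, swapped in only for illustration. The function name `run_dtr_bandit`, the two-action setup, the noise-free transition, and all parameter values are assumptions and do not come from the paper.

```python
import numpy as np

def run_dtr_bandit(T=2000, d=3, eps=0.1, seed=0):
    """Simulate a hypothetical two-stage DTR bandit; return cumulative regret."""
    rng = np.random.default_rng(seed)
    # Assumed ground truth: one linear model per action at each stage.
    B = rng.normal(size=(2, d, d)) / np.sqrt(d)  # transition: x2 = B[a1] @ x1
    theta = rng.normal(size=(2, d))              # mean reward: theta[a2] @ x2
    lam = 1.0                                    # ridge penalty (assumption)
    # Sufficient statistics for the ridge estimates of B and theta.
    A1 = np.stack([lam * np.eye(d)] * 2)
    C1 = np.zeros((2, d, d))
    A2 = np.stack([lam * np.eye(d)] * 2)
    b2 = np.zeros((2, d))
    cum_regret = np.zeros(T)
    total = 0.0
    for t in range(T):
        x1 = rng.normal(size=d)  # initial features of individual t
        B_hat = np.stack([C1[a] @ np.linalg.inv(A1[a]) for a in range(2)])
        theta_hat = np.stack([np.linalg.solve(A2[a], b2[a]) for a in range(2)])
        # Stage 1: explore with probability eps, otherwise plan through the
        # estimated transition and pick the action with the best predicted value.
        if rng.random() < eps:
            a1 = int(rng.integers(2))
        else:
            q1 = [np.max(theta_hat @ (B_hat[a] @ x1)) for a in range(2)]
            a1 = int(np.argmax(q1))
        # Intermediate features depend on the stage-1 decision.
        # The transition is noise-free here, a simplifying assumption.
        x2 = B[a1] @ x1
        # Stage 2: same explore/exploit rule on the observed intermediate features.
        if rng.random() < eps:
            a2 = int(rng.integers(2))
        else:
            a2 = int(np.argmax(theta_hat @ x2))
        y = theta[a2] @ x2 + rng.normal(scale=0.1)  # noisy final reward
        # Update the ridge statistics with the new observation.
        A1[a1] += np.outer(x1, x1)
        C1[a1] += np.outer(x2, x1)
        A2[a2] += np.outer(x2, x2)
        b2[a2] += y * x2
        # Regret: best achievable mean reward minus the mean reward obtained.
        best = max(np.max(theta @ (B[a] @ x1)) for a in range(2))
        total += best - theta[a2] @ x2
        cum_regret[t] = total
    return cum_regret
```

Calling `run_dtr_bandit(eps=1.0)` gives a uniformly random policy as a baseline. Because the stage-1 action changes the features seen at stage 2, the greedy stage-1 choice must plan through the estimated transition rather than look only at the immediate stage.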
 
      
Related papers
- Parameterized Diffusion Optimization enabled Autoregressive Ordinal Regression for Diabetic Retinopathy Grading [53.11883409422728]
 This work proposes a novel autoregressive ordinal regression method called AOR-DR.
We decompose the diabetic retinopathy grading task into a series of ordered steps by fusing the prediction of the previous steps with extracted image features.
We exploit the diffusion process to facilitate conditional probability modeling, enabling the direct use of continuous global image features for autoregression.
 arXiv  Detail & Related papers  (2025-07-07T13:22:35Z)
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
 Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
 arXiv  Detail & Related papers  (2025-01-30T06:49:57Z)
- Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data [0.0]
 We study the statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's evolving history.
 arXiv  Detail & Related papers  (2024-03-30T02:33:39Z)
- TCFimt: Temporal Counterfactual Forecasting from Individual Multiple Treatment Perspective [50.675845725806724]
 We propose a comprehensive framework of temporal counterfactual forecasting from an individual multiple treatment perspective (TCFimt).
 TCFimt constructs adversarial tasks in a seq2seq framework to alleviate selection and time-varying bias and designs a contrastive learning-based block to decouple a mixed treatment effect into separated main treatment effects and causal interactions.
The proposed method outperforms state-of-the-art methods in predicting future outcomes under specific treatments and in choosing the optimal treatment type and timing.
 arXiv  Detail & Related papers  (2022-12-17T15:01:05Z)
- Federated Offline Reinforcement Learning [55.326673977320574]
 We propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites.
We design the first federated policy optimization algorithm for offline RL with sample complexity.
We give a theoretical guarantee for the proposed algorithm, where the suboptimality for the learned policies is comparable to the rate as if data is not distributed.
 arXiv  Detail & Related papers  (2022-06-11T18:03:26Z)
- Learning Optimal Dynamic Treatment Regimes Using Causal Tree Methods in Medicine [20.401805132360654]
 We develop two novel methods for learning optimal dynamic treatment regimes (DTRs).
Our methods are based on a data-driven estimation of heterogeneous treatment effects using causal tree methods.
We evaluate our proposed methods using synthetic data and then apply them to real-world data from intensive care units.
 arXiv  Detail & Related papers  (2022-04-14T17:27:08Z)
- Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach [0.0]
 Dynamic Treatment Regimes (DTRs) are widely studied to formalize this process.
We develop Reinforcement Learning methods to efficiently learn optimal treatment regimes.
 arXiv  Detail & Related papers  (2021-12-08T20:22:04Z)
- Disentangled Counterfactual Recurrent Networks for Treatment Effect Inference over Time [71.30985926640659]
 We introduce the Disentangled Counterfactual Recurrent Network (DCRN), a sequence-to-sequence architecture that estimates treatment outcomes over time.
With an architecture that is completely inspired by the causal structure of treatment influence over time, we advance forecast accuracy and disease understanding.
We demonstrate that DCRN outperforms current state-of-the-art methods in forecasting treatment responses, on both real and simulated data.
 arXiv  Detail & Related papers  (2021-12-07T16:40:28Z)
- Continuous Treatment Recommendation with Deep Survival Dose Response Function [3.705291460388999]
 We propose a general formulation for continuous treatment recommendation problems in settings with clinical survival data.
The estimated treatment effect from DeepSDRF enables us to develop recommender algorithms with the correction for selection bias.
This is the first time that causal models are used to address the continuous treatment effect with observational data in a medical context.
 arXiv  Detail & Related papers  (2021-08-24T00:19:04Z)
- DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
 We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data.
DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
 arXiv  Detail & Related papers  (2020-10-28T15:05:08Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
 We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
 arXiv  Detail & Related papers  (2020-06-22T14:49:33Z)
- Multicategory Angle-based Learning for Estimating Optimal Dynamic Treatment Regimes with Censored Data [12.499787110182632]
 An optimal dynamic treatment regime (DTR) consists of a sequence of decision rules that maximize long-term benefits.
In this paper, we develop a novel angle-based approach to target the optimal DTR under a multicategory treatment framework.
Our numerical studies show that the proposed method outperforms competing methods in terms of maximizing the conditional survival function.
 arXiv  Detail & Related papers  (2020-01-14T05:19:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     