Fusing Individualized Treatment Rules Using Secondary Outcomes
- URL: http://arxiv.org/abs/2402.08828v3
- Date: Sat, 9 Mar 2024 16:59:15 GMT
- Title: Fusing Individualized Treatment Rules Using Secondary Outcomes
- Authors: Daiqi Gao, Yuanjia Wang, Donglin Zeng
- Abstract summary: We learn an ITR that not only maximizes the value function for the primary outcome, but also approximates the optimal rule for the secondary outcomes.
Two algorithms are proposed to estimate the ITR using surrogate loss functions.
We prove that the agreement rate between the estimated ITR of the primary outcome and the optimal ITRs of the secondary outcomes converges to the true agreement rate faster than if the secondary outcomes are not taken into consideration.
- Score: 7.657053163626398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An individualized treatment rule (ITR) is a decision rule that recommends
treatments for patients based on their individual feature variables. In many
practical settings, the ideal ITR for the primary outcome is also expected to cause
minimal harm to other secondary outcomes. Therefore, our objective is to learn
an ITR that not only maximizes the value function for the primary outcome, but
also approximates the optimal rule for the secondary outcomes as closely as
possible. To achieve this goal, we introduce a fusion penalty to encourage the
ITRs based on different outcomes to yield similar recommendations. Two
algorithms are proposed to estimate the ITR using surrogate loss functions. We
prove that the agreement rate between the estimated ITR of the primary outcome
and the optimal ITRs of the secondary outcomes converges to the true agreement
rate faster than if the secondary outcomes are not taken into consideration.
Furthermore, we derive the non-asymptotic properties of the value function and
misclassification rate for the proposed method. Finally, simulation studies and
a real data example are used to demonstrate the finite-sample performance of
the proposed method.
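The abstract describes the ingredients concretely enough for a rough sketch: learn one rule per outcome by minimizing an inverse-propensity-weighted surrogate loss, with a penalty pulling the secondary rules toward the primary one. The sketch below is illustrative only; the function name, the linear rule class, and the squared-difference form of the fusion penalty are assumptions, not the authors' exact algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def fused_itr_objective(params, X, A, R, prop, lam):
    """Hinge-surrogate ITR objective for K outcomes with a fusion penalty.

    X: (n, p) features; A: (n,) treatments coded in {-1, +1};
    R: (n, K) outcomes (column 0 primary), shifted to be nonnegative;
    prop: (n,) propensities P(A_i | X_i); lam: fusion-penalty weight.
    """
    n, p = X.shape
    K = R.shape[1]
    B = params.reshape(K, p)   # one linear rule f_k(x) = x @ B[k] per outcome
    loss = 0.0
    for k in range(K):
        f = X @ B[k]
        w = R[:, k] / prop     # inverse-propensity outcome weights
        # weighted hinge surrogate for the 0-1 loss 1{A != sign(f(X))}
        loss += np.mean(w * np.maximum(0.0, 1.0 - A * f))
    # fusion penalty: encourage secondary rules to agree with the primary rule
    fusion = sum(np.sum((B[0] - B[k]) ** 2) for k in range(1, K))
    return loss + lam * fusion

# usage sketch: res = minimize(fused_itr_objective, np.zeros(K * p),
#                              args=(X, A, R, prop, 1.0), method="Powell")
```

Setting lam to zero recovers separate outcome-weighted learning for each outcome; increasing it trades primary-outcome value for agreement with the secondary rules.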
Related papers
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
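The summary describes a combined objective: a preference optimization loss regularized by a supervised learning (SFT) loss. Below is a minimal sketch of one such combination, assuming a DPO-style implicit-reward preference loss; the function name and the hyperparameters beta and eta are hypothetical, and the paper's exact objective may differ.

```python
import torch.nn.functional as F

def regularized_preference_loss(logp_chosen, logp_rejected,
                                ref_logp_chosen, ref_logp_rejected,
                                beta=0.1, eta=1.0):
    """DPO-style preference loss plus an SFT term on the chosen responses.

    Inputs are summed response log-probabilities under the trained policy
    (logp_*) and a frozen reference model (ref_logp_*), as torch tensors.
    """
    # implicit-reward margin between chosen and rejected responses
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    pref_loss = -F.logsigmoid(margin).mean()
    # SFT regularizer: keep the chosen responses likely under the policy
    sft_loss = -logp_chosen.mean()
    return pref_loss + eta * sft_loss
```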
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
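As a generic illustration of reducing many outcomes to a data-driven index before policy learning, here is a PCA-style sketch; this is not necessarily the paper's reduced-rank estimator, and the function name is hypothetical.

```python
import numpy as np

def low_rank_outcome_index(Y, r=1):
    """Compress an (n, K) outcome matrix into r data-driven indices via SVD.

    Returns the n x r index scores and the K x r outcome loadings; a policy
    can then be learned on the leading index instead of K raw outcomes.
    """
    Yc = Y - Y.mean(axis=0)                    # center each outcome column
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    loadings = Vt[:r].T                        # outcome weights (K x r)
    return Yc @ loadings, loadings
```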
arXiv Detail & Related papers (2024-04-29T08:16:30Z) - Estimating the Hessian Matrix of Ranking Objectives for Stochastic Learning to Rank with Gradient Boosted Trees [63.18324983384337]
We introduce the first stochastic learning-to-rank method for Gradient Boosted Decision Trees (GBDTs).
Our main contribution is a novel estimator for the second-order derivatives, i.e., the Hessian matrix.
We incorporate our estimator into the existing PL-Rank framework, which was originally designed for first-order derivatives only.
arXiv Detail & Related papers (2024-04-18T13:53:32Z) - Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data [0.0]
We study the statistical learning of optimal dynamic treatment regimes (DTRs) that guide the optimal treatment assignment for each individual at each stage based on the individual's evolving history.
arXiv Detail & Related papers (2024-03-30T02:33:39Z) - Individualized Policy Evaluation and Learning under Clustered Network Interference [4.560284382063488]
We consider the problem of evaluating and learning an optimal individualized treatment rule under clustered network interference.
We propose an estimator that can be used to evaluate the empirical performance of an ITR.
We derive the finite-sample regret bound for a learned ITR, showing that the use of our efficient evaluation estimator leads to the improved performance of learned policies.
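For intuition, here is a minimal Horvitz-Thompson-style sketch of evaluating an ITR when treatment effects spill over within clusters, so matching and weighting happen at the cluster level. The interfaces are hypothetical, and the paper's proposed estimator is more efficient than this plug-in form.

```python
import numpy as np

def cluster_value_estimate(clusters, itr, prop):
    """Cluster-level Horvitz-Thompson estimate of an ITR's value.

    clusters: iterable of (X, A, Y) arrays, one triple per cluster;
    itr(X): treatment vector the rule assigns to a whole cluster;
    prop(A, X): probability of the observed joint assignment A.
    Under interference, a cluster contributes only if its entire observed
    assignment matches the rule's recommendation.
    """
    vals = []
    for X, A, Y in clusters:
        match = float(np.array_equal(A, itr(X)))
        vals.append(match * Y.mean() / prop(A, X))
    return np.mean(vals)
```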
arXiv Detail & Related papers (2023-11-04T17:58:24Z) - Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation [54.72195809248172]
We present a new estimator that leverages a novel concept: retrospectively reshuffling participants across experimental arms at the end of an RCT.
We prove theoretically that such an estimator is more accurate than common estimators based on sample means.
arXiv Detail & Related papers (2023-02-06T05:17:22Z) - Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data [7.308241944759317]
An individualized treatment regime (ITR) is a decision rule that assigns treatments based on patients' characteristics.
We propose a doubly robust estimator of the value function, and the optimal ITR is learned by maximizing the value function within a pre-specified class of ITRs.
We evaluate the empirical performance of the proposed method by simulation studies and a real data application of sodium bicarbonate therapy for patients with severe metabolic acidaemia.
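The doubly robust value estimator mentioned here has a standard augmented inverse-probability-weighting (AIPW) form. Below is a minimal sketch for the uncensored binary-treatment case; the paper additionally handles right censoring, which this sketch omits, and the function name is hypothetical.

```python
import numpy as np

def dr_value(Y, A, d, prop, mu1, mu0):
    """Doubly robust (AIPW) estimate of the value of an ITR d(X).

    Y: outcomes; A: observed treatments in {0, 1}; d: (n,) rule
    recommendations in {0, 1}; prop: fitted P(A=1 | X); mu1, mu0:
    fitted outcome-regression predictions under treatment 1 and 0.
    """
    mu_d = np.where(d == 1, mu1, mu0)           # model prediction under d
    p_d = np.where(d == 1, prop, 1.0 - prop)    # probability of following d
    follow = (A == d).astype(float)
    # consistent if either the propensity model or the outcome model is correct
    return np.mean(mu_d + follow * (Y - mu_d) / p_d)
```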
arXiv Detail & Related papers (2023-01-13T11:47:10Z) - Optimal Treatment Regimes for Proximal Causal Learning [7.672587258250301]
We propose a novel optimal individualized treatment regime based on outcome and treatment confounding bridges.
We show that the value function of this new optimal treatment regime is superior to that of existing ones in the literature.
arXiv Detail & Related papers (2022-12-19T14:29:25Z) - When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee [51.527543027813344]
We propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC).
For each setting, we propose two algorithms and prove their convergence for optimizing the two corresponding formulations.
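For concreteness, a common surrogate for the one-way pAUC restricts pairwise comparisons to the hardest (highest-scoring) fraction of negatives, which corresponds to the low-FPR region. The sketch below uses a squared-hinge surrogate and is a generic illustration, not the paper's algorithms.

```python
import numpy as np

def one_way_pauc_surrogate(scores_pos, scores_neg, beta=0.1):
    """Squared-hinge surrogate for one-way pAUC over FPR in [0, beta].

    scores_pos, scores_neg: model scores for positive and negative examples.
    """
    # keep only the hardest beta-fraction of negatives (highest scores)
    k = max(1, int(beta * len(scores_neg)))
    hard_neg = np.sort(scores_neg)[-k:]
    # surrogate for the pairwise 0-1 loss 1{s_pos <= s_neg}
    margins = scores_pos[:, None] - hard_neg[None, :]
    return np.mean(np.maximum(0.0, 1.0 - margins) ** 2)
```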
arXiv Detail & Related papers (2022-03-01T01:59:53Z) - Jump Interval-Learning for Individualized Decision Making [21.891586204541877]
We propose jump interval-learning to develop an individualized interval-valued decision rule (I2DR).
Unlike IDRs that recommend a single treatment, the proposed I2DR yields an interval of treatment options for each individual.
arXiv Detail & Related papers (2021-11-17T03:29:59Z) - False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves state-of-the-art performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z)