Treatment recommendation with distributional targets
- URL: http://arxiv.org/abs/2005.09717v4
- Date: Tue, 5 Apr 2022 09:52:37 GMT
- Title: Treatment recommendation with distributional targets
- Authors: Anders Bredahl Kock and David Preinerstorfer and Bezirgen Veliyev
- Abstract summary: We study the problem of a decision maker who must provide the best possible treatment recommendation based on an experiment.
The desirability of the outcome distribution resulting from the treatment recommendation is measured through a functional capturing the distributional characteristic the decision maker wants to optimize.
We propose two (near) regret-optimal policies.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of a decision maker who must provide the best possible
treatment recommendation based on an experiment. The desirability of the
outcome distribution resulting from the policy recommendation is measured
through a functional capturing the distributional characteristic that the
decision maker is interested in optimizing. This could be, e.g., its inherent
inequality, welfare, level of poverty or its distance to a desired outcome
distribution. If the functional of interest is not quasi-convex or if there are
constraints, the optimal recommendation may be a mixture of treatments. This
vastly expands the set of recommendations that must be considered. We
characterize the difficulty of the problem by obtaining maximal expected regret
lower bounds. Furthermore, we propose two (near) regret-optimal policies. The
first policy is static and thus applicable irrespective of whether subjects arrive
sequentially or not during the experimentation phase. The second policy exploits
sequential arrival by successively eliminating inferior treatments, thereby
spending the sampling effort where it is most needed.
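To make the mixture idea concrete, the following is a minimal sketch, not the paper's estimator or policy: it scores the outcome distribution induced by a randomized mixture of two treatment arms with an inequality-type functional (here the empirical Gini coefficient) and searches a grid of mixture weights. The function names, the choice of Gini, the resampling step, and the grid are illustrative assumptions.

```python
import numpy as np

def empirical_gini(x):
    # Empirical Gini coefficient of a sample of non-negative outcomes.
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cum = np.cumsum(x)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

def mixture_outcomes(rng, y_a, y_b, p, size=20_000):
    # Approximate the outcome distribution of a randomized rule that assigns
    # treatment A with probability p and treatment B otherwise, by resampling
    # from the two empirical arm distributions observed in the experiment.
    take_a = rng.random(size) < p
    draws_a = rng.choice(y_a, size=size, replace=True)
    draws_b = rng.choice(y_b, size=size, replace=True)
    return np.where(take_a, draws_a, draws_b)

def best_mixture(y_a, y_b, grid=np.linspace(0.0, 1.0, 21), seed=0):
    # Evaluate the distributional functional on every candidate mixture weight
    # and return the weight with the smallest (best) value.
    rng = np.random.default_rng(seed)
    scores = [empirical_gini(mixture_outcomes(rng, y_a, y_b, p)) for p in grid]
    best = int(np.argmin(scores))
    return grid[best], scores[best]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y_a = rng.lognormal(mean=0.0, sigma=1.0, size=500)  # low, unequal outcomes
    y_b = rng.lognormal(mean=0.5, sigma=0.3, size=500)  # higher, more equal outcomes
    p_star, g_star = best_mixture(y_a, y_b)
    print(f"best weight on treatment A: {p_star:.2f}, empirical Gini: {g_star:.3f}")
```

Depending on the functional and the arms, the minimizer can be a pure arm or an interior mixture; when the functional is not quasi-convex or constraints bind, the latter case is what enlarges the set of recommendations the decision maker must consider. The sketch only shows the mechanical search, not the regret-optimal static or elimination policies analyzed in the paper.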
Related papers
- Are causal effect estimations enough for optimal recommendations under multitreatment scenarios? [2.4578723416255754]
It is essential to include a causal effect estimation analysis to compare potential outcomes under different treatments or controls.
We propose a comprehensive methodology for multitreatment selection.
arXiv Detail & Related papers (2024-10-07T16:37:35Z) - Experiment Planning with Function Approximation [49.50254688629728]
We study the problem of experiment planning with function approximation in contextual bandit problems.
We propose two experiment planning strategies compatible with function approximation.
We show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small.
arXiv Detail & Related papers (2024-01-10T14:40:23Z) - Policy Learning with Distributional Welfare [1.0742675209112622]
Most literature on treatment choice has considered utilitarian welfare based on the conditional average treatment effect (ATE).
This paper proposes an optimal policy that allocates the treatment based on the conditional quantile of individual treatment effects (QoTE).
arXiv Detail & Related papers (2023-11-27T14:51:30Z) - Optimal and Fair Encouragement Policy Evaluation and Learning [11.712023983596914]
We study causal identification, statistical variance-reduced estimation, and robust estimation of optimal treatment rules.
We develop a two-stage algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds.
arXiv Detail & Related papers (2023-09-12T20:45:30Z) - Provable Offline Preference-Based Reinforcement Learning [95.00042541409901]
We investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback.
We consider the general reward setting where the reward can be defined over the whole trajectory.
We introduce a new single-policy concentrability coefficient, which can be upper bounded by the per-trajectory concentrability.
arXiv Detail & Related papers (2023-05-24T07:11:26Z) - Optimal Treatment Regimes for Proximal Causal Learning [7.672587258250301]
We propose a novel optimal individualized treatment regime based on outcome and treatment confounding bridges.
We show that the value function of this new optimal treatment regime is superior to that of existing ones in the literature.
arXiv Detail & Related papers (2022-12-19T14:29:25Z) - CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies [62.39667564455059]
We study the distribution of optimal policies.
In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems.
We further show that the different policies we sample present different risk profiles, corresponding to interesting practical applications in interpretability.
arXiv Detail & Related papers (2022-05-19T09:48:56Z) - Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z) - Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used.
Second, to explain these findings we introduce the concept of committal rate for policy optimization.
Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z) - Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints [0.0]
We study estimation of an optimal dynamic treatment regime that guides the optimal treatment assignment for each individual at each stage based on their history.
The paper proposes two estimation methods: one solves the treatment assignment problem sequentially through backward induction, and the other solves the entire problem simultaneously across all stages.
arXiv Detail & Related papers (2021-06-09T12:42:53Z) - Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.