Treatment recommendation with distributional targets
- URL: http://arxiv.org/abs/2005.09717v4
- Date: Tue, 5 Apr 2022 09:52:37 GMT
- Title: Treatment recommendation with distributional targets
- Authors: Anders Bredahl Kock and David Preinerstorfer and Bezirgen Veliyev
- Abstract summary: We study the problem of a decision maker who must provide the best possible treatment recommendation based on an experiment.
The desirability of the outcome distribution resulting from the treatment recommendation is measured through a functional capturing the distributional characteristic the decision maker wants to optimize.
We propose two (near) regret-optimal policies.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of a decision maker who must provide the best possible
treatment recommendation based on an experiment. The desirability of the
outcome distribution resulting from the policy recommendation is measured
through a functional capturing the distributional characteristic that the
decision maker is interested in optimizing. This could be, e.g., its inherent
inequality, welfare, level of poverty or its distance to a desired outcome
distribution. If the functional of interest is not quasi-convex or if there are
constraints, the optimal recommendation may be a mixture of treatments. This
vastly expands the set of recommendations that must be considered. We
characterize the difficulty of the problem by obtaining maximal expected regret
lower bounds. Furthermore, we propose two (near) regret-optimal policies. The
first policy is static and thus applicable irrespective of whether subjects arrive
sequentially or not during the experimentation phase. The second policy exploits
sequential arrival by successively eliminating inferior treatments, thereby
spending the sampling effort where it is most needed.
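To make the mixture idea concrete, the following is a minimal sketch, not the paper's estimator or policy: it scores the outcome distribution induced by a randomized mixture of two treatment arms with an inequality-type functional (here the empirical Gini coefficient) and searches a grid of mixture weights. The function names, the choice of Gini, the resampling step, and the grid are illustrative assumptions.

```python
import numpy as np

def empirical_gini(x):
    # Empirical Gini coefficient of a sample of non-negative outcomes.
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cum = np.cumsum(x)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

def mixture_outcomes(rng, y_a, y_b, p, size=20_000):
    # Approximate the outcome distribution of a randomized rule that assigns
    # treatment A with probability p and treatment B otherwise, by resampling
    # from the two empirical arm distributions observed in the experiment.
    take_a = rng.random(size) < p
    draws_a = rng.choice(y_a, size=size, replace=True)
    draws_b = rng.choice(y_b, size=size, replace=True)
    return np.where(take_a, draws_a, draws_b)

def best_mixture(y_a, y_b, grid=np.linspace(0.0, 1.0, 21), seed=0):
    # Evaluate the distributional functional on every candidate mixture weight
    # and return the weight with the smallest (best) value.
    rng = np.random.default_rng(seed)
    scores = [empirical_gini(mixture_outcomes(rng, y_a, y_b, p)) for p in grid]
    best = int(np.argmin(scores))
    return grid[best], scores[best]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y_a = rng.lognormal(mean=0.0, sigma=1.0, size=500)  # low, unequal outcomes
    y_b = rng.lognormal(mean=0.5, sigma=0.3, size=500)  # higher, more equal outcomes
    p_star, g_star = best_mixture(y_a, y_b)
    print(f"best weight on treatment A: {p_star:.2f}, empirical Gini: {g_star:.3f}")
```

Depending on the functional and the arms, the minimizer can be a pure arm or an interior mixture; when the functional is not quasi-convex or constraints bind, the latter case is what enlarges the set of recommendations the decision maker must consider. The sketch only shows the mechanical search, not the regret-optimal static or elimination policies analyzed in the paper.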
Related papers
- Are causal effect estimations enough for optimal recommendations under multitreatment scenarios? [2.4578723416255754]
It is essential to include a causal effect estimation analysis to compare potential outcomes under different treatments or controls.
We propose a comprehensive methodology for multitreatment selection.
arXiv Detail & Related papers (2024-10-07T16:37:35Z) - Experiment Planning with Function Approximation [49.50254688629728]
We study the problem of experiment planning with function approximation in contextual bandit problems.
We propose two experiment planning strategies compatible with function approximation.
We show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small.
arXiv Detail & Related papers (2024-01-10T14:40:23Z) - Policy Learning with Distributional Welfare [1.0742675209112622]
Most literature on treatment choice has considered utilitarian welfare based on the conditional average treatment effect (ATE).
This paper proposes an optimal policy that allocates the treatment based on the conditional quantile of individual treatment effects (QoTE).
arXiv Detail & Related papers (2023-11-27T14:51:30Z) - Optimal and Fair Encouragement Policy Evaluation and Learning [11.712023983596914]
We study causal identification, statistical variance-reduced estimation, and robust estimation of optimal treatment rules.
We develop a two-stage algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds.
arXiv Detail & Related papers (2023-09-12T20:45:30Z) - Provable Offline Preference-Based Reinforcement Learning [95.00042541409901]
We investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback.
We consider the general reward setting where the reward can be defined over the whole trajectory.
We introduce a new single-policy concentrability coefficient, which can be upper bounded by the per-trajectory concentrability.
arXiv Detail & Related papers (2023-05-24T07:11:26Z) - Optimal Treatment Regimes for Proximal Causal Learning [7.672587258250301]
We propose a novel optimal individualized treatment regime based on outcome and treatment confounding bridges.
We show that the value function of this new optimal treatment regime is superior to that of existing ones in the literature.
arXiv Detail & Related papers (2022-12-19T14:29:25Z) - CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies [62.39667564455059]
We study the distribution of optimal policies.
In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems.
We further show that the different policies we sample present different risk profiles, corresponding to interesting practical applications in interpretability.
arXiv Detail & Related papers (2022-05-19T09:48:56Z) - Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z) - Understanding the Effect of Stochasticity in Policy Optimization [86.7574122154668]
We show that the preferability of optimization methods depends critically on whether exact gradients are used.
Second, to explain these findings we introduce the concept of committal rate for policy optimization.
Third, we show that in the absence of external oracle information, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely.
arXiv Detail & Related papers (2021-10-29T06:35:44Z) - Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints [0.0]
We study estimation of an optimal dynamic treatment regime that guides the optimal treatment assignment for each individual at each stage based on their history.
The paper proposes two estimation methods: one solves the treatment assignment problem sequentially through backward induction, and the other solves the entire problem simultaneously across all stages.
arXiv Detail & Related papers (2021-06-09T12:42:53Z) - Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy [95.98698822755227]
We make the first attempt to study risk-sensitive deep reinforcement learning under the average reward setting with the variance risk criteria.
We propose an actor-critic algorithm that iteratively and efficiently updates the policy, the Lagrange multiplier, and the Fenchel dual variable.
arXiv Detail & Related papers (2020-12-28T05:02:26Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.