Assessing the Impact of Context Inference Error and Partial
Observability on RL Methods for Just-In-Time Adaptive Interventions
- URL: http://arxiv.org/abs/2305.09913v1
- Date: Wed, 17 May 2023 02:46:37 GMT
- Title: Assessing the Impact of Context Inference Error and Partial
Observability on RL Methods for Just-In-Time Adaptive Interventions
- Authors: Karine Karine, Predrag Klasnja, Susan A. Murphy, Benjamin M. Marlin
- Abstract summary: Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community.
JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components.
We study the effect of context inference error and partial observability on the ability to learn effective policies.
- Score: 12.762365585427377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized
health interventions developed within the behavioral science community. JITAIs
aim to provide the right type and amount of support by iteratively selecting a
sequence of intervention options from a pre-defined set of components in
response to each individual's time-varying state. In this work, we explore the
application of reinforcement learning methods to the problem of learning
intervention option selection policies. We study the effect of context
inference error and partial observability on the ability to learn effective
policies. Our results show that the propagation of uncertainty from context
inferences is critical to improving intervention efficacy as context
uncertainty increases, while policy gradient algorithms can provide remarkable
robustness to partially observed behavioral state information.
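To make this setup concrete, below is a minimal, hypothetical sketch, not the authors' simulator or algorithm, of a policy-gradient agent for intervention selection: a noisy context classifier yields a posterior over a latent behavioral context, and that full posterior (rather than its most likely value) is fed to a linear-softmax REINFORCE policy, so context-inference uncertainty is propagated into action selection. The environment, reward values, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical JITAI-style toy problem (illustration only, not the paper's simulator).
# Latent behavioral context: 0 = "not receptive", 1 = "receptive" to an intervention.
# Actions: 0 = do nothing, 1 = send an intervention prompt.
N_CONTEXTS, N_ACTIONS = 2, 2
CONTEXT_NOISE = 0.3                   # probability the context classifier is wrong
REWARDS = np.array([[0.0, -0.5],      # true context 0: prompting causes burden
                    [0.0,  1.0]])     # true context 1: prompting helps

def observe(true_context):
    """Return a posterior over contexts from a noisy classifier.

    Passing the full posterior (instead of its argmax) to the policy is one
    simple way to propagate context-inference uncertainty into the agent.
    """
    obs = true_context if rng.random() > CONTEXT_NOISE else 1 - true_context
    posterior = np.full(N_CONTEXTS, CONTEXT_NOISE)
    posterior[obs] = 1.0 - CONTEXT_NOISE
    return posterior

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Linear-softmax policy over the posterior features, trained with REINFORCE.
theta = np.zeros((N_ACTIONS, N_CONTEXTS))
lr = 0.5

for episode in range(2000):
    true_context = rng.integers(N_CONTEXTS)
    posterior = observe(true_context)
    probs = softmax(theta @ posterior)
    action = rng.choice(N_ACTIONS, p=probs)
    reward = REWARDS[true_context, action]

    # REINFORCE gradient for a one-step episode: (onehot(a) - pi) * posterior^T.
    grad_log_pi = (np.eye(N_ACTIONS)[action] - probs)[:, None] * posterior[None, :]
    theta += lr * reward * grad_log_pi

print("Action probabilities when the classifier reports 'receptive':",
      softmax(theta @ np.array([CONTEXT_NOISE, 1 - CONTEXT_NOISE])).round(3))
```

Replacing `posterior` with a one-hot vector at its argmax would discard the inference uncertainty, which is the kind of contrast the abstract's study of context inference error examines.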
Related papers
- Reconciling Heterogeneous Effects in Causal Inference [44.99833362998488]
We apply the Reconcile algorithm for model multiplicity in machine learning to reconcile heterogeneous effects in causal inference.
Our results have tangible implications for ensuring fair outcomes in high-stakes domains such as healthcare, insurance, and housing.
arXiv Detail & Related papers (2024-06-05T18:43:46Z)
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
- Prescriptive Process Monitoring Under Resource Constraints: A Reinforcement Learning Approach [0.3807314298073301]
Reinforcement learning has been put forward as an approach to learning intervention policies through trial and error.
Existing approaches in this space assume that the number of resources available to perform interventions in a process is unlimited.
This paper argues that, in the presence of resource constraints, a key dilemma in the field of prescriptive process monitoring is to trigger interventions based not only on predictions of their necessity, timeliness, or effect but also on the uncertainty of these predictions and the level of resource utilization.
arXiv Detail & Related papers (2023-07-13T05:31:40Z)
- A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and an ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- Variance-Aware Off-Policy Evaluation with Linear Function Approximation [85.75516599931632]
We study the off-policy evaluation problem in reinforcement learning with linear function approximation.
We propose an algorithm, VA-OPE, which uses the estimated variance of the value function to reweight the Bellman residual in Fitted Q-Iteration.
arXiv Detail & Related papers (2021-06-22T17:58:46Z)
- Stochastic Intervention for Causal Inference via Reinforcement Learning [7.015556609676951]
Central to causal inference is estimating the treatment effect of intervention strategies.
Existing methods are mostly restricted to deterministic treatments and compare outcomes under different treatments.
We propose a new, effective framework to estimate the treatment effect of stochastic interventions.
arXiv Detail & Related papers (2021-05-28T00:11:22Z)
- Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients [54.98496284653234]
We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions.
We solve this problem by introducing a regularizer based on the mutual information between the sensitive state and the actions.
We develop a model-based estimator for optimization of privacy-constrained policies.
arXiv Detail & Related papers (2020-12-30T03:22:35Z)
- Learning "What-if" Explanations for Sequential Decision-Making [92.8311073739295]
Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior is essential.
We propose learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to "what if" outcomes.
We highlight the effectiveness of our batch, counterfactual inverse reinforcement learning approach in recovering accurate and interpretable descriptions of behavior.
arXiv Detail & Related papers (2020-07-02T14:24:17Z)
- Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning [40.94323379769606]
We introduce the notion of action persistence, which consists of repeating an action for a fixed number of decision steps.
We present a novel algorithm, Persistent Fitted Q-Iteration (PFQI), that extends FQI with the goal of learning the optimal value function at a given persistence (see the illustrative sketch after this list).
arXiv Detail & Related papers (2020-02-17T08:38:51Z)
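The action-persistence entry above lends itself to a compact illustration. The following is a hypothetical, tabular sketch of the persistence idea rather than the paper's PFQI algorithm: the batch is built from K-persistent transitions (the chosen action is repeated for K primitive steps), and Fitted Q-Iteration is then run with discount gamma^K, so the learned value function corresponds to decisions taken every K steps. The environment, constants, and tabular regressor are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy chain MDP (hypothetical; only to illustrate action persistence).
N_STATES, N_ACTIONS, GAMMA, K = 5, 2, 0.95, 3   # K = action persistence

def step(s, a):
    """One primitive step: action 1 tends to move right, action 0 left."""
    drift = 1 if a == 1 else -1
    s_next = int(np.clip(s + drift + rng.integers(-1, 2), 0, N_STATES - 1))
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward

# Collect a batch of K-persistent transitions: the chosen action is repeated
# for K primitive steps and rewards are accumulated with the primitive discount.
batch = []
for _ in range(3000):
    s, a = int(rng.integers(N_STATES)), int(rng.integers(N_ACTIONS))
    s_k, ret = s, 0.0
    for i in range(K):
        s_k, r = step(s_k, a)
        ret += (GAMMA ** i) * r
    batch.append((s, a, ret, s_k))

# Fitted Q-Iteration at persistence K: with a tabular function class, the
# least-squares fit is the mean target per (state, action); bootstrapping uses
# gamma^K because the next decision occurs K primitive steps later.
Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(50):
    targets = {}
    for s, a, ret, s_k in batch:
        targets.setdefault((s, a), []).append(ret + (GAMMA ** K) * Q[s_k].max())
    for (s, a), ys in targets.items():
        Q[s, a] = np.mean(ys)

print("Greedy persistent policy per state:", Q.argmax(axis=1))
```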