Enhancing Adaptive Behavioral Interventions with LLM Inference from Participant-Described States
- URL: http://arxiv.org/abs/2507.03871v1
- Date: Sat, 05 Jul 2025 02:52:51 GMT
- Title: Enhancing Adaptive Behavioral Interventions with LLM Inference from Participant-Described States
- Authors: Karine Karine, Benjamin M. Marlin
- Abstract summary: We develop a novel physical activity intervention simulation environment that generates text-based state descriptions conditioned on latent state variables. We show that this approach has the potential to significantly improve the performance of online policy learning methods.
- Score: 9.395236804312496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of reinforcement learning (RL) methods to support health behavior change via personalized and just-in-time adaptive interventions is of significant interest to health and behavioral science researchers focused on problems such as smoking cessation support and physical activity promotion. However, RL methods are often applied to these domains using a small collection of context variables to mitigate the significant data scarcity issues that arise from practical limitations on the design of adaptive intervention trials. In this paper, we explore an approach to significantly expanding the state space of an adaptive intervention without impacting data efficiency. The proposed approach enables intervention participants to provide natural language descriptions of aspects of their current state. It then leverages inference with pre-trained large language models (LLMs) to better align the policy of a base RL method with these state descriptions. To evaluate our method, we develop a novel physical activity intervention simulation environment that generates text-based state descriptions conditioned on latent state variables using an auxiliary LLM. We show that this approach has the potential to significantly improve the performance of online policy learning methods.
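As a rough illustration of this pattern (not the authors' implementation), the sketch below reweights a toy base policy's action probabilities by an LLM-derived appropriateness score for the participant's free-text description. `BasePolicy` and `llm_score` are hypothetical placeholders.

```python
import numpy as np

class BasePolicy:
    """Toy softmax policy over a small set of intervention options,
    driven by a numeric context vector."""
    def __init__(self, n_context, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_context, n_actions))

    def action_probs(self, context):
        logits = context @ self.W
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

def llm_score(text_description, action):
    """Placeholder for a pre-trained LLM call that rates how appropriate
    `action` is given the participant's free-text state description (0..1)."""
    return 0.5  # stub: a real system would prompt an LLM here

def select_action(policy, context, text_description, rng=None):
    rng = rng or np.random.default_rng()
    probs = policy.action_probs(context)
    scores = np.array([llm_score(text_description, a) for a in range(len(probs))])
    adjusted = probs * (scores + 1e-8)   # reweight base policy by LLM inference
    adjusted /= adjusted.sum()
    return rng.choice(len(probs), p=adjusted)

policy = BasePolicy(n_context=4, n_actions=3)
a = select_action(policy, np.ones(4), "Stressful day, but my evening is free.")
```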
Related papers
- Policy Learning with a Natural Language Action Space: A Causal Approach [24.096991077437146]
This paper introduces a novel causal framework for multi-stage decision-making in natural language action spaces. Our approach employs Q-learning to estimate Dynamic Treatment Regimes (DTR) through a single model. A key technical contribution of our approach is a decoding strategy that translates optimized embeddings back into coherent natural language.
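A hedged sketch of that loop, assuming a learned Q-network over (state, action-embedding) pairs and substituting a nearest-neighbour lookup for the paper's trained decoder:

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Q(s, e): value of taking the action whose embedding is e in state s."""
    def __init__(self, s_dim, e_dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(s_dim + e_dim, hidden),
                               nn.ReLU(), nn.Linear(hidden, 1))
    def forward(self, s, e):
        return self.f(torch.cat([s, e], dim=-1))

def optimize_embedding(q, s, e_init, steps=50, lr=0.1):
    """Gradient-ascend the action embedding under a frozen Q."""
    e = e_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([e], lr=lr)
    for _ in range(steps):
        loss = -q(s, e).sum()        # ascend Q by descending -Q
        opt.zero_grad(); loss.backward(); opt.step()
    return e.detach()

def decode(e, candidate_embs, candidate_texts):
    """Stand-in decoder: nearest candidate utterance to the optimized
    embedding (the paper trains a proper decoding model instead)."""
    idx = int(torch.cdist(e[None], candidate_embs).argmin())
    return candidate_texts[idx]
```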
arXiv Detail & Related papers (2025-02-24T17:26:07Z)
- Task-driven Layerwise Additive Activation Intervention [12.152228552335798]
Modern language models (LMs) have significantly advanced generative modeling in natural language processing (NLP). This paper proposes a layer-wise additive activation intervention framework that optimizes the intervention process. We benchmark our framework on various datasets, demonstrating accuracy improvements over pre-trained LMs and competing intervention baselines.
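A minimal PyTorch sketch of the general additive-intervention mechanism (a learned vector added to one layer's activations via a forward hook); the module path in the usage comment is hypothetical:

```python
import torch

def add_intervention(module, delta):
    """Forward hook that shifts `module`'s output activations by `delta`."""
    def hook(mod, inputs, output):
        # Returning a value from a forward hook replaces the module output.
        return output + delta        # broadcasts over batch/sequence dims
    return module.register_forward_hook(hook)

# Hypothetical usage with any torch LM sub-module that returns a tensor:
#   delta  = torch.nn.Parameter(torch.zeros(hidden_dim))
#   handle = add_intervention(model.layers[6].mlp, delta)
#   ... optimize delta on a task loss with the base LM frozen ...
#   handle.remove()                  # detach the intervention
```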
arXiv Detail & Related papers (2025-02-10T02:49:46Z)
- StepCountJITAI: simulation environment for RL with application to physical activity adaptive intervention [9.395236804312496]
We introduce StepCountJITAI, an RL environment designed to foster research on RL methods for adaptive behavioral interventions.
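The snippet below is a toy, gym-style stand-in that illustrates the habituation dynamics such environments model; the class name, observation fields, and reward shape are assumptions for illustration, not StepCountJITAI's actual interface:

```python
import numpy as np

class ToyJITAIEnv:
    """Stand-in adaptive-intervention environment: sending too many messages
    raises habituation, which dampens the effect of future messages."""
    def reset(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.habituation = 0.0
        return np.array([self.rng.uniform(), self.habituation])

    def step(self, action):
        context = self.rng.uniform()             # latent receptivity proxy
        if action == 1:                          # action 1: send a message
            self.habituation = min(1.0, self.habituation + 0.1)
        reward = 1000 + 4000 * context * action * (1 - self.habituation)
        obs = np.array([context, self.habituation])
        return obs, reward, False, {}

env = ToyJITAIEnv()
obs = env.reset()
for _ in range(5):
    obs, reward, done, info = env.step(action=1)  # naive always-send policy
```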
arXiv Detail & Related papers (2024-11-01T03:31:39Z)
- Estimating Causal Effects of Text Interventions Leveraging LLMs [7.2937547395453315]
CausalDANN is a novel approach to estimating causal effects using text transformations facilitated by large language models (LLMs). Unlike existing methods, our approach accommodates arbitrary textual interventions and leverages text-level classifiers with domain adaptation ability to produce robust effect estimates against domain shifts. This flexibility in handling various text interventions is a key advancement in causal estimation for textual data, offering opportunities to better understand human behaviors and develop effective interventions within social systems.
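Schematically, the estimation pattern reads like the sketch below, where `rewrite` (the LLM intervention) and `predict_outcome` (the domain-adapted outcome model) are placeholder stubs:

```python
def rewrite(text):
    """Placeholder LLM call that applies the textual intervention,
    e.g. 'make the tone more polite'."""
    return text  # stub

def predict_outcome(text):
    """Placeholder outcome model; the paper trains this with domain
    adaptation so it stays calibrated on LLM-transformed text."""
    return 0.0  # stub

def estimate_effect(texts):
    """Average outcome shift between intervened and original documents."""
    diffs = [predict_outcome(rewrite(t)) - predict_outcome(t) for t in texts]
    return sum(diffs) / len(diffs)
```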
arXiv Detail & Related papers (2024-10-28T19:19:35Z)
- Assessing the Impact of Context Inference Error and Partial Observability on RL Methods for Just-In-Time Adaptive Interventions [12.762365585427377]
Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community.
JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components.
We study the effect of context inference error and partial observability on the ability to learn effective policies.
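A toy version of the kind of corruption studied here: with probability eps, the policy observes a uniformly random wrong context label (names and sweep values are illustrative):

```python
import numpy as np

def observe_context(true_context, eps, n_contexts, rng):
    """With probability eps, the inferred context is a uniformly random
    wrong label; otherwise it matches the true latent context."""
    if rng.uniform() < eps:
        wrong = [c for c in range(n_contexts) if c != true_context]
        return int(rng.choice(wrong))
    return true_context

# Sweep the error rate; a full experiment would compare the returns of
# policies learned at each corruption level.
rng = np.random.default_rng(0)
for eps in (0.0, 0.1, 0.3):
    noisy = [observe_context(1, eps, n_contexts=3, rng=rng) for _ in range(1000)]
```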
arXiv Detail & Related papers (2023-05-17T02:46:37Z)
- A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interactions with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and an ablation study on the D4RL benchmark validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z)
- Scalable Bayesian Inverse Reinforcement Learning [93.27920030279586]
We introduce Approximate Variational Reward Imitation Learning (AVRIL).
Our method addresses the ill-posed nature of the inverse reinforcement learning problem.
Applying our method to real medical data alongside classic control simulations, we demonstrate Bayesian reward inference in environments beyond the scope of current methods.
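A very loose sketch in the spirit of variational reward inference, assuming a diagonal-Gaussian reward posterior and a Boltzmann demonstration likelihood; the coupling between Q and sampled rewards that the actual method uses is omitted:

```python
import torch
import torch.nn as nn

class RewardPosterior(nn.Module):
    """Diagonal-Gaussian posterior over the reward of a state-action pair."""
    def __init__(self, sa_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(sa_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, 2))
    def forward(self, sa):
        mu, log_sigma = self.net(sa).chunk(2, dim=-1)
        return mu, log_sigma.exp()

def elbo(posterior, q_values, sa, actions, beta=1.0):
    """Demonstrated actions are explained by a Boltzmann policy on Q, while
    the reward posterior is pulled toward a unit-Gaussian prior by a KL term."""
    mu, sigma = posterior(sa)
    log_pi = torch.log_softmax(beta * q_values, dim=-1)
    ll = log_pi.gather(-1, actions.unsqueeze(-1)).mean()
    kl = 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * sigma.log()).mean()
    return ll - kl
```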
arXiv Detail & Related papers (2021-02-12T12:32:02Z)
- Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients [54.98496284653234]
We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions.
We solve this problem by introducing a regularizer based on the mutual information between the sensitive state and the actions.
We develop a model-based estimator for optimization of privacy-constrained policies.
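The paper develops a model-based estimator; as a simpler illustrative surrogate, the sketch below charges a REINFORCE-style policy loss with an adversary's log-likelihood of the sensitive variable given the action:

```python
import torch

def private_pg_loss(log_pi_a, returns, adv_logp_u_given_a, lam=1.0):
    """log_pi_a:           log pi(a|s) of the taken actions
    returns:            per-step returns (no baseline, for brevity)
    adv_logp_u_given_a: adversary's log p(u|a) at the true sensitive value
    lam:                privacy/performance trade-off weight"""
    pg = -(log_pi_a * returns.detach()).mean()       # maximize reward
    # Score-function estimate of d/dtheta E[log p(u|a)]: penalize action
    # choices that make the sensitive variable easy to predict.
    mi_pen = (log_pi_a * adv_logp_u_given_a.detach()).mean()
    return pg + lam * mi_pen
```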
arXiv Detail & Related papers (2020-12-30T03:22:35Z)
- Strictly Batch Imitation Learning by Energy-based Distribution Matching [104.33286163090179]
Consider learning a policy purely on the basis of demonstrated behavior -- that is, with no access to reinforcement signals, no knowledge of transition dynamics, and no further interaction with the environment.
One solution is simply to retrofit existing algorithms for apprenticeship learning to work in the offline setting.
But such an approach leans heavily on off-policy evaluation or offline model estimation, and can be indirect and inefficient.
We argue that a good solution should be able to explicitly parameterize a policy, implicitly learn from rollout dynamics, and operate in an entirely offline fashion.
arXiv Detail & Related papers (2020-06-25T03:27:59Z)
- Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
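One common way to realize this, sketched below under the assumption that the critic scores every discrete action, is an "all-action" policy gradient with the state value as baseline:

```python
import torch

def all_action_pg_loss(logits, q_values):
    """logits:   (B, A) policy logits
    q_values: (B, A) critic estimates for every discrete action.
    Gradient equals -E_s[ sum_a pi(a|s) (Q(s,a) - V(s)) d log pi(a|s) ],
    an exact expectation over actions rather than a single sampled action."""
    log_pi = torch.log_softmax(logits, dim=-1)
    pi = log_pi.exp()
    v = (pi * q_values).sum(dim=-1, keepdim=True)    # state-value baseline
    adv = (q_values - v).detach()
    return -(pi.detach() * log_pi * adv).sum(dim=-1).mean()
```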
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
- Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, to enable human experts to analyze the validity of policy evaluation estimates.
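A toy rendition of the idea, using plain weighted importance sampling and leave-one-out influence (the paper's estimator and influence measure are more involved):

```python
import numpy as np

def wis_estimate(weights, returns):
    """Weighted importance sampling value estimate from logged trajectories."""
    return float((weights * returns).sum() / weights.sum())

def influences(weights, returns):
    """Leave-one-out shift of the estimate for each trajectory."""
    weights = np.asarray(weights, dtype=float)
    returns = np.asarray(returns, dtype=float)
    full = wis_estimate(weights, returns)
    idx = np.arange(len(weights))
    return np.array([full - wis_estimate(weights[idx != i], returns[idx != i])
                     for i in idx])

w, g = [0.2, 1.5, 0.9, 3.0], [1.0, 0.0, 1.0, 1.0]
most_influential = np.argsort(-np.abs(influences(w, g)))[:2]  # flag for review
```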
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.