Related papers: Causal Reinforcement Learning based Agent-Patient Interaction with Clinical Domain Knowledge

URL: http://arxiv.org/abs/2512.00048v1
Date: Mon, 17 Nov 2025 22:38:03 GMT
Title: Causal Reinforcement Learning based Agent-Patient Interaction with Clinical Domain Knowledge
Authors: Wenzheng Zhao, Ran Zhang, Ruth Palan Lopez, Shu-Fen Wung, Fengpei Yuan
Abstract summary: Reinforcement Learning (RL) faces significant challenges in adaptive healthcare interventions, such as dementia care, where data is scarce. We present a novel framework called Causal structure-aware Reinforcement Learning (CRL) that explicitly integrates causal discovery and reasoning into policy optimization.
Score: 3.2434118923825483
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement Learning (RL) faces significant challenges in adaptive healthcare interventions, such as dementia care, where data is scarce, decisions require interpretability, and underlying patient-state dynamics are complex and causal in nature. In this work, we present a novel framework called Causal structure-aware Reinforcement Learning (CRL) that explicitly integrates causal discovery and reasoning into policy optimization. This method enables an agent to learn and exploit a directed acyclic graph (DAG) that describes the causal dependencies between human behavioral states and robot actions, facilitating more efficient, interpretable, and robust decision-making. We validate our approach in a simulated robot-assisted cognitive care scenario, where the agent interacts with a virtual patient exhibiting dynamic emotional, cognitive, and engagement states. The experimental results show that CRL agents outperform conventional model-free RL baselines by achieving higher cumulative rewards, maintaining desirable patient states more consistently, and exhibiting interpretable, clinically aligned behavior. We further demonstrate that CRL's performance advantage remains robust across different weighting strategies and hyperparameter settings. In addition, we demonstrate a lightweight LLM-based deployment: a fixed policy is embedded into a system prompt that maps inferred states to actions, producing consistent, supportive dialogue without LLM finetuning. Our work illustrates the promise of causal reinforcement learning for human-robot interaction applications, where interpretability, adaptiveness, and data efficiency are paramount.
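Illustrative sketch (not from the paper): the abstract describes embedding a fixed policy into a system prompt that maps inferred states to actions. The Python below shows one way such a prompt could be assembled. The state labels, action wording, and the names POLICY, build_system_prompt, and lookup_action are all hypothetical, chosen only for illustration; the paper's actual states, actions, and prompt text are not given here.

```python
from typing import Dict

# Hypothetical fixed policy: inferred patient state -> agent action.
# These labels are assumptions for illustration, not taken from the paper.
POLICY: Dict[str, str] = {
    "disengaged": "offer a simpler activity",
    "anxious": "speak reassuringly and slow down",
    "engaged": "continue the current activity",
}

DEFAULT_ACTION = "respond with gentle encouragement"


def build_system_prompt(policy: Dict[str, str]) -> str:
    """Render a fixed state-to-action table as system-prompt text."""
    # Sort states so the generated prompt is deterministic.
    rules = "\n".join(
        f"- If the patient seems {state}, then {action}."
        for state, action in sorted(policy.items())
    )
    return (
        "You are a supportive care assistant.\n"
        "First infer the patient's state from their last message, "
        "then follow the matching rule:\n"
        f"{rules}\n"
        f"If no rule matches, {DEFAULT_ACTION}."
    )


def lookup_action(policy: Dict[str, str], state: str) -> str:
    """Return the fixed action for a state, or the default fallback."""
    return policy.get(state, DEFAULT_ACTION)


if __name__ == "__main__":
    print(build_system_prompt(POLICY))
```

Because the policy is a plain lookup table, the resulting prompt can be read and audited directly, which is in the spirit of the interpretability goal the abstract emphasizes.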

Related papers

Beyond Prediction: Reinforcement Learning as the Defining Leap in Healthcare AI [38.11241251343041]
Reinforcement learning (RL) marks a fundamental shift in how artificial intelligence is applied in healthcare. Instead of merely predicting outcomes, RL actively decides interventions with long-term goals. This paper explores RL's rise in healthcare not merely as a set of tools but as a shift toward agentive intelligence in clinical environments.
arXiv Detail & Related papers (2025-08-28T07:05:24Z)
Dynamic Programming Techniques for Enhancing Cognitive Representation in Knowledge Tracing [125.75923987618977]
We propose the Cognitive Representation Dynamic Programming based Knowledge Tracing (CRDP-KT) model. It is a dynamic programming algorithm to optimize cognitive representations based on the difficulty of the questions and the performance intervals between them. It provides more accurate and systematic input features for subsequent model training, thereby minimizing distortion in the simulation of cognitive states.
arXiv Detail & Related papers (2025-06-03T14:44:48Z)
Towards Empowerment Gain through Causal Structure Learning in Model-Based RL [35.933469787075]
We propose a novel framework, Empowerment through Causal Learning (ECL), to improve learning efficiency and controllability. ECL operates by first training a causal dynamics model of the environment based on collected data. We then maximize empowerment under the causal structure for exploration, simultaneously using data gathered through exploration to update the causal dynamics model to be more controllable.
arXiv Detail & Related papers (2025-02-14T10:59:09Z)
Integrating Reinforcement Learning and AI Agents for Adaptive Robotic Interaction and Assistance in Dementia Care [5.749791442522375]
This study explores a novel approach to advancing dementia care by integrating socially assistive robotics, reinforcement learning (RL), large language models (LLMs), and clinical domain expertise within a simulated environment.
arXiv Detail & Related papers (2025-01-28T06:38:24Z)
CauSkelNet: Causal Representation Learning for Human Behaviour Analysis [7.139285159330364]
This study introduces a novel representation learning framework based on causal inference to address these challenges. Our two-stage approach combines the Peter-Clark (PC) algorithm and Kullback-Leibler (KL) divergence to identify and quantify causal relationships between human joints. By capturing joint interactions, the proposed causal Graph Convolutional Network (GCN) produces interpretable and robust representations.
arXiv Detail & Related papers (2024-09-23T21:38:49Z)
Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task-solving. We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC. This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation [50.01551945190676]
Social robot navigation can be helpful in various contexts of daily life but requires safe human-robot interactions and efficient trajectory planning. We propose a systematic relational reasoning approach with explicit inference of the underlying dynamically evolving relational structures. We demonstrate its effectiveness for multi-agent trajectory prediction and social robot navigation.
arXiv Detail & Related papers (2024-01-22T18:58:22Z)
Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph. We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs. The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
arXiv Detail & Related papers (2023-09-30T20:59:42Z)
Causal Reinforcement Learning using Observational and Interventional Data [14.856472820492364]
Efficiently learning a causal model of the environment is a key challenge for model-based RL agents operating in POMDPs. We consider a scenario where the learning agent has the ability to collect online experiences through direct interactions with the environment. We then ask the following question: can the online and offline experiences be safely combined for learning a causal model?
arXiv Detail & Related papers (2021-06-28T06:58:20Z)
Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation [15.451690870640295]
In some scenarios such as healthcare, usually only a few records are available for each patient, impeding the application of current reinforcement learning algorithms. We propose a data-efficient RL algorithm that exploits structural causal models (SCMs) to model the state dynamics. We show that counterfactual outcomes are identifiable under mild conditions and that Q-learning on the counterfactual-based augmented data set converges to the optimal value function.
arXiv Detail & Related papers (2020-12-16T17:21:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.