Related papers: Unifying Causal Reinforcement Learning: Survey, Taxonomy, Algorithms and Applications

Unifying Causal Reinforcement Learning: Survey, Taxonomy, Algorithms and Applications

URL: http://arxiv.org/abs/2512.18135v1
Date: Fri, 19 Dec 2025 23:37:22 GMT
Title: Unifying Causal Reinforcement Learning: Survey, Taxonomy, Algorithms and Applications
Authors: Cristiano da Costa Cunha, Wei Liu, Tim French, Ajmal Mian,
Abstract summary: Causal reinforcement learning (CRL) offers promising solutions to challenges by explicitly modeling cause-and-effect relationships.<n>We categorize existing approaches into causal representation learning, counterfactual policy optimization, offline causal RL, causal transfer learning, and causal explainability.<n>We provide future research directions, underscoring the potential of CRL for developing robust, generalizable, and interpretable artificial intelligence systems.
Score: 35.74838344207327
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Integrating causal inference (CI) with reinforcement learning (RL) has emerged as a powerful paradigm to address critical limitations in classical RL, including low explainability, lack of robustness and generalization failures. Traditional RL techniques, which typically rely on correlation-driven decision-making, struggle when faced with distribution shifts, confounding variables, and dynamic environments. Causal reinforcement learning (CRL), leveraging the foundational principles of causal inference, offers promising solutions to these challenges by explicitly modeling cause-and-effect relationships. In this survey, we systematically review recent advancements at the intersection of causal inference and RL. We categorize existing approaches into causal representation learning, counterfactual policy optimization, offline causal RL, causal transfer learning, and causal explainability. Through this structured analysis, we identify prevailing challenges, highlight empirical successes in practical applications, and discuss open problems. Finally, we provide future research directions, underscoring the potential of CRL for developing robust, generalizable, and interpretable artificial intelligence systems.

Related papers

Learning Structured Reasoning via Tractable Trajectory Control [99.75278337895024]
Ctrl-R is a framework for learning structured reasoning via tractable trajectory control.<n>We show that Ctrl-R enables effective exploration and internalization of previously unattainable reasoning patterns.
arXiv Detail & Related papers (2026-03-02T09:18:19Z)
How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns [51.02752099869218]
Large Language Models (LLMs) display strikingly different generalization behaviors.<n>We introduce a novel benchmark that decomposes reasoning into atomic core skills.<n>We show that RL-tuned models maintain more stable behavioral profiles and resist collapse in reasoning skills, whereas SFT models exhibit sharper drift and overfit to surface patterns.
arXiv Detail & Related papers (2025-12-30T08:16:20Z)
RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization [111.1749164063616]
We propose RL-PLUS, a novel hybrid-policy optimization approach for Large Language Models (LLMs)<n> RL-PLUS synergizes internal exploitation with external data to achieve stronger reasoning capabilities and surpass the boundaries of base models.<n>We provide both theoretical analysis and extensive experiments to demonstrate the superiority and generalizability of our approach.
arXiv Detail & Related papers (2025-07-31T23:55:29Z)
Learning Nonlinear Causal Reductions to Explain Reinforcement Learning Policies [50.30741668990102]
We take a causal perspective on explaining the behavior of reinforcement learning policies.<n>We learn a simplified high-level causal model that explains these relationships.<n>We prove that for a class of nonlinear causal models, there exists a unique solution.
arXiv Detail & Related papers (2025-07-20T10:25:24Z)
Statistical and Algorithmic Foundations of Reinforcement Learning [45.707617428078585]
sequential learning (RL) has received a flurry of attention in recent years.<n>We aim to introduce several important developments in RL, highlighting the connections between new ideas classical topics.
arXiv Detail & Related papers (2025-07-19T02:42:41Z)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities [62.05713042908654]
This paper provides a review of advances in Large Language Models (LLMs) alignment through the lens of inverse reinforcement learning (IRL)<n>We highlight the necessity of constructing neural reward models from human data and discuss the formal and practical implications of this paradigm shift.
arXiv Detail & Related papers (2025-07-17T14:22:24Z)
Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning [93.00629872970364]
Reinforcement learning (RL) has become the dominant paradigm for improving the performance of language models on complex reasoning tasks.<n>We introduce SPARKLE, a fine-grained analytic framework to dissect the effects of RL across three key dimensions.<n>We study whether difficult problems -- those yielding no RL signals and mixed-quality reasoning traces -- can still be effectively used for training.
arXiv Detail & Related papers (2025-06-05T07:53:59Z)
Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review [43.27239522837257]
This paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through transfer and inverse reinforcement learning (T-IRL)<n>Our findings denote that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies.<n>Under the IRL structure, training schemes that require a low number of experience transitions and extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
arXiv Detail & Related papers (2024-11-15T15:18:57Z)
Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy [38.86867078596718]
We consider explicitly modeling the generation process of states with the graphical causal model.<n>We formulate the causal structure updating into the RL interaction process with active intervention learning of the environment.
arXiv Detail & Related papers (2024-02-07T14:09:34Z)
Reinforcement Learning with Knowledge Representation and Reasoning: A Brief Survey [10.621186518750976]
Reinforcement Learning (RL) has achieved tremendous development in recent years, but still faces significant obstacles in addressing complex real-life problems.<n>Recently, there has been a rapidly growing interest in the use of Knowledge Representation and Reasoning (KRR) methods to enable more abstract representation and efficient learning in RL.
arXiv Detail & Related papers (2023-04-24T13:35:11Z)
A Survey on Causal Reinforcement Learning [41.645270300009436]
We offer a review of Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate the potential functionality from causality toward RL. In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not. We analyze each category in terms of the formalization of different models, ranging from the Markov Decision Process (MDP), Partially Observed Markov Decision Process (POMDP), Multi-Arm Bandits (MAB), and Dynamic Treatment Regime (DTR)
arXiv Detail & Related papers (2023-02-10T12:25:08Z)
Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning [24.09547181095033]
Causal Graph is a structure built upon the relation between objects and events. We propose a framework with theoretical performance guarantees that alternates between two steps. Our performance improvement is attributed to the virtuous cycle of causal discovery, transition modeling, and policy training.
arXiv Detail & Related papers (2022-07-19T05:31:16Z)
Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner. We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
Causal Inference Q-Network: Toward Resilient Reinforcement Learning [57.96312207429202]
We consider a resilient DRL framework with observational interferences. Under this framework, we propose a causal inference based DRL algorithm called causal inference Q-network (CIQ) Our experimental results show that the proposed CIQ method could achieve higher performance and more resilience against observational interferences.
arXiv Detail & Related papers (2021-02-18T23:50:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.