Generalizing Goal-Conditioned Reinforcement Learning with Variational
Causal Reasoning
- URL: http://arxiv.org/abs/2207.09081v6
- Date: Wed, 17 May 2023 16:29:43 GMT
- Title: Generalizing Goal-Conditioned Reinforcement Learning with Variational
Causal Reasoning
- Authors: Wenhao Ding, Haohong Lin, Bo Li, Ding Zhao
- Abstract summary: Causal Graph is a structure built upon the relation between objects and events.
We propose a framework with theoretical performance guarantees that alternates between two steps.
Our performance improvement is attributed to the virtuous cycle of causal discovery, transition modeling, and policy training.
- Score: 24.09547181095033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As a pivotal component to attaining generalizable solutions in human
intelligence, reasoning provides great potential for reinforcement learning
(RL) agents' generalization towards varied goals by summarizing part-to-whole
arguments and discovering cause-and-effect relations. However, how to discover
and represent causal relations remains an open challenge that hinders the
development of causal RL. In this paper, we augment Goal-Conditioned RL (GCRL)
with a Causal Graph (CG), a structure built upon the relations between objects
and events. We formulate the GCRL problem as variational likelihood
maximization with the CG as a latent variable. To optimize the derived
objective, we propose a
framework with theoretical performance guarantees that alternates between two
steps: using interventional data to estimate the posterior of CG; using CG to
learn generalizable models and interpretable policies. Due to the lack of
public benchmarks that verify generalization capability under reasoning, we
design nine tasks and then empirically show the effectiveness of the proposed
method against five baselines on these tasks. Further theoretical analysis
shows that our performance improvement is attributed to the virtuous cycle of
causal discovery, transition modeling, and policy training, which aligns with
the experimental evidence in extensive ablation studies.
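For intuition, the variational formulation described above can be written generically as an evidence lower bound (ELBO) with the causal graph G as the latent variable. The paper's exact objective may differ in its conditioning and priors, so the following is only an illustrative form for a trajectory tau and goal g:

```latex
\log p_\theta(\tau \mid g)
  \;\ge\;
  \mathbb{E}_{q_\phi(G)}\!\left[\log p_\theta(\tau \mid G, g)\right]
  \;-\;
  \mathrm{KL}\!\left(q_\phi(G)\,\|\,p(G)\right)
```

The alternating two-step framework can then be sketched as below. This is a minimal sketch, not the authors' implementation; all class and function names (e.g. `CausalGraphPosterior`, `env.collect_interventions`) are hypothetical placeholders.

```python
# Minimal sketch (not the authors' code) of the alternating framework:
# (1) estimate a posterior over the causal graph (CG) from interventional
#     data; (2) use sampled graphs to fit a masked transition model and
#     train a goal-conditioned policy. All names are hypothetical.
import numpy as np

class CausalGraphPosterior:
    """Independent Bernoulli beliefs over directed edges i -> j."""
    def __init__(self, num_vars: int):
        self.edge_logits = np.zeros((num_vars, num_vars))

    def sample(self) -> np.ndarray:
        probs = 1.0 / (1.0 + np.exp(-self.edge_logits))
        return (np.random.rand(*probs.shape) < probs).astype(np.float32)

    def update(self, interventional_data) -> None:
        # Step 1: raise the belief in edge i -> j when intervening on
        # variable i measurably shifts variable j (placeholder rule).
        for i, j, effect_size in interventional_data:
            self.edge_logits[i, j] += effect_size

def train(env, policy, transition_model, num_iterations: int = 100):
    posterior = CausalGraphPosterior(env.num_state_vars)
    for _ in range(num_iterations):
        # Step 1: collect interventional data with the current policy and
        # refine the posterior over the causal graph.
        interventional_data = env.collect_interventions(policy)
        posterior.update(interventional_data)

        # Step 2: sample a graph and use it as a mask so each next-state
        # variable is predicted only from its sampled causal parents;
        # then improve the goal-conditioned policy with that model.
        graph = posterior.sample()
        transition_model.fit(env.replay_buffer, parent_mask=graph)
        policy.improve(transition_model, env.goal_space)
    return policy, transition_model, posterior
```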
Related papers
- On the Convergence of (Stochastic) Gradient Descent for Kolmogorov--Arnold Networks [56.78271181959529]
Kolmogorov--Arnold Networks (KANs) have gained significant attention in the deep learning community.
Empirical investigations demonstrate that KANs optimized via stochastic gradient descent (SGD) can achieve near-zero training loss.
arXiv Detail & Related papers (2024-10-10T15:34:10Z)
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Causal dynamics learning is a promising approach to enhancing robustness in reinforcement learning.
We propose a novel model that infers fine-grained causal structures and employs them for prediction.
arXiv Detail & Related papers (2024-06-05T13:13:58Z)
- Benchmarking General-Purpose In-Context Learning [19.40952728849431]
In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly.
In this paper, we study extending ICL to address a broader range of tasks with an extended learning horizon and higher improvement potential.
We introduce two benchmarks specifically crafted to train and evaluate general-purpose in-context learning (GPICL) functionalities.
arXiv Detail & Related papers (2024-05-27T14:50:42Z)
- Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy [40.33036146207819]
We explicitly model the state generation process with a graphical causal model.
We incorporate causal structure updating into the RL interaction process via active intervention learning in the environment.
arXiv Detail & Related papers (2024-02-07T14:09:34Z)
- A Survey on Causal Representation Learning and Future Work for Medical Image Analysis [0.0]
Causal Representation Learning (CRL) has recently emerged as a promising direction for addressing causal relationships in visual understanding.
This survey presents recent advances in CRL in vision.
arXiv Detail & Related papers (2022-10-28T10:15:36Z)
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model changes in the environment, i.e., the context, in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as constrained optimization, called Disentanglement-constrained Domain Generalization (DDG).
Based on this transformation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
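For readers unfamiliar with the primal-dual approach mentioned above, a generic constrained formulation and its Lagrangian relaxation look like the following; this is a textbook-style sketch, not the exact DDG objective:

```latex
\min_{f}\; \mathbb{E}\!\left[\ell(f)\right]
  \quad \text{s.t.}\quad C_{\mathrm{dis}}(f) \le \epsilon
\;\;\Longrightarrow\;\;
\max_{\lambda \ge 0}\;\min_{f}\;
  \mathbb{E}\!\left[\ell(f)\right]
  + \lambda\left(C_{\mathrm{dis}}(f) - \epsilon\right)
```

with alternating gradient descent on the model f and gradient ascent on the multiplier lambda.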
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Confounder Identification-free Causal Visual Feature Learning [84.28462256571822]
We propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
CICF models the interventions among different samples based on the front-door criterion, and then approximates the global-scope intervening effect from the instance-level interventions.
We uncover the relation between CICF and the popular meta-learning strategy MAML, and provide an interpretation of why MAML works from the theoretical perspective.
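As background for the front-door criterion referenced above (a standard identification result from Pearl's causal inference, not a formula specific to CICF): for a mediator M between treatment X and outcome Y with an unobserved X-Y confounder,

```latex
P\!\left(y \mid \mathrm{do}(x)\right)
  = \sum_{m} P(m \mid x) \sum_{x'} P\!\left(y \mid x', m\right) P(x').
```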
arXiv Detail & Related papers (2021-11-26T10:57:47Z)
- Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning [114.07623388322048]
We discuss how standard goal-conditioned RL (GCRL) is encapsulated by the objective of variational empowerment.
Our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
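A standard way to state the connection referenced above (paraphrased, not quoted from the paper) uses the Barber-Agakov variational lower bound on the mutual information between states and goals, where the variational posterior q_phi(g | s) plays the role of the goal-reaching reward in GCRL:

```latex
I(S; G) \;\ge\; \mathcal{H}(G)
  + \mathbb{E}_{p(s, g)}\!\left[\log q_\phi(g \mid s)\right].
```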
arXiv Detail & Related papers (2021-06-02T18:12:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.