Leveraging Approximate Symbolic Models for Reinforcement Learning via
Skill Diversity
- URL: http://arxiv.org/abs/2202.02886v1
- Date: Sun, 6 Feb 2022 23:20:30 GMT
- Title: Leveraging Approximate Symbolic Models for Reinforcement Learning via
Skill Diversity
- Authors: Lin Guan, Sarath Sreedharan, Subbarao Kambhampati
- Abstract summary: We introduce Symbolic-Model Guided Reinforcement Learning, wherein we formalize the relationship between the symbolic model and the underlying MDP.
We use these models to extract high-level landmarks that decompose the task.
At the low level, we learn a set of diverse policies for each possible task sub-goal identified by the landmarks.
- Score: 32.35693772984721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creating reinforcement learning (RL) agents that are capable of accepting and
leveraging task-specific knowledge from humans has long been identified as a
possible strategy for developing scalable approaches for solving long-horizon
problems. While previous works have looked at the possibility of using symbolic
models along with RL approaches, they tend to assume that the high-level action
models are executable at the low level and that the fluents can exclusively
characterize all desirable MDP states. This need not be true, and the assumption
overlooks one of the central technical challenges of incorporating symbolic task
knowledge, namely, that these symbolic models are going to be an incomplete
representation of the underlying task. To this end, we introduce Symbolic-Model
Guided Reinforcement Learning, wherein we formalize the relationship
between the symbolic model and the underlying MDP in a way that allows us to
capture the incompleteness of the symbolic model. We use these models to extract
high-level landmarks that decompose the task, and at the low
level, we learn a set of diverse policies for each possible task sub-goal
identified by the landmarks. We evaluate our system on three
different benchmark domains and show how, even with incomplete symbolic model
information, our approach is able to discover the task structure and
efficiently guide the RL agent towards the goal.
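The abstract's pipeline (extract high-level landmarks from the symbolic model, then decompose the task into sub-goals between consecutive landmarks) can be sketched as a toy Python illustration. This is a minimal sketch under simplifying assumptions: the symbolic model is reduced to an ordered plan of (action, effects) pairs, and `extract_landmarks`, `decompose`, and the example plan are hypothetical names, not the authors' implementation.

```python
def extract_landmarks(symbolic_plan):
    """Collect the ordered set of fluents achieved along a symbolic plan.

    Here a 'landmark' is simply any effect fluent, in the order it first
    appears -- a simplification of proper planning-based landmark extraction.
    """
    landmarks = []
    for _action, effects in symbolic_plan:
        for fluent in effects:
            if fluent not in landmarks:
                landmarks.append(fluent)
    return landmarks


def decompose(landmarks):
    """Split the task into consecutive sub-goals between adjacent landmarks.

    Each (start, target) pair corresponds to one low-level sub-task for
    which a set of diverse policies could then be learned.
    """
    return [(landmarks[i], landmarks[i + 1])
            for i in range(len(landmarks) - 1)]


# Toy symbolic plan: each step is (action name, list of effect fluents).
plan = [("pick-key", ["holding-key"]),
        ("open-door", ["door-open"]),
        ("reach-goal", ["at-goal"])]

lms = extract_landmarks(plan)      # ordered landmark fluents
subgoals = decompose(lms)          # sub-tasks between consecutive landmarks
```

Because the symbolic model is assumed to be incomplete, the paper learns a *set* of diverse low-level policies per sub-goal rather than a single one, so that at least one policy can reach the next landmark even when the symbolic abstraction misdescribes the underlying MDP.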
Related papers
- MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.68829963458408]
We present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models.
The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters.
MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
arXiv Detail & Related papers (2024-04-20T08:34:39Z)
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
- A Bayesian Unification of Self-Supervised Clustering and Energy-Based Models [11.007541337967027]
We perform a Bayesian analysis of state-of-the-art self-supervised learning objectives.
We show that our objective function allows us to outperform existing self-supervised learning strategies.
We also demonstrate that GEDI can be integrated into a neuro-symbolic framework.
arXiv Detail & Related papers (2023-12-30T04:46:16Z)
- A Novel Neural-symbolic System under Statistical Relational Learning [50.747658038910565]
We propose a general bi-level probabilistic graphical reasoning framework called GBPGR.
In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models.
Our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
- Goal Space Abstraction in Hierarchical Reinforcement Learning via Reachability Analysis [0.0]
We propose a developmental mechanism for subgoal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states.
We create a HRL algorithm that gradually learns this representation along with the policies and evaluate it on navigation tasks to show the learned representation is interpretable and results in data efficiency.
arXiv Detail & Related papers (2023-09-12T06:53:11Z)
- Prototype-guided Cross-task Knowledge Distillation for Large-scale Models [103.04711721343278]
Cross-task knowledge distillation helps to train a small student model to obtain a competitive performance.
We propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
arXiv Detail & Related papers (2022-12-26T15:00:42Z)
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z)
- SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning [24.663586662594703]
Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains.
Reinforcement learning approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards.
We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed to accomplish goals that are initially unreachable for the agent.
arXiv Detail & Related papers (2020-12-24T00:31:02Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Transferable Task Execution from Pixels through Deep Planning Domain Learning [46.88867228115775]
We propose Deep Planning Domain Learning (DPDL) to learn a hierarchical model.
DPDL learns a high-level model which predicts values for a set of logical predicates consisting of the current symbolic world state.
This allows us to perform complex, multi-step tasks even when the robot has not been explicitly trained on them.
arXiv Detail & Related papers (2020-03-08T05:51:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.