Efficient Symbolic Policy Learning with Differentiable Symbolic
Expression
- URL: http://arxiv.org/abs/2311.02104v1
- Date: Thu, 2 Nov 2023 03:27:51 GMT
- Title: Efficient Symbolic Policy Learning with Differentiable Symbolic
Expression
- Authors: Jiaming Guo, Rui Zhang, Shaohui Peng, Qi Yi, Xing Hu, Ruizhi Chen,
Zidong Du, Xishan Zhang, Ling Li, Qi Guo, Yunji Chen
- Abstract summary: We propose an efficient gradient-based learning method that learns the symbolic policy from scratch in an end-to-end way.
In addition, in contrast with previous symbolic policies, which only work in single-task RL because of their complexity, we extend ESPL to meta-RL to generate symbolic policies for unseen tasks.
- Score: 30.855457609733637
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep reinforcement learning (DRL) has led to a wide range of advances in
sequential decision-making tasks. However, the complexity of neural network
policies makes them difficult to understand and to deploy under limited
computational resources. Currently, employing compact symbolic expressions as
symbolic policies is a promising strategy for obtaining simple and interpretable
policies. Previous symbolic policy methods usually involve complex training
processes and pre-trained neural network policies, which are inefficient and
limit the application of symbolic policies. In this paper, we propose an
efficient gradient-based learning method named Efficient Symbolic Policy
Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end
way. We introduce a symbolic network as the search space and employ a path
selector to find the compact symbolic policy. In this way, we represent the
policy with a differentiable symbolic expression and train it in an off-policy
manner, which further improves efficiency. In addition, in contrast with
previous symbolic policies, which only work in single-task RL because of their
complexity, we extend ESPL to meta-RL to generate symbolic policies for unseen
tasks. Experimentally, we show that our approach generates symbolic policies
with higher performance and greatly improves data efficiency in single-task RL.
In meta-RL, we demonstrate that, compared with neural network policies, the
proposed symbolic policy achieves higher performance and efficiency and shows
the potential to be interpretable.
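To make the abstract's components more concrete — a symbolic network as the search space, a path selector that prunes it to a compact expression, and end-to-end differentiability that allows off-policy training — here is a minimal illustrative sketch. It is written under assumed design choices: the primitive set, the sigmoid-relaxed gates, and the sparsity penalty are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn


class SymbolicLayer(nn.Module):
    """One layer of primitive operators applied to learned linear mixes of the input."""

    def __init__(self, in_dim: int, n_units: int = 4):
        super().__init__()
        # Each unary primitive gets its own linear projection of the inputs.
        self.unary = [torch.sin, torch.cos, torch.tanh, lambda z: z]
        self.lin_unary = nn.Linear(in_dim, n_units * len(self.unary))
        # One multiplicative (product) unit serves as a simple binary primitive.
        self.lin_prod = nn.Linear(in_dim, 2 * n_units)
        self.out_dim = n_units * (len(self.unary) + 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = self.lin_unary(x).chunk(len(self.unary), dim=-1)
        unary_out = torch.cat([f(c) for f, c in zip(self.unary, chunks)], dim=-1)
        a, b = self.lin_prod(x).chunk(2, dim=-1)
        return torch.cat([unary_out, a * b], dim=-1)


class SymbolicPolicy(nn.Module):
    """A symbolic layer plus per-weight gates acting as a relaxed path selector."""

    def __init__(self, obs_dim: int, act_dim: int, n_units: int = 4):
        super().__init__()
        self.layer = SymbolicLayer(obs_dim, n_units)
        self.head = nn.Linear(self.layer.out_dim, act_dim)
        # Gate logits: sigmoid(logit) ~ probability that a head weight is kept.
        self.gate_logits = nn.Parameter(torch.zeros(act_dim, self.layer.out_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        feats = self.layer(obs)
        gates = torch.sigmoid(self.gate_logits)   # soft path selection
        return feats @ (self.head.weight * gates).t() + self.head.bias

    def sparsity_penalty(self) -> torch.Tensor:
        # Pushes most gates toward zero so only a compact expression survives.
        return torch.sigmoid(self.gate_logits).sum()


if __name__ == "__main__":
    policy = SymbolicPolicy(obs_dim=3, act_dim=1)
    obs = torch.randn(8, 3)
    action = policy(obs)
    # The whole pipeline is differentiable, so it can be plugged into an
    # off-policy actor-critic objective together with the sparsity penalty.
    loss = action.pow(2).mean() + 1e-3 * policy.sparsity_penalty()
    loss.backward()
    print(action.shape, float(loss))
```

Once training drives most gates toward zero, the surviving operators and weights can in principle be read off as a closed-form expression over the observation variables; the paper's actual selector and primitive library may differ.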
Related papers
- SYMPOL: Symbolic Tree-Based On-Policy Reinforcement Learning [9.035959289139102]
We introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL.
SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions.
We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches.
arXiv Detail & Related papers (2024-08-16T14:04:40Z)
- End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations [15.530907808235945]
We present a neuro-symbolic framework for jointly learning structured states and symbolic policies.
We design a pipeline to prompt GPT-4 to generate textual explanations for the learned policies and decisions.
We verify the efficacy of our approach on nine Atari tasks and present GPT-generated explanations for policies and decisions.
arXiv Detail & Related papers (2024-03-19T05:21:20Z)
- Compositional Policy Learning in Stochastic Control Systems with Formal Guarantees [0.0]
Reinforcement learning has shown promising results in learning neural network policies for complicated control tasks.
We propose a novel method for learning a composition of neural network policies in stochastic environments.
A formal certificate guarantees that a specification over the policy's behavior is satisfied with the desired probability.
arXiv Detail & Related papers (2023-12-03T17:04:18Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- Symbolic Distillation for Learned TCP Congestion Control [70.27367981153299]
TCP congestion control has achieved tremendous success with deep reinforcement learning (RL) approaches.
Black-box policies lack interpretability and reliability, and they often need to operate outside the traditional TCP datapath.
This paper proposes a novel two-stage solution to achieve the best of both worlds: first train a deep RL agent, then distill its NN policy into white-box, light-weight rules (a generic sketch of this distillation step appears after this list).
arXiv Detail & Related papers (2022-10-24T00:58:16Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Model-Based Offline Meta-Reinforcement Learning with Regularization [63.35040401948943]
Offline meta-RL is emerging as a promising approach to address these challenges.
MerPO learns a meta-model for efficient task structure inference and an informative meta-policy.
We show that MerPO offers guaranteed improvement over both the behavior policy and the meta-policy.
arXiv Detail & Related papers (2022-02-07T04:15:20Z)
- Neuro-Symbolic Reinforcement Learning with First-Order Logic [63.003353499732434]
We propose a novel RL method for text-based games with a recent neuro-symbolic framework called Logical Neural Network.
Our experimental results show RL training with the proposed method converges significantly faster than other state-of-the-art neuro-symbolic methods in a TextWorld benchmark.
arXiv Detail & Related papers (2021-10-21T08:21:49Z)
- Neurosymbolic Reinforcement Learning with Formally Verified Exploration [21.23874800091344]
We present Revel, a framework for provably safe exploration in continuous state and action spaces.
A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible.
We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification.
arXiv Detail & Related papers (2020-09-26T14:51:04Z)
- Efficient Deep Reinforcement Learning via Adaptive Policy Transfer [50.51637231309424]
A Policy Transfer Framework (PTF) is proposed to accelerate Reinforcement Learning (RL).
Our framework learns when and which source policy is best to reuse for the target policy, and when to terminate it.
Experimental results show it significantly accelerates the learning process and surpasses state-of-the-art policy transfer methods.
arXiv Detail & Related papers (2020-02-19T07:30:57Z)
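As referenced in the Symbolic Distillation entry above, the following is a hedged sketch of the generic "train a deep RL teacher, then distill its policy into white-box rules" recipe. The teacher function, the feature names, and the decision-tree student are illustrative placeholders, not the paper's actual congestion-control setup.

```python
# Stage 1: collect (state, action) pairs from a trained black-box policy.
# Stage 2: fit a small white-box model (here, a shallow decision tree) on them.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text


def teacher_policy(state: np.ndarray) -> np.ndarray:
    """Stand-in for a trained deep RL policy (hypothetical, for illustration)."""
    return np.tanh(state @ np.array([0.5, -1.0, 0.25]))


# Stage 1: in a real setting, states would come from on-policy rollouts.
states = np.random.randn(5000, 3)
actions = teacher_policy(states)

# Stage 2: distill into a shallow tree, i.e. a set of interpretable if-then rules.
student = DecisionTreeRegressor(max_depth=4)
student.fit(states, actions)

print("distillation MSE:", np.mean((student.predict(states) - actions) ** 2))
print(export_text(student, feature_names=["rtt", "loss_rate", "throughput"]))
```

The shallow tree is only one possible choice of light-weight rule set; the cited paper's actual distillation target and procedure should be taken from the original text.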