S-REINFORCE: A Neuro-Symbolic Policy Gradient Approach for Interpretable
Reinforcement Learning
- URL: http://arxiv.org/abs/2305.07367v1
- Date: Fri, 12 May 2023 10:32:16 GMT
- Title: S-REINFORCE: A Neuro-Symbolic Policy Gradient Approach for Interpretable
Reinforcement Learning
- Authors: Rajdeep Dutta, Qincheng Wang, Ankur Singh, Dhruv Kumarjiguda, Li
Xiaoli, Senthilnath Jayavelu
- Abstract summary: A novel RL algorithm, S-REINFORCE, is designed to generate interpretable policies for dynamic decision-making tasks.
By leveraging the strengths of both NN and SR, S-REINFORCE produces policies that are not only well-performing but also easy to interpret.
- Score: 0.660601600774899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a novel RL algorithm, S-REINFORCE, which is designed to
generate interpretable policies for dynamic decision-making tasks. The proposed
algorithm leverages two types of function approximators, namely Neural Network
(NN) and Symbolic Regressor (SR), to produce numerical and symbolic policies,
respectively. The NN component learns to generate a numerical probability
distribution over the possible actions using a policy gradient, while the SR
component captures the functional form that relates the associated states with
the action probabilities. The SR-generated policy expressions are then utilized
through importance sampling to improve the rewards received during the learning
process. We have tested the proposed S-REINFORCE algorithm on various dynamic
decision-making problems with low and high dimensional action spaces, and the
results demonstrate its effectiveness and impact in achieving interpretable
solutions. By leveraging the strengths of both NN and SR, S-REINFORCE produces
policies that are not only well-performing but also easy to interpret, making
it an ideal choice for real-world applications where transparency and causality
are crucial.
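The abstract outlines a two-branch scheme: a numerical (NN) policy trained by policy gradient, a symbolic regressor (SR) that distils that policy into closed-form expressions, and importance sampling that feeds the symbolic policy back into learning. The sketch below shows one way such a loop can fit together. It is an illustrative reconstruction, not the authors' implementation: the toy 1-D two-action task, the linear-softmax stand-in for the NN, the polynomial fit standing in for a genuine symbolic regressor, and the specific reweighting scheme (act with the symbolic policy, correct the REINFORCE gradient with ratios pi_NN/pi_behaviour) are all assumptions made for brevity.
```python
# Minimal sketch of the S-REINFORCE-style loop described in the abstract.
# NOT the authors' code: task, policy class, and SR stand-in are assumptions.
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class NumericalPolicy:
    """NN-policy stand-in: linear features followed by a softmax over actions."""

    def __init__(self, state_dim, n_actions, lr=0.05):
        self.W = np.zeros((state_dim, n_actions))
        self.lr = lr

    def probs(self, s):
        return softmax(s @ self.W)

    def reinforce_update(self, steps, returns, is_weights):
        # Importance-weighted REINFORCE: sum_t w_t * G_t * grad log pi(a_t|s_t)
        for (s, a), G, w in zip(steps, returns, is_weights):
            p = self.probs(s)
            grad_log = -np.outer(s, p)
            grad_log[:, a] += s
            self.W += self.lr * w * G * grad_log


def fit_symbolic_policy(xs, ys, degree=3):
    """SR stand-in: fit a low-degree polynomial mapping a 1-D state to the
    probability of action 0; a real symbolic regressor searches expression space."""
    coeffs = np.polyfit(xs, ys, degree)
    return lambda x: float(np.clip(np.polyval(coeffs, x), 1e-3, 1.0 - 1e-3))


def rollout(behaviour_probs, T=20):
    """One episode of a toy task: reward 1 when the action matches sign(state)."""
    steps, rewards, behav_p = [], [], []
    for _ in range(T):
        s = rng.uniform(-1.0, 1.0, size=1)
        p = behaviour_probs(s)
        a = int(rng.choice(2, p=p))
        r = 1.0 if (a == 0) == (s[0] < 0.0) else 0.0
        steps.append((s, a))
        rewards.append(r)
        behav_p.append(p[a])
    returns = np.cumsum(rewards[::-1])[::-1]  # undiscounted returns-to-go
    return steps, returns, behav_p


nn_policy = NumericalPolicy(state_dim=1, n_actions=2)
sr_policy = None  # symbolic (interpretable) policy, distilled periodically

for episode in range(200):
    if sr_policy is None:
        behaviour = nn_policy.probs  # act with the numerical policy
    else:
        # Act with the symbolic policy once it is available
        behaviour = lambda s: np.array([sr_policy(s[0]), 1.0 - sr_policy(s[0])])

    steps, returns, behav_p = rollout(behaviour)

    # Importance-sampling ratios pi_NN(a|s) / pi_behaviour(a|s)
    is_weights = [nn_policy.probs(s)[a] / bp for (s, a), bp in zip(steps, behav_p)]
    nn_policy.reinforce_update(steps, returns, is_weights)

    # Periodically distil the numerical policy into a symbolic expression
    if (episode + 1) % 50 == 0:
        grid = np.linspace(-1.0, 1.0, 100)
        p0 = np.array([nn_policy.probs(np.array([x]))[0] for x in grid])
        sr_policy = fit_symbolic_policy(grid, p0)
```
Under these assumptions, the agent can act with the human-readable expression while the importance weights keep the gradient estimate for the numerical policy consistent with the data actually collected; the paper's exact coupling of the SR policy and importance sampling may differ.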
Related papers
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- Intelligent Hybrid Resource Allocation in MEC-assisted RAN Slicing Network [72.2456220035229]
We aim to maximize the SSR for heterogeneous service demands in the cooperative MEC-assisted RAN slicing system.
We propose a recurrent graph reinforcement learning (RGRL) algorithm to intelligently learn the optimal hybrid RA policy.
arXiv Detail & Related papers (2024-05-02T01:36:13Z)
- A Neuro-Symbolic Approach to Multi-Agent RL for Interpretability and Probabilistic Decision Making [42.503612515214044]
Multi-agent reinforcement learning (MARL) is well-suited for runtime decision-making in systems where multiple agents coexist and compete for shared resources.
Applying common deep learning-based MARL solutions to real-world problems suffers from issues of interpretability, sample efficiency, partial observability, etc.
We present an event-driven formulation, where decision-making is handled by distributed co-operative MARL agents using neuro-symbolic methods.
arXiv Detail & Related papers (2024-02-21T00:16:08Z)
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z)
- Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning [0.0]
We develop an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks.
We also develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions.
arXiv Detail & Related papers (2022-06-29T14:11:15Z)
- Context Meta-Reinforcement Learning via Neuromodulation [6.142272540492935]
Meta-reinforcement learning (meta-RL) algorithms enable agents to adapt quickly to tasks from few samples in dynamic environments.
This paper introduces neuromodulation as a modular component to augment a standard policy network that regulates neuronal activities.
arXiv Detail & Related papers (2021-10-30T01:05:40Z)
- Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
- Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization [100.72335252255989]
We study the problem of learning exploration-exploitation strategies that effectively adapt to dynamic environments.
We propose a novel algorithm that regularizes the training of an RNN-based policy using informed policies trained to maximize the reward in each task.
arXiv Detail & Related papers (2020-05-06T16:14:48Z)
- Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints [31.829932777445894]
A major drawback in the application of RNN-based policies is the difficulty in providing formal guarantees on the satisfaction of behavioral specifications.
By integrating techniques from formal methods and machine learning, we propose an approach to automatically extract a finite-state controller from an RNN.
arXiv Detail & Related papers (2020-02-13T16:38:38Z)