Relational-Grid-World: A Novel Relational Reasoning Environment and An
Agent Model for Relational Information Extraction
- URL: http://arxiv.org/abs/2007.05961v1
- Date: Sun, 12 Jul 2020 11:30:48 GMT
- Title: Relational-Grid-World: A Novel Relational Reasoning Environment and An
Agent Model for Relational Information Extraction
- Authors: Faruk Kucuksubasi and Elif Surer
- Abstract summary: Reinforcement learning (RL) agents are often designed for a particular problem, and their working processes are generally uninterpretable.
Statistical methods-based RL algorithms can be improved in terms of generalizability and interpretability using symbolic Artificial Intelligence (AI) tools such as logic programming.
We present a model-free RL architecture that is supported with explicit relational representations of the environmental objects.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) agents are often designed for a
particular problem, and their working processes are generally
uninterpretable. Statistical methods-based agent algorithms can be improved
in terms of generalizability and interpretability using symbolic Artificial
Intelligence (AI) tools such as logic programming. In this study, we present
a model-free RL architecture that is supported with explicit relational
representations of the environmental objects. For the first time, we use the
PrediNet network architecture in a dynamic decision-making problem rather
than in image-based tasks, with the Multi-Head Dot-Product Attention network
(MHDPA) as a baseline for performance comparisons. We tested the two
networks in two environments: the baseline Box-World environment and our
novel environment, Relational-Grid-World (RGW). The procedurally generated
RGW environment, which is complex in terms of visual perception and
combinatorial selection, makes it easy to measure the relational
representation performance of RL agents. The experiments were carried out
using different configurations of the environment so that the presented
module and environment could be compared with the baselines. We reached
similar policy optimization performance with the PrediNet architecture and
MHDPA; additionally, we were able to extract propositional representations
explicitly, which makes the agent's statistical policy logic more
interpretable and tractable. This flexibility in the agent's policy makes it
convenient to design non-task-specific agent architectures. The main
contributions of this study are two-fold: an RL agent that can explicitly
perform relational reasoning, and a new environment that measures the
relational reasoning capabilities of RL agents.
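The core architectural idea, selecting pairs of entities and reporting their
relations explicitly rather than mixing all entities into opaque attention
summaries, can be made concrete with a short sketch. The PyTorch module below
is a minimal, illustrative PrediNet-style relational head: the class name,
the dimensions, and the mean-pooled query summary are simplifying assumptions
for compactness, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrediNetStyleHead(nn.Module):
    """Illustrative PrediNet-style relational module (a sketch, not the
    paper's exact implementation). Each head soft-selects a pair of
    entities with dot-product attention and outputs the difference of
    their shared projections, giving an explicit, inspectable relation
    vector per head instead of a blended attention summary."""

    def __init__(self, in_dim: int, key_dim: int = 16,
                 n_heads: int = 8, n_relations: int = 4):
        super().__init__()
        self.n_heads, self.key_dim = n_heads, key_dim
        self.keys = nn.Linear(in_dim, key_dim, bias=False)
        # Two query generators per head; queries are derived here from a
        # mean-pooled scene summary (an assumed simplification).
        self.q1 = nn.Linear(in_dim, n_heads * key_dim, bias=False)
        self.q2 = nn.Linear(in_dim, n_heads * key_dim, bias=False)
        # Shared projection whose differences act as learned "predicates".
        self.embed = nn.Linear(in_dim, n_relations, bias=False)

    def forward(self, entities: torch.Tensor) -> torch.Tensor:
        # entities: (batch, n_entities, in_dim), e.g. grid cells with
        # their (x, y) coordinates appended as features.
        b = entities.size(0)
        keys = self.keys(entities)                        # (b, n, key_dim)
        summary = entities.mean(dim=1)                    # (b, in_dim)
        q1 = self.q1(summary).view(b, self.n_heads, self.key_dim)
        q2 = self.q2(summary).view(b, self.n_heads, self.key_dim)
        # Each head attends twice, softly picking entity 1 and entity 2.
        a1 = F.softmax(torch.einsum('bhk,bnk->bhn', q1, keys), dim=-1)
        a2 = F.softmax(torch.einsum('bhk,bnk->bhn', q2, keys), dim=-1)
        e1 = torch.einsum('bhn,bnd->bhd', a1, entities)   # (b, heads, in_dim)
        e2 = torch.einsum('bhn,bnd->bhd', a2, entities)
        # Relation vector = difference of projections of the chosen pair.
        rel = self.embed(e1) - self.embed(e2)             # (b, heads, n_rel)
        return rel.flatten(start_dim=1)                   # (b, heads * n_rel)
```

For the MHDPA baseline, a rough stand-in is
torch.nn.MultiheadAttention(embed_dim=in_dim, num_heads=4, batch_first=True)
applied to the same entity tensor; its heads return weighted blends of many
entities, which is why the paper treats PrediNet's explicit pairwise
differences as the more interpretable, propositional representation.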
Related papers
- AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction [10.65417796726349]
Relation extraction (RE) in complex scenarios faces challenges such as diverse relation types and ambiguous relations between entities within a single sentence.
We propose an agent-based RE framework, namely AgentRE, which fully leverages the potential of large language models to achieve RE in complex scenarios.
arXiv Detail & Related papers (2024-09-03T12:53:05Z)
- Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions [68.92637077909693]
This paper investigates the faithfulness of multimodal large language model (MLLM) agents in the graphical user interface (GUI) environment.
A general setting is proposed where both the user and the agent are benign, and the environment, while not malicious, contains unrelated content.
Experimental results reveal that even the most powerful models, whether generalist agents or specialist GUI agents, are susceptible to distractions.
arXiv Detail & Related papers (2024-08-05T15:16:22Z)
- Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
- Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors.
We propose a novel approach called Logical Specifications-guided Dynamic Task Sampling (LSTS).
LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
arXiv Detail & Related papers (2024-02-06T04:00:21Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time, with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Reinforcement Learning with Temporal-Logic-Based Causal Diagrams [25.538860320318943]
We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals.
While the finite-state machines commonly used to encode such goals model the reward function, they often overlook the causal knowledge about the environment.
We propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment.
arXiv Detail & Related papers (2023-06-23T18:42:27Z)
- Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration [71.95914457415624]
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high performance and energy efficiency.
We propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem.
Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines.
arXiv Detail & Related papers (2022-11-29T17:10:24Z)
- A Framework for Understanding and Visualizing Strategies of RL Agents [0.0]
We present a framework for learning comprehensible models of sequential decision tasks in which agent strategies are characterized using temporal logic formulas.
We evaluate our framework on combat scenarios in StarCraft II (SC2) using traces from a handcrafted expert policy and a trained reinforcement learning agent.
arXiv Detail & Related papers (2022-08-17T21:58:19Z)
- Soft Hierarchical Graph Recurrent Networks for Many-Agent Partially Observable Environments [9.067091068256747]
We propose a novel network structure called the hierarchical graph recurrent network (HGRN) for multi-agent cooperation under partial observability.
Based on these techniques, we propose a value-based MADRL algorithm called Soft-HGRN and its actor-critic variant, SAC-HGRN.
arXiv Detail & Related papers (2021-09-05T09:51:25Z)
- Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning [83.66080019570461]
We propose two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty.
We show that these metrics have higher correlations with normalized task solvability scores than a variety of alternatives.
These metrics can also be used for fast and compute-efficient optimizations of key design parameters.
arXiv Detail & Related papers (2021-03-23T17:49:50Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)