Integrating Acting, Planning and Learning in Hierarchical Operational
Models
- URL: http://arxiv.org/abs/2003.03932v1
- Date: Mon, 9 Mar 2020 06:05:25 GMT
- Title: Integrating Acting, Planning and Learning in Hierarchical Operational
Models
- Authors: Sunandita Patra, James Mason, Amit Kumar, Malik Ghallab, Paolo
Traverso, Dana Nau
- Abstract summary: We present new planning and learning algorithms for RAE, the Refinement Acting Engine.
Our planning procedure, UPOM, does a UCT-like search in the space of operational models in order to find a near-optimal method to use for the task and context at hand.
Our experimental results show that UPOM and our learning strategies significantly improve RAE's performance in four test domains.
- Score: 7.009282389520865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present new planning and learning algorithms for RAE, the Refinement
Acting Engine. RAE uses hierarchical operational models to perform tasks in
dynamically changing environments. Our planning procedure, UPOM, does a
UCT-like search in the space of operational models in order to find a
near-optimal method to use for the task and context at hand. Our learning
strategies acquire, from online acting experiences and/or simulated planning
results, a mapping from decision contexts to method instances as well as a
heuristic function to guide UPOM. Our experimental results show that UPOM and
our learning strategies significantly improve RAE's performance in four test
domains using two different metrics: efficiency and success ratio.
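The core planning idea in the abstract is a UCT-like Monte Carlo search over the candidate refinement methods applicable to a task in the current context. Below is a minimal, illustrative Python sketch of that kind of UCB1-driven method selection. All names (uct_choose, simulate, the toy methods m1-m3) are hypothetical stand-ins; this is not the authors' UPOM implementation, which additionally descends into the nested operational models and uses a learned heuristic at the rollout depth cutoff.

```python
# Hedged sketch: UCT-like selection among candidate refinement methods.
# Everything here is illustrative; names and the reward model are assumptions.
import math
import random


def uct_choose(candidate_methods, simulate, n_rollouts=200, c=1.4):
    """Pick the method whose simulated outcomes look best for this context.

    candidate_methods: hashable identifiers of the methods applicable here.
    simulate: callable(method) -> reward in [0, 1], e.g. an efficiency or
              success estimate from one Monte Carlo rollout of the model.
    """
    visits = {m: 0 for m in candidate_methods}
    total = {m: 0.0 for m in candidate_methods}

    for t in range(1, n_rollouts + 1):
        # UCB1: try each method once, then trade off average reward
        # against an exploration bonus.
        untried = [m for m in candidate_methods if visits[m] == 0]
        if untried:
            m = random.choice(untried)
        else:
            m = max(candidate_methods,
                    key=lambda m: total[m] / visits[m]
                    + c * math.sqrt(math.log(t) / visits[m]))
        reward = simulate(m)          # one rollout of the chosen method
        visits[m] += 1
        total[m] += reward

    # Return the most-visited (near-optimal) method for this decision context.
    return max(candidate_methods, key=lambda m: visits[m])


if __name__ == "__main__":
    # Toy usage: three hypothetical methods with different success probabilities.
    true_p = {"m1": 0.3, "m2": 0.7, "m3": 0.5}
    best = uct_choose(list(true_p),
                      lambda m: float(random.random() < true_p[m]))
    print("chosen method:", best)     # most likely "m2"
```

The learning strategies described in the abstract would then sit on top of a loop like this: recorded (context, chosen method, outcome) triples can train both a context-to-method mapping and the heuristic used when rollouts are truncated.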
Related papers
- EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z) - Meta-Gradient Search Control: A Method for Improving the Efficiency of Dyna-style Planning [8.552540426753]
This paper introduces an online, meta-gradient algorithm that tunes a probability with which states are queried during Dyna-style planning.
Results indicate that our method improves the efficiency of the planning process.
arXiv Detail & Related papers (2024-06-27T22:24:46Z) - Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL).
Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning.
We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z) - Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism [7.479892725446205]
Multi-task learning (MTL) is a paradigm that simultaneously learns multiple tasks by sharing information at different levels.
We introduce a posteriori information into the model, considering that different tasks may produce correlated outputs with mutual influences.
We achieve this by incorporating a feedback mechanism into MTL models, where the output of one task serves as a hidden feature for another task.
arXiv Detail & Related papers (2024-04-01T03:27:34Z) - Action Pick-up in Dynamic Action Space Reinforcement Learning [6.15205100319133]
We propose an intelligent Action Pick-up (AP) algorithm to autonomously choose valuable actions that are most likely to boost performance from a set of new actions.
In this paper, we first theoretically analyze and find that a prior optimal policy plays an important role in action pick-up by providing useful knowledge and experience.
We then design two different AP methods: frequency-based global method and state clustering-based local method, based on the prior optimal policy.
arXiv Detail & Related papers (2023-04-03T10:55:16Z) - Deliberative Acting, Online Planning and Learning with Hierarchical
Operational Models [5.597986898418404]
In AI research, synthesizing a plan of action has typically used descriptive models of the actions, which abstractly specify what might happen as a result of an action.
Executing the planned actions, however, requires operational models, in which rich computational control structures and closed-loop online decision-making are used.
We implement an integrated acting and planning system in which both planning and acting use the same operational models.
arXiv Detail & Related papers (2020-10-02T14:50:05Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z) - Model-based Adversarial Meta-Reinforcement Learning [38.28304764312512]
We propose Model-based Adversarial Meta-Reinforcement Learning (AdMRL).
AdMRL aims to minimize the worst-case sub-optimality gap across all tasks in a family of tasks.
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks.
arXiv Detail & Related papers (2020-06-16T02:21:49Z) - Meta Reinforcement Learning with Autonomous Inference of Subtask
Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.