Hierarchical Task Network Planning for Facilitating Cooperative
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2306.08359v1
- Date: Wed, 14 Jun 2023 08:51:43 GMT
- Title: Hierarchical Task Network Planning for Facilitating Cooperative
Multi-Agent Reinforcement Learning
- Authors: Xuechen Mu, Hankz Hankui Zhuo, Chen Chen, Kai Zhang, Chao Yu and
Jianye Hao
- Abstract summary: We present SOMARL, a framework that uses prior knowledge to reduce the exploration space and assist learning.
In SOMARL, agents are treated as part of the MARL environment, and symbolic knowledge is embedded using a tree structure to build a knowledge hierarchy.
We evaluate SOMARL on two benchmarks, FindTreasure and MoveBox, and report superior performance over state-of-the-art MARL and subgoal-based baselines.
- Score: 33.70599981505335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploring sparse reward multi-agent reinforcement learning (MARL)
environments with traps in a collaborative manner is a complex task. Agents
typically fail to reach the goal state and fall into traps, which affects the
overall performance of the system. To overcome this issue, we present SOMARL, a
framework that uses prior knowledge to reduce the exploration space and assist
learning. In SOMARL, agents are treated as part of the MARL environment, and
symbolic knowledge is embedded using a tree structure to build a knowledge
hierarchy. The framework has a two-layer hierarchical structure, comprising a
hybrid module with a Hierarchical Task Network (HTN) planning and
meta-controller at the higher level, and a MARL-based interactive module at the
lower level. The HTN module and meta-controller use Hierarchical Domain
Definition Language (HDDL) and the option framework to formalize symbolic
knowledge and obtain domain knowledge and a symbolic option set, respectively.
Moreover, the HTN module leverages domain knowledge to guide low-level agent
exploration by assisting the meta-controller in selecting symbolic options. The
meta-controller further computes intrinsic rewards of symbolic options to limit
exploration behavior and adjust HTN planning solutions as needed. We evaluate
SOMARL on two benchmarks, FindTreasure and MoveBox, and show that it
significantly outperforms state-of-the-art MARL and subgoal-based baselines.
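The two-layer control loop described in the abstract can be sketched in miniature. This is an illustrative reconstruction, not the authors' code: the class and function names, the counter-valued toy state, and the fixed intrinsic-reward value are all assumptions. It shows a meta-controller executing symbolic options from an HTN plan, with low-level steps running until each option's termination condition fires.

```python
# Minimal sketch of a SOMARL-style loop (illustrative names, not the paper's
# code): a meta-controller executes symbolic options in HTN-plan order, and
# low-level agents act until each option terminates, earning an intrinsic
# reward per completed option.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SymbolicOption:
    """An option in the options-framework sense: a named policy with a
    termination condition over (symbolic) environment states."""
    name: str
    is_done: Callable[[dict], bool]   # termination condition
    intrinsic_reward: float = 1.0     # paid when the option completes

def run_plan(plan: List[SymbolicOption], state: dict,
             step: Callable[[dict, str], dict], max_steps: int = 100) -> float:
    """Execute the plan's options in order; grant an intrinsic reward each
    time an option's termination condition is satisfied."""
    total_intrinsic = 0.0
    for option in plan:
        for _ in range(max_steps):
            if option.is_done(state):
                total_intrinsic += option.intrinsic_reward
                break
            state = step(state, option.name)  # low-level policy acts here
    return total_intrinsic

# Toy usage: two options over a counter-valued "position" state.
plan = [
    SymbolicOption("reach_key", lambda s: s["pos"] >= 3),
    SymbolicOption("open_chest", lambda s: s["pos"] >= 5),
]
reward = run_plan(plan, {"pos": 0}, lambda s, _: {"pos": s["pos"] + 1})
print(reward)  # → 2.0
```

In the full framework the `step` callable would be a learned MARL policy and the termination conditions would come from HDDL task definitions; here both are stubbed to keep the control flow visible.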
Related papers
- TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning [4.591755344464076]
We introduce TAME Agent Framework (TAG), a framework for constructing fully decentralized hierarchical multi-agent systems.
TAG standardizes information flow between levels while preserving loose coupling, allowing for seamless integration of diverse agent types.
Our results show that decentralized hierarchical organization enhances both learning speed and final performance, positioning TAG as a promising direction for scalable multi-agent systems.
arXiv Detail & Related papers (2025-02-21T12:52:16Z)
- Hierarchical Repository-Level Code Summarization for Business Applications Using Local LLMs [1.4932549821542682]
Existing methods primarily focus on smaller code units, such as functions, and struggle with larger code artifacts like files and packages.
This paper proposes a two-step hierarchical approach for repository-level code summarization, tailored to business applications.
arXiv Detail & Related papers (2025-01-14T05:48:27Z)
- Reinforcement Learning with Options and State Representation [105.82346211739433]
This thesis aims to explore the reinforcement learning field and build on existing methods to produce improved ones.
It addresses such goals by decomposing learning tasks in a hierarchical fashion known as Hierarchical Reinforcement Learning.
arXiv Detail & Related papers (2024-03-16T08:30:55Z)
- Hierarchical Spatio-Temporal Representation Learning for Gait Recognition [6.877671230651998]
Gait recognition is a biometric technique that identifies individuals by their unique walking styles.
We propose a hierarchical-temporal representation learning framework for extracting gait features from coarse to fine.
Our method outperforms the state-of-the-art while maintaining a reasonable balance between model accuracy and complexity.
arXiv Detail & Related papers (2023-07-19T09:30:00Z)
- Feudal Graph Reinforcement Learning [18.069747511100132]
Graph-based representations and message-passing modular policies constitute prominent approaches to tackling composable control problems in reinforcement learning (RL).
We propose a novel methodology, named Feudal Graph Reinforcement Learning (FGRL), that addresses such challenges by relying on hierarchical RL and a pyramidal message-passing architecture.
In particular, FGRL defines a hierarchy of policies where high-level commands are propagated from the top of the hierarchy down through a layered graph structure.
arXiv Detail & Related papers (2023-04-11T09:51:13Z)
- Learning Rational Subgoals from Demonstrations and Instructions [71.86713748450363]
We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals.
At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over the environmental states.
Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT.
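The idea of subgoals as binary classifiers feeding an off-the-shelf planner can be illustrated concretely. This is a hedged sketch, not the paper's implementation: the grid environment, the hand-written classifiers, and the zero heuristic (which reduces A* to Dijkstra) are all assumptions made for brevity.

```python
# Illustrative sketch: rational subgoals as binary classifiers over states,
# solved one at a time by a generic A* search (zero heuristic for simplicity).
from heapq import heappush, heappop

def astar(start, goal_test, neighbors, h):
    """Plain A* to the first state satisfying the goal_test classifier."""
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, state, path = heappop(frontier)
        if goal_test(state):
            return path
        if state in seen:
            continue
        seen.add(state)
        for nxt in neighbors(state):
            heappush(frontier, (cost + 1 + h(nxt), cost + 1, nxt, path + [nxt]))
    return None

# Subgoals as binary classifiers over 3x3 grid states, in dependency order.
subgoals = [lambda s: s == (2, 0), lambda s: s == (2, 2)]
neighbors = lambda s: [(s[0] + dx, s[1] + dy)
                       for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                       if 0 <= s[0] + dx <= 2 and 0 <= s[1] + dy <= 2]

state, full_path = (0, 0), [(0, 0)]
for goal in subgoals:  # chain the subgoals: each segment starts where the last ended
    segment = astar(state, goal, neighbors, lambda s: 0)
    full_path += segment[1:]
    state = segment[-1]
print(state)  # → (2, 2)
```

Because each subgoal is just a predicate over states, swapping A* for RRT or any other off-the-shelf planner only requires reusing the same `goal_test` interface, which is the portability the summary points to.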
arXiv Detail & Related papers (2023-03-09T18:39:22Z)
- Weakly-supervised Action Localization via Hierarchical Mining [76.00021423700497]
Weakly-supervised action localization aims to localize and classify action instances in the given videos temporally with only video-level categorical labels.
We propose a hierarchical mining strategy under video-level and snippet-level manners, i.e., hierarchical supervision and hierarchical consistency mining.
We show that HiM-Net outperforms existing methods on THUMOS14 and ActivityNet1.3 datasets with large margins by hierarchically mining the supervision and consistency.
arXiv Detail & Related papers (2022-06-22T12:19:09Z)
- Transferring Hierarchical Structure with Dual Meta Imitation Learning [4.868214177205893]
We propose a hierarchical meta imitation learning method where the high-level network and sub-skills are iteratively meta-learned with model-agnostic meta-learning.
We achieve state-of-the-art few-shot imitation learning performance on the Meta-world benchmark and competitive results on long-horizon tasks in Kitchen environments.
arXiv Detail & Related papers (2022-01-28T08:22:38Z)
- Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where the learner learns latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z)
- From proprioception to long-horizon planning in novel environments: A hierarchical RL model [4.44317046648898]
In this work, we introduce a simple, three-level hierarchical architecture that reflects different types of reasoning.
We apply our method to a series of navigation tasks in the Mujoco Ant environment.
arXiv Detail & Related papers (2020-06-11T17:19:12Z)
- Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning [36.050432925402845]
We present HiDe, a novel hierarchical reinforcement learning architecture that successfully solves long horizon control tasks.
We experimentally show that our method generalizes across unseen test environments and can scale to 3x horizon length compared to both learning and non-learning based methods.
arXiv Detail & Related papers (2020-02-14T10:19:52Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.