Hierarchical Decision Making Based on Structural Information Principles
- URL: http://arxiv.org/abs/2404.09760v2
- Date: Sun, 22 Jun 2025 14:07:17 GMT
- Title: Hierarchical Decision Making Based on Structural Information Principles
- Authors: Xianghua Zeng, Hao Peng, Dingli Su, Angsheng Li
- Abstract summary: We propose a novel Structural Information principles-based framework, namely SIDM, for hierarchical Decision Making. We present an abstraction mechanism that processes historical state-action trajectories to construct abstract representations of states and actions. We develop a skill-based learning method for single-agent scenarios and a role-based collaboration method for multi-agent scenarios, both of which can flexibly integrate various underlying algorithms for enhanced performance.
- Score: 19.82391136775341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical Reinforcement Learning (HRL) is a promising approach for managing task complexity across multiple levels of abstraction and accelerating long-horizon agent exploration. However, the effectiveness of hierarchical policies heavily depends on prior knowledge and manual assumptions about skill definitions and task decomposition. In this paper, we propose a novel Structural Information principles-based framework, namely SIDM, for hierarchical Decision Making in both single-agent and multi-agent scenarios. Central to our work is the utilization of structural information embedded in the decision-making process to adaptively and dynamically discover and learn hierarchical policies through environmental abstractions. Specifically, we present an abstraction mechanism that processes historical state-action trajectories to construct abstract representations of states and actions. We define and optimize directed structural entropy, a metric quantifying the uncertainty in transition dynamics between abstract states, to discover skills that capture key transition patterns in RL environments. Building on these findings, we develop a skill-based learning method for single-agent scenarios and a role-based collaboration method for multi-agent scenarios, both of which can flexibly integrate various underlying algorithms for enhanced performance. Extensive evaluations on challenging benchmarks demonstrate that our framework significantly and consistently outperforms state-of-the-art baselines, improving the effectiveness, efficiency, and stability of policy learning by up to 32.70%, 64.86%, and 88.26%, respectively, as measured by average rewards, convergence timesteps, and standard deviations.
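To make the structural-entropy machinery concrete, below is a minimal sketch of classical two-dimensional structural entropy for an undirected weighted graph under a vertex partition; the directed structural entropy that SIDM defines over transitions between abstract states generalizes this quantity. The function name and toy graph are illustrative, not from the paper, and isolated vertices are assumed away.

```python
# Minimal sketch: two-dimensional structural entropy (Li & Pan's structural
# information theory) of an undirected weighted graph under a partition.
# SIDM's directed structural entropy generalizes this to transition graphs.
import numpy as np

def structural_entropy_2d(adj: np.ndarray, partition: list[set[int]]) -> float:
    """Two-dimensional structural entropy of `adj` under `partition`."""
    deg = adj.sum(axis=1)              # weighted degrees d_v
    vol = deg.sum()                    # volume of G (= 2m for undirected graphs)
    h = 0.0
    for module in partition:
        idx = sorted(module)
        v_j = deg[idx].sum()           # volume of the module
        # weight of edges with exactly one endpoint inside the module
        g_j = adj[idx].sum() - adj[np.ix_(idx, idx)].sum()
        # intra-module term: uncertainty of locating a vertex within its module
        h -= np.sum(deg[idx] / vol * np.log2(deg[idx] / v_j))
        # inter-module term: uncertainty of locating the module itself
        h -= g_j / vol * np.log2(v_j / vol)
    return h

# Toy usage: two 2-cliques joined by one weak edge.
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
print(structural_entropy_2d(A, [{0, 1}, {2, 3}]))
```

Minimizing this entropy over partitions recovers modules with low cross-boundary uncertainty, which is the intuition behind using it to discover skills that capture key transition patterns.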
Related papers
- Feature-Based vs. GAN-Based Learning from Demonstrations: When and Why [50.191655141020505]
This survey provides a comparative analysis of feature-based and GAN-based approaches to learning from demonstrations. We argue that the dichotomy between feature-based and GAN-based methods is increasingly nuanced.
arXiv Detail & Related papers (2025-07-08T11:45:51Z) - Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems [29.924868489451327]
This study systematically investigates four dimensions of collaboration strategies. We quantify the impact of these strategies on both task accuracy and computational efficiency. This work establishes a foundation for designing adaptive, scalable multi-agent systems.
arXiv Detail & Related papers (2025-05-18T15:46:14Z) - A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making. With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems. We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z) - FORCE: Feature-Oriented Representation with Clustering and Explanation [0.0]
We propose a SHAP-based supervised deep learning framework, FORCE.
It relies on a two-stage usage of SHAP values in the neural network architecture.
We show that FORCE yields dramatic improvements in overall performance compared to networks that do not incorporate the latent feature and attention framework.
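A hedged sketch of the two-stage pattern described above, not the authors' exact FORCE architecture: stage one trains a base model and extracts per-feature attributions (permutation importance stands in for SHAP values to keep the sketch dependency-light), and stage two reweights a second model's inputs with attention-like weights derived from those attributions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

# Stage 1: base model + feature attributions (permutation importance as a
# crude stand-in for SHAP values in this sketch).
base = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
base_acc = base.score(X, y)
imp = np.empty(X.shape[1])
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    imp[j] = base_acc - base.score(Xp, y)   # accuracy drop when feature j is scrambled

# Stage 2: attention-like reweighting of the inputs by normalized importance.
attn = np.clip(imp, 0, None)
attn = attn / (attn.sum() + 1e-9)
clf = LogisticRegression().fit(X * attn, y)
print("reweighted accuracy:", clf.score(X * attn, y))
```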
arXiv Detail & Related papers (2025-04-07T22:05:50Z) - Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.
We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.
Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise rewards to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity [51.40558987254471]
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations.
This paper addresses the question of reinforcement learning under general latent dynamics from a statistical and algorithmic perspective.
arXiv Detail & Related papers (2024-10-23T14:22:49Z) - Effective Exploration Based on the Structural Information Principles [21.656199029188056]
We propose a novel Structural Information principles-based Effective Exploration framework, namely SI2E.
We show that SI2E significantly outperforms state-of-the-art exploration baselines regarding final performance and sample efficiency.
arXiv Detail & Related papers (2024-10-09T07:19:16Z) - ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling [44.276285521929424]
We introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states.
Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy.
Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies.
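As a minimal illustration of the state-based value-learning primitive named above, the sketch below runs tabular TD(0) on a hypothetical `env_step` transition function; each decentralized agent could run it independently on its own transition stream. The paper's objective-alignment and round-robin scheduling layers are not modeled here.

```python
from collections import defaultdict

def td0_state_values(env_step, start_state, episodes=500,
                     alpha=0.1, gamma=0.99):
    """Tabular TD(0) state-value learning (a minimal sketch).
    `env_step(s)` is a hypothetical transition function returning
    (next_state, reward, done)."""
    V = defaultdict(float)
    for _ in range(episodes):
        s = start_state
        for _ in range(200):        # cap episode length for the sketch
            s_next, r, done = env_step(s)
            # bootstrap toward the one-step TD target
            V[s] += alpha * (r + gamma * (0.0 if done else V[s_next]) - V[s])
            s = s_next
            if done:
                break
    return V
```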
arXiv Detail & Related papers (2024-04-05T09:39:47Z) - A Clustering Method with Graph Maximum Decoding Information [6.11503045313947]
We present a novel clustering method for maximizing decoding information within graph-based models, named CMDI.
CMDI incorporates two-dimensional structural information theory into the clustering process, consisting of two phases: graph structure extraction and graph partitioning.
Empirical evaluations on three real-world datasets demonstrate that CMDI outperforms classical baseline methods, exhibiting a superior decoding information ratio (DI-R).
These findings underscore the effectiveness of CMDI in enhancing decoding information quality and computational efficiency, positioning it as a valuable tool in graph-based clustering analyses.
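The first phase, graph structure extraction, can be sketched as building a sparse similarity graph from raw feature vectors; a common kNN construction is shown below (an assumption for illustration, since the paper's exact construction may differ). The partitioning phase would then search for the partition maximizing decoding information, e.g., the gap between one-dimensional entropy and the partition's two-dimensional structural entropy as in the earlier sketch.

```python
import numpy as np

def knn_graph(X: np.ndarray, k: int = 5) -> np.ndarray:
    """Phase-1 sketch: extract a sparse weighted similarity graph from raw
    vectors via a Gaussian kernel and k-nearest-neighbor sparsification
    (a common choice; CMDI's exact extraction may differ)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    sigma = np.median(d2)
    W = np.exp(-d2 / (sigma + 1e-12))                     # Gaussian similarities
    np.fill_diagonal(W, 0.0)
    # keep each node's k strongest neighbors, then symmetrize
    thresh = -np.sort(-W, axis=1)[:, k - 1:k]
    W = np.where(W >= thresh, W, 0.0)
    return np.maximum(W, W.T)
```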
arXiv Detail & Related papers (2024-03-18T05:18:19Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
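A generic token-level, entropy-regularized policy-gradient loss of the kind ETPO targets can be sketched as follows; this is an illustrative objective, not ETPO's exact token-level Bellman decomposition.

```python
import torch
import torch.nn.functional as F

def token_level_entropy_reg_loss(logits: torch.Tensor,
                                 actions: torch.Tensor,
                                 advantages: torch.Tensor,
                                 beta: float = 0.01) -> torch.Tensor:
    """Entropy-regularized policy-gradient loss applied per token.
    logits:     (T, V) token logits from the LLM policy
    actions:    (T,)   sampled token ids
    advantages: (T,)   per-token advantage estimates (assumed given)."""
    logp = F.log_softmax(logits, dim=-1)
    chosen = logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)   # (T,)
    entropy = -(logp.exp() * logp).sum(-1)                        # (T,)
    # maximize advantage-weighted log-likelihood plus an entropy bonus
    return -(chosen * advantages + beta * entropy).mean()
```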
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z) - Hierarchical Ensemble-Based Feature Selection for Time Series Forecasting [0.0]
We introduce a novel ensemble approach for feature selection based on hierarchical stacking for non-stationarity.
Our approach exploits the co-dependency between features using a hierarchical structure.
The effectiveness of the approach is demonstrated on synthetic and well-known real-life datasets.
arXiv Detail & Related papers (2023-10-26T16:40:09Z) - Hierarchical State Abstraction Based on Structural Information Principles [70.24495170921075]
We propose a novel mathematical Structural Information principles-based State Abstraction framework, namely SISA, from the information-theoretic perspective.
SISA is a general framework that can be flexibly integrated with different representation-learning objectives to further improve their performance.
arXiv Detail & Related papers (2023-04-24T11:06:52Z) - Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
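Of the three losses, the intra-modality one is the simplest to sketch: a feature separation penalty that discourages overlap between modality-shared and modality-specific embeddings. The squared cross-correlation form below is one common instantiation and is an assumption, not taken from the paper.

```python
import torch

def feature_separation_loss(shared: torch.Tensor,
                            specific: torch.Tensor) -> torch.Tensor:
    """Sketch of a deep feature separation loss: penalize correlation
    between modality-shared (N, D1) and modality-specific (N, D2)
    embeddings (one common instantiation; the paper's exact loss may differ)."""
    s = shared - shared.mean(0)
    p = specific - specific.mean(0)
    # squared entries of the (D1, D2) cross-correlation matrix
    return (s.T @ p / s.shape[0]).pow(2).mean()
```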
arXiv Detail & Related papers (2023-03-10T14:38:49Z) - A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z) - Macro-Action-Based Multi-Agent/Robot Deep Reinforcement Learning under Partial Observability [4.111899441919164]
State-of-the-art multi-agent reinforcement learning (MARL) methods have provided promising solutions to a variety of complex problems.
We first propose a group of value-based RL approaches for MacDec-POMDPs.
We formulate a set of macro-action-based policy gradient algorithms under the three training paradigms.
arXiv Detail & Related papers (2022-09-20T21:13:51Z) - Inference and dynamic decision-making for deteriorating systems with probabilistic dependencies through Bayesian networks and deep reinforcement learning [0.0]
We propose an efficient algorithmic framework for inference and decision-making under uncertainty for engineering systems exposed to deteriorating environments.
In terms of policy optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC) reinforcement learning approach.
Results demonstrate that DDMAC policies offer substantial benefits when compared to state-of-the-art approaches.
arXiv Detail & Related papers (2022-09-02T14:45:40Z) - Weakly Supervised Semantic Segmentation via Alternative Self-Dual Teaching [82.71578668091914]
This paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model.
We propose a novel alternative self-dual teaching (ASDT) mechanism to encourage high-quality knowledge interaction.
arXiv Detail & Related papers (2021-12-17T11:56:56Z) - The Gradient Convergence Bound of Federated Multi-Agent Reinforcement Learning with Efficient Communication [20.891460617583302]
The paper considers independent reinforcement learning (IRL) for collaborative decision-making in the paradigm of federated learning (FL).
FL generates excessive communication overheads between agents and a remote central server.
This paper proposes two advanced optimization schemes to improve the system's utility value.
arXiv Detail & Related papers (2021-03-24T07:21:43Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z)