Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
- URL: http://arxiv.org/abs/2507.19458v1
- Date: Fri, 25 Jul 2025 17:42:34 GMT
- Title: Hierarchical Deep Reinforcement Learning Framework for Multi-Year Asset Management Under Budget Constraints
- Authors: Amir Fard, Arnold X.-X. Yuan
- Abstract summary: This paper proposes a Hierarchical Deep Reinforcement Learning methodology specifically tailored to multi-year infrastructure planning. Our approach decomposes the problem into two hierarchical levels: a high-level Budget Planner allocating annual budgets within explicit feasibility bounds, and a low-level Maintenance Planner prioritizing assets within the allocated budget. By structurally separating macro-budget decisions from asset-level prioritization and integrating linear programming projection within a hierarchical Soft Actor-Critic framework, the method efficiently addresses exponential growth in the action space and ensures rigorous budget compliance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Budget planning and maintenance optimization are crucial for infrastructure asset management, ensuring cost-effectiveness and sustainability. However, the complexity arising from combinatorial action spaces, diverse asset deterioration, stringent budget constraints, and environmental uncertainty significantly limits existing methods' scalability. This paper proposes a Hierarchical Deep Reinforcement Learning methodology specifically tailored to multi-year infrastructure planning. Our approach decomposes the problem into two hierarchical levels: a high-level Budget Planner allocating annual budgets within explicit feasibility bounds, and a low-level Maintenance Planner prioritizing assets within the allocated budget. By structurally separating macro-budget decisions from asset-level prioritization and integrating linear programming projection within a hierarchical Soft Actor-Critic framework, the method efficiently addresses exponential growth in the action space and ensures rigorous budget compliance. A case study evaluating sewer networks of varying sizes (10, 15, and 20 sewersheds) illustrates the effectiveness of the proposed approach. Compared to conventional Deep Q-Learning and enhanced genetic algorithms, our methodology converges more rapidly, scales effectively, and consistently delivers near-optimal solutions even as network size grows.
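The budget-compliance idea described in the abstract can be glossed with a toy sketch. The snippet below is not the paper's method: it solves the LP relaxation of a single-year budget-constrained selection (maximize priority subject to cost and budget) with the greedy fractional-knapsack rule, which is optimal for this relaxed form; `project_onto_budget` and its inputs are hypothetical names.

```python
def project_onto_budget(scores, costs, budget):
    """Solve the LP relaxation of budget-constrained asset selection:
    maximize sum(s_i * x_i)  s.t.  sum(c_i * x_i) <= budget, 0 <= x_i <= 1.
    Greedy by benefit-per-cost ratio is optimal for this relaxation."""
    x = [0.0] * len(scores)
    remaining = budget
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / costs[i], reverse=True)
    for i in order:
        if remaining <= 0:
            break
        # Take as much of asset i as the remaining budget allows.
        x[i] = min(1.0, remaining / costs[i])
        remaining -= x[i] * costs[i]
    return x

# Three assets, budget 5: the two best-ratio assets fit exactly.
print(project_onto_budget([9, 6, 4], [3, 2, 4], 5))  # → [1.0, 1.0, 0.0]
```

Swapping the greedy rule for a general LP solver (e.g. `scipy.optimize.linprog`) would handle additional coupled constraints, which is closer in spirit to the projection the abstract describes.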
Related papers
- Multi-Year Maintenance Planning for Large-Scale Infrastructure Systems: A Novel Network Deep Q-Learning Approach [0.0]
This paper presents a novel deep reinforcement learning framework that optimizes asset management strategies for large infrastructure networks. By decomposing the network-level Markov Decision Process (MDP) into individual asset-level MDPs, the proposed framework reduces computational complexity, improves learning efficiency, and enhances scalability. The framework directly incorporates annual budget constraints through a budget allocation mechanism, ensuring maintenance plans are both optimal and cost-effective.
arXiv Detail & Related papers (2025-07-24T18:27:31Z) - Hierarchical Budget Policy Optimization for Adaptive Reasoning [49.621779447691665]
We present Hierarchical Budget Policy Optimization (HBPO), a reinforcement learning framework that enables models to learn problem-specific reasoning depths without sacrificing capability. HBPO addresses the challenge of exploration space collapse in efficiency-oriented training, where penalties on long output length systematically bias models away from necessary long reasoning paths. Extensive experiments demonstrate that HBPO reduces average token usage by up to 60.6% while improving accuracy by 3.14% across four reasoning benchmarks.
arXiv Detail & Related papers (2025-07-21T17:52:34Z) - Decomposability-Guaranteed Cooperative Coevolution for Large-Scale Itinerary Planning [6.565536870180592]
Large-scale itinerary planning is a variant of the traveling salesman problem. This paper analyzes the decomposability of large-scale itinerary planning. We propose a novel multi-objective cooperative coevolutionary algorithm for large-scale itinerary planning.
arXiv Detail & Related papers (2025-06-06T14:31:57Z) - On Sequential Fault-Intolerant Process Planning [60.66853798340345]
We propose and study a planning problem we call Sequential Fault-Intolerant Process Planning (SFIPP). SFIPP captures a reward structure common in many sequential multi-stage decision problems where the planning is deemed successful only if all stages succeed. We design provably tight online algorithms for settings in which we need to pick between different actions with unknown success chances at each stage.
arXiv Detail & Related papers (2025-02-07T15:20:35Z) - Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing. We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z) - Hierarchical Upper Confidence Bounds for Constrained Online Learning [4.8951183832371]
We introduce the hierarchical constrained bandits (HCB) framework, which extends the contextual bandit problem to incorporate hierarchical decision structures and multi-level constraints.
Our theoretical analysis establishes sublinear regret bounds for HC-UCB and provides high-probability guarantees for constraint satisfaction at all hierarchical levels.
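The upper-confidence-bound machinery behind HC-UCB can be illustrated with the standard UCB index — a generic sketch, not the HC-UCB algorithm itself, which layers hierarchical decision structure and constraints on top; the names here are illustrative.

```python
import math

def ucb_index(mean_reward, pulls, total_pulls, c=2.0):
    """Optimism in the face of uncertainty: empirical mean plus a
    confidence radius that shrinks as the arm accumulates pulls.
    Arms that have never been pulled get infinite priority."""
    if pulls == 0:
        return float("inf")
    return mean_reward + math.sqrt(c * math.log(total_pulls) / pulls)

# The arm with the largest index is played next; here the
# never-pulled arm 2 wins despite its zero empirical mean.
means, pulls, t = [0.5, 0.4, 0.0], [4, 10, 0], 14
best = max(range(3), key=lambda a: ucb_index(means[a], pulls[a], t))
print(best)  # → 2
```

The regret analysis mentioned in the snippet hinges on this confidence radius holding with high probability at every level of the hierarchy.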
arXiv Detail & Related papers (2024-10-22T17:41:14Z) - A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [66.61399765513383]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints.
We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z) - Scalable Online Exploration via Coverability [45.66375686120087]
Exploration is a major challenge in reinforcement learning, especially for high-dimensional domains that require function approximation.
We introduce a new objective, $L_1$-Coverage, which generalizes previous exploration schemes and supports three fundamental desiderata.
$L_1$-Coverage enables the first computationally efficient model-based and model-free algorithms for online (reward-free or reward-driven) reinforcement learning in MDPs with low coverability.
arXiv Detail & Related papers (2024-03-11T10:14:06Z) - Welfare Maximization Algorithm for Solving Budget-Constrained Multi-Component POMDPs [2.007262412327553]
This paper presents an algorithm to find the optimal policy for a multi-component budget-constrained POMDP.
We show that the proposed algorithm vastly outperforms the policy currently used in practice.
arXiv Detail & Related papers (2023-03-18T01:43:47Z) - Hierarchical Constrained Stochastic Shortest Path Planning via Cost Budget Allocation [16.150627252426936]
We propose a hierarchical constrained shortest path problem (HC-SSP) that meets those two crucial requirements in a single framework.
The resulting problem has high complexity, making it difficult to find an optimal solution quickly.
We present an algorithm that iteratively allocates cost budget to lower-level planning problems via a branch-and-bound scheme, finding a feasible solution fast and incrementally updating the incumbent solution.
arXiv Detail & Related papers (2022-05-11T01:25:38Z) - Coverage and Capacity Optimization in STAR-RISs Assisted Networks: A Machine Learning Approach [102.00221938474344]
A novel model is proposed for the coverage and capacity optimization of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) assisted networks.
The core of the approach is a loss-function-based update strategy, which computes weights for the coverage and capacity loss functions via a min-norm solver at each update.
The numerical results demonstrate that the investigated update strategy outperforms fixed-weight multi-objective (MO) algorithms.
arXiv Detail & Related papers (2022-04-13T13:52:22Z) - FactorizeNet: Progressive Depth Factorization for Efficient Network Architecture Exploration Under Quantization Constraints [93.4221402881609]
We introduce a progressive depth factorization strategy for efficient CNN architecture exploration under quantization constraints.
By algorithmically increasing the granularity of depth factorization in a progressive manner, the proposed strategy enables a fine-grained, low-level analysis of layer-wise distributions.
Such a progressive depth factorization strategy also enables efficient identification of the optimal depth-factorized macroarchitecture design.
arXiv Detail & Related papers (2020-11-30T07:12:26Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.