Techno-economic optimization of a heat-pipe microreactor, part II: multi-objective optimization analysis
- URL: http://arxiv.org/abs/2601.20079v1
- Date: Tue, 27 Jan 2026 21:54:25 GMT
- Title: Techno-economic optimization of a heat-pipe microreactor, part II: multi-objective optimization analysis
- Authors: Paul Seurin, Dean Price
- Abstract summary: Heat-pipe microreactors (HPMRs) are well-suited for deployment in remote regions where access is limited and reliance on costly fuels is prevalent. We develop a framework that incorporates surrogate modeling and reinforcement learning (RL)-based optimization. Four key strategies consistently emerged for optimizing LCOE.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Heat-pipe microreactors (HPMRs) are compact and transportable nuclear power systems exhibiting inherent safety, well-suited for deployment in remote regions where access is limited and reliance on costly fossil fuels is prevalent. In prior work, we developed a design optimization framework that incorporates techno-economic considerations through surrogate modeling and reinforcement learning (RL)-based optimization, focusing solely on minimizing the levelized cost of electricity (LCOE) by using a bottom-up cost estimation approach. In this study, we extend that framework to a multi-objective optimization that uses the Pareto Envelope Augmented with Reinforcement Learning (PEARL) algorithm. The objectives include minimizing both the rod-integrated peaking factor ($F_{\Delta h}$) and LCOE -- subject to safety and operational constraints. We evaluate three cost scenarios: (1) high-cost axial and drum reflectors, (2) a low-cost axial reflector, and (3) low-cost axial and drum reflectors. Our findings indicate that reducing the solid moderator radius, pin pitch, and drum coating angle -- all while increasing the fuel height -- effectively lowers $F_{\Delta h}$. Across all three scenarios, four key strategies consistently emerged for optimizing LCOE: (1) minimizing the axial reflector contribution when costly, (2) reducing control drum reliance, (3) substituting expensive tri-structural isotropic (TRISO) fuel with axial reflector material priced at the level of graphite, and (4) maximizing fuel burnup. While PEARL demonstrates promise in navigating trade-offs across diverse design scenarios, discrepancies between surrogate model predictions and full-order simulations remain. Further improvements are anticipated through constraint relaxation and surrogate development, constituting an ongoing area of investigation.
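The trade-off described in the abstract, minimizing LCOE and $F_{\Delta h}$ at the same time, rests on the idea of Pareto dominance. The following is a minimal sketch of that idea only; it is not the PEARL algorithm, and the candidate values, the `Design` tuple layout, and the function names are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: (LCOE in $/MWh, peaking factor F_dh); both are minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [
        d for d in designs
        if not any(dominates(other, d) for other in designs if other != d)
    ]


# Made-up candidate designs, not results from the paper.
candidates = [
    (120.0, 1.40),
    (110.0, 1.55),
    (130.0, 1.35),
    (125.0, 1.50),
    (110.0, 1.60),
]
front = pareto_front(candidates)
print(front)
```

A design stays on the front when lowering one objective further would require accepting a higher value of the other, which is the kind of trade-off a multi-objective optimizer has to navigate.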
Related papers
- ProAct: Agentic Lookahead in Interactive Environments [56.50613398808361]
ProAct is a framework that enables agents to internalize accurate lookahead reasoning through a two-stage training paradigm. We introduce Grounded LookAhead Distillation (GLAD), where the agent undergoes supervised fine-tuning on trajectories derived from environment-based search. We also propose the Monte-Carlo Critic (MC-Critic), a plug-and-play auxiliary value estimator designed to enhance policy-gradient algorithms.
arXiv Detail & Related papers (2026-02-05T05:45:16Z) - How to Set the Learning Rate for Large-Scale Pre-training? [73.03133634525635]
We formalize this investigation into two distinct research paradigms: Fitting and Transfer. Within the Fitting Paradigm, we introduce a Scaling Law for search factor, effectively reducing the search complexity from O(n^3) to O(n*C_D*C_) via predictive modeling. We extend the principles of $\mu$Transfer to the Mixture of Experts (MoE) architecture, broadening its applicability to encompass model depth, weight decay, and token horizons.
arXiv Detail & Related papers (2026-01-08T15:55:13Z) - Techno-economic optimization of a heat-pipe microreactor, part I: theory and cost optimization [0.0]
Microreactors are well-suited for access-challenged remote areas where costly fuels dominate. They suffer from diseconomies of scale, and their financial viability remains unconvincing. We present a novel unifying geometric design optimization approach that accounts for techno-economic considerations.
arXiv Detail & Related papers (2025-12-17T23:28:13Z) - Momentum-constrained Hybrid Heuristic Trajectory Optimization Framework with Residual-enhanced DRL for Visually Impaired Scenarios [4.735413508037063]
This paper proposes a momentum-constrained hybrid trajectory optimization framework (MHHTOF) tailored for assistive navigation in visually impaired scenarios. It integrates trajectory sampling generation, optimization and evaluation with residual deep reinforcement learning (DRL). Experimental results demonstrate that the proposed LSTM-BResPPO achieves significantly faster convergence, attaining stable policy performance in approximately half the training required by PPO.
arXiv Detail & Related papers (2025-09-19T04:33:39Z) - Evaluation of Nuclear Microreactor Cost-competitiveness in Current Electricity Markets Considering Reactor Cost Uncertainties [2.2002244657481826]
This paper evaluates the cost competitiveness of microreactors in today's electricity markets, with a focus on uncertainties in reactor costs. A Genetic Algorithm (GA) is used to optimize key technical parameters, such as reactor fuel enrichment, tail enrichment, refueling interval, and discharge burnup. Results show that microreactors can remain cost-competitive, with LCOEs ranging from $48.21/MWh to $78.32/MWh when supported by the Production Tax Credit (PTC). Compared to conventional nuclear, coal, and renewable sources like offshore wind, hydro, and biomass, optimized microre
arXiv Detail & Related papers (2025-06-16T11:04:48Z) - Accelerating RL for LLM Reasoning with Optimal Advantage Regression [52.0792918455501]
We propose a novel two-stage policy optimization framework that directly approximates the optimal advantage function. $A^*$-PO achieves competitive performance across a wide range of mathematical reasoning benchmarks. It reduces training time by up to 2$\times$ and peak memory usage by over 30% compared to PPO, GRPO, and REBEL.
arXiv Detail & Related papers (2025-05-27T03:58:50Z) - Fourier Neural Operator based surrogates for $CO_2$ storage in realistic geologies [57.23978190717341]
We develop a Fourier Neural Operator (FNO) based model for real-time, high-resolution simulation of $CO_2$ plume migration. The model is trained on a comprehensive dataset generated from realistic subsurface parameters. We present various strategies for improving the reliability of predictions from the model, which is crucial while assessing actual geological sites.
arXiv Detail & Related papers (2025-03-14T02:58:24Z) - The Dual-use Dilemma in LLMs: Do Empowering Ethical Capacities Make a Degraded Utility? [54.18519360412294]
Large Language Models (LLMs) must balance between rejecting harmful requests for safety and accommodating legitimate ones for utility. This paper presents a Direct Preference Optimization (DPO) based alignment framework that achieves better overall performance. We analyze experimental results obtained from testing DeepSeek-R1 on our benchmark and reveal the critical ethical concerns raised by this highly acclaimed model.
arXiv Detail & Related papers (2025-01-20T06:35:01Z) - Scalable Online Exploration via Coverability [45.66375686120087]
Exploration is a major challenge in reinforcement learning, especially for high-dimensional domains that require function approximation.
We introduce a new objective, $L_1$-Coverage, which generalizes previous exploration schemes and supports three fundamental desiderata.
$L_1$-Coverage enables the first computationally efficient model-based and model-free algorithms for online (reward-free or reward-driven) reinforcement learning in MDPs with low coverability.
arXiv Detail & Related papers (2024-03-11T10:14:06Z) - A Modifiable Architectural Design for Commercial Greenhouses Energy Economic Dispatch Testbed [0.0]
Commercial greenhouses strive to minimize energy costs while addressing CO2 emissions.
This paper proposes an architectural design for an energy economic dispatch testbed for commercial greenhouses.
arXiv Detail & Related papers (2024-01-08T13:36:31Z) - Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while balancing exploration and exploitation automatically.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z) - A Reinforcement Learning Approach for Process Parameter Optimization in Additive Manufacturing [0.0]
The article introduces a Reinforcement Learning (RL) methodology transformed into an optimization problem in the realm of metal additive manufacturing.
An experimentally validated Eagar-Tsai formulation is used to emulate the Laser-Directed Energy Deposition environment.
The framework, therefore, provides a model-free approach to learning without any prior observations.
arXiv Detail & Related papers (2022-11-17T14:05:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.