Nuclear Microreactor Control with Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2504.00156v1
- Date: Mon, 31 Mar 2025 19:11:19 GMT
- Title: Nuclear Microreactor Control with Deep Reinforcement Learning
- Authors: Leo Tunkle, Kamal Abdulraheem, Linyu Lin, Majdi I. Radaideh
- Abstract summary: This study explores the application of deep reinforcement learning (RL) for real-time drum control in microreactors. RL controllers can achieve load-following performance similar or superior to that of traditional proportional-integral-derivative (PID) controllers.
- Score: 0.40498500266986387
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The economic feasibility of nuclear microreactors will depend on minimizing operating costs through advancements in autonomous control, especially when these microreactors operate alongside other types of energy systems (e.g., renewable energy). This study explores the application of deep reinforcement learning (RL) for real-time drum control in microreactors, evaluating performance across load-following scenarios. Leveraging a point kinetics model with thermal and xenon feedback, we first establish a baseline using a single-output RL agent and compare it against a traditional proportional-integral-derivative (PID) controller. This study demonstrates that RL controllers, including both single-agent and multi-agent RL (MARL) frameworks, can achieve load-following performance similar or superior to that of traditional PID control across a range of scenarios. In short transients, the RL agent reduced the tracking error rate relative to PID. Over extended 300-minute load-following scenarios in which xenon feedback becomes a dominant factor, PID maintained better accuracy, but RL still remained within a 1% error margin despite being trained only on short-duration scenarios. This highlights RL's strong ability to generalize and extrapolate to longer, more complex transients, affording substantial reductions in training cost and overfitting. Furthermore, when control was extended to multiple drums, MARL enabled independent drum control while maintaining reactor symmetry constraints without sacrificing performance, an objective that standard single-agent RL could not learn. We also found that, as increasing levels of Gaussian noise were added to the power measurements, the RL controllers maintained lower error rates than PID, and did so with less control effort.
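For readers who want a concrete picture of the baseline setup described above, the sketch below is a minimal illustration, not the paper's model: a one-delayed-group point kinetics solver with crude lumped thermal and xenon terms, driven by a PID controller that sets a control-drum angle to follow a demanded power. All constants, the drum-worth mapping, the feedback terms, and the demand profile are placeholder assumptions.

```python
# Minimal illustrative sketch (placeholder physics, NOT the paper's model):
# one-delayed-group point kinetics with lumped thermal and xenon terms,
# plus a PID controller that sets a control-drum angle to track demanded power.

# --- placeholder constants (assumptions, not values from the paper) ---
BETA = 0.0065        # delayed-neutron fraction
LAM = 0.08           # lumped precursor decay constant [1/s]
GEN_TIME = 1.0e-4    # neutron generation time [s]
ALPHA_T = -2.0e-5    # temperature reactivity coefficient [dk/k per unit temp]
XE_WORTH = 2.0e-3    # crude xenon reactivity scale (toy proxy, not the real chain)
DRUM_WORTH = 1.0e-4  # reactivity per degree of drum rotation (assumed)
KP, KI, KD = 200.0, 20.0, 5.0  # hand-tuned placeholder PID gains

def demand(t):
    """Placeholder load-following profile: ramp from 100% to 80% power."""
    return 1.0 if t < 200.0 else max(0.8, 1.0 - 0.001 * (t - 200.0))

def simulate(t_end=600.0, dt=1.0e-3):
    power = 1.0                             # normalized power
    prec = BETA * power / (LAM * GEN_TIME)  # precursor concentration at equilibrium
    temp, xenon = 0.0, 1.0                  # lumped feedback states
    integral, prev_err = 0.0, 0.0
    drum = 0.0                              # drum angle [deg]

    steps = int(t_end / dt)
    for i in range(steps):
        t = i * dt
        err = demand(t) - power
        integral += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err

        # PID sets the drum angle; clamp to a plausible travel range
        drum = max(-45.0, min(45.0, KP * err + KI * integral + KD * deriv))

        # total reactivity: drums + thermal feedback + toy xenon penalty
        rho = DRUM_WORTH * drum + ALPHA_T * temp - XE_WORTH * (xenon - 1.0)

        # explicit-Euler update of point kinetics and the toy feedback states
        d_power = ((rho - BETA) / GEN_TIME) * power + LAM * prec
        d_prec = (BETA / GEN_TIME) * power - LAM * prec
        d_temp = 0.1 * (power - 1.0) - 0.01 * temp    # lumped thermal lag
        d_xe = 0.001 * (power - xenon)                # first-order xenon proxy

        power += d_power * dt
        prec += d_prec * dt
        temp += d_temp * dt
        xenon += d_xe * dt

    return power, drum

if __name__ == "__main__":
    final_power, final_drum = simulate()
    print(f"final power: {final_power:.3f}  final drum angle: {final_drum:.2f} deg")
```

An RL controller would replace the PID block with a learned policy that maps the same tracking error (plus any additional state) to drum commands.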
Related papers
- Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime [5.138559709801885]
We study the effectiveness of Reinforcement Learning (RL) for reducing convective heat transfer under increasing turbulence.
RL agents trained via single-agent Proximal Policy Optimization (PPO) are compared to linear proportional-derivative (PD) controllers.
The RL agents reduced convection, measured by the Nusselt Number, by up to 33% in moderately turbulent systems and 10% in highly turbulent settings.
arXiv Detail & Related papers (2025-04-16T11:51:59Z)
- FitLight: Federated Imitation Learning for Plug-and-Play Autonomous Traffic Signal Control [33.547772623142414]
Reinforcement Learning (RL)-based Traffic Signal Control (TSC) methods raise some serious issues such as high learning cost and poor generalizability.
We propose a novel Federated Imitation Learning (FIL)-based framework for multi-intersection TSC, named FitLight.
FitLight allows real-time imitation learning and seamless transition to reinforcement learning.
arXiv Detail & Related papers (2025-02-17T15:48:46Z)
- Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning [0.3562485774739681]
We introduce the use of reinforcement learning (RL) algorithms for intelligent control in nuclear microreactors.
The RL agent is trained using proximal policy optimization (PPO) and advantage actor-critic (A2C).
Results demonstrate the excellent performance of PPO in identifying optimal drum positions.
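As a rough illustration of how such a PPO agent might be wired up, the snippet below uses stable-baselines3 purely as a stand-in; the toy environment, its first-order "physics", and the hyperparameters are assumptions and do not come from the paper.

```python
# Hypothetical wiring of a PPO drum-control agent with stable-baselines3.
# The environment is a toy stand-in (first-order power response to drum angle),
# not a point-kinetics model.

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class ToyDrumEnv(gym.Env):
    """Observe (power tracking error, drum angle); act on drum rotation rate."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.power, self.drum, self.t = 1.0, 0.0, 0
        self.demand = float(self.np_random.uniform(0.8, 1.0))  # random power setpoint
        return self._obs(), {}

    def _obs(self):
        return np.array([self.demand - self.power, self.drum / 45.0], dtype=np.float32)

    def step(self, action):
        self.drum = float(np.clip(self.drum + 0.5 * float(action[0]), -45.0, 45.0))
        # toy first-order power response to drum position (placeholder dynamics)
        self.power += 0.05 * (1.0 + 0.02 * self.drum - self.power)
        self.t += 1
        reward = -abs(self.demand - self.power)   # penalize tracking error
        return self._obs(), reward, False, self.t >= 200, {}

model = PPO("MlpPolicy", ToyDrumEnv(), verbose=0)
model.learn(total_timesteps=50_000)   # short run; real training would be longer
```

A2C could be swapped in the same way, since stable-baselines3 ships an A2C implementation with the same interface.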
arXiv Detail & Related papers (2024-06-22T20:14:56Z)
- Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving [63.155562267383864]
Deep reinforcement learning (DRL) has shown remarkable success in complex autonomous driving scenarios.
DRL models inevitably incur high memory consumption and computational cost, which hinders their wide deployment on resource-limited autonomous driving devices.
We introduce a novel dynamic structured pruning approach that gradually removes a DRL model's unimportant neurons during the training stage.
arXiv Detail & Related papers (2024-02-07T09:00:30Z)
- Deployable Reinforcement Learning with Variable Control Rate [14.838483990647697]
We propose a variant of Reinforcement Learning (RL) with variable control rate.
In this approach, the policy decides the action the agent should take as well as the duration of the time step associated with that action.
We show the efficacy of the proposed SEAC algorithm through a proof-of-concept simulation driving an agent with Newtonian kinematics.
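Purely as an illustration of the idea above, a variable-control-rate action space can be sketched as a pair of (control value, hold duration); the bounds below are placeholders and are not taken from the SEAC formulation.

```python
# Illustrative only: an action space in which the policy chooses both the
# control value and how long to hold it, instead of acting at a fixed rate.
# Bounds are placeholders, not the ones used by SEAC.
import numpy as np
from gymnasium import spaces

action_space = spaces.Box(
    low=np.array([-1.0, 0.02], dtype=np.float32),   # [control, hold time in s]
    high=np.array([1.0, 0.50], dtype=np.float32),
)

control, hold_time = action_space.sample()
# The environment would integrate its dynamics for `hold_time` seconds while
# applying `control`, rather than advancing by a fixed time step.
print(f"apply control {control:+.2f} for {hold_time:.3f} s")
```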
arXiv Detail & Related papers (2024-01-17T15:40:11Z)
- Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e [3.860979702631594]
We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab).
Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an automated controller capable of providing real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale.
arXiv Detail & Related papers (2023-12-28T21:35:20Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model [119.65409513119963]
We introduce a new parameterization of the reward model in RLHF that enables extraction of the corresponding optimal policy in closed form.
The resulting algorithm, which we call Direct Preference Optimization (DPO), is stable, performant, and computationally lightweight.
Our experiments show that DPO can fine-tune LMs to align with human preferences as well as or better than existing methods.
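For context, the closed-form reparameterization referred to above yields the standard DPO objective: a pairwise loss over a preferred completion y_w and a dispreferred completion y_l for a prompt x, with reference policy pi_ref and scaling coefficient beta.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\ \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```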
arXiv Detail & Related papers (2023-05-29T17:57:46Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method to solve this benchmark, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Skip Training for Multi-Agent Reinforcement Learning Controller for Industrial Wave Energy Converters [94.84709449845352]
Recent Wave Energy Converters (WEC) are equipped with multiple legs and generators to maximize energy generation.
Traditional controllers have shown limitations in capturing complex wave patterns, and controllers must efficiently maximize energy capture.
This paper introduces a Multi-Agent Reinforcement Learning (MARL) controller, which outperforms the traditionally used spring-damper controller.
arXiv Detail & Related papers (2022-09-13T00:20:31Z)
- Low Emission Building Control with Zero-Shot Reinforcement Learning [70.70479436076238]
Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency.
We show it is possible to obtain emission-reducing policies without a priori knowledge, a paradigm we call zero-shot building control.
arXiv Detail & Related papers (2022-08-12T17:13:25Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.