Numerical Demonstration of Multiple Actuator Constraint Enforcement Algorithm for a Molten Salt Loop
- URL: http://arxiv.org/abs/2202.02094v1
- Date: Fri, 4 Feb 2022 11:58:40 GMT
- Title: Numerical Demonstration of Multiple Actuator Constraint Enforcement Algorithm for a Molten Salt Loop
- Authors: Akshay J. Dave, Haoyu Wang, Roberto Ponciroli, Richard B. Vilim
- Abstract summary: We will demonstrate an interpretable and adaptable data-driven machine learning approach to autonomous control of a molten salt loop.
To address adaptability, a control algorithm will be utilized to modify actuator setpoints while enforcing constant and time-dependent constraints.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To advance the paradigm of autonomous operation for nuclear power plants, a
data-driven machine learning approach to control is sought. Autonomous
operation for next-generation reactor designs is anticipated to bolster safety
and improve economics. However, any algorithms that are utilized need to be
interpretable, adaptable, and robust.
In this work, we focus on the specific problem of optimal control during
autonomous operation. We will demonstrate an interpretable and adaptable
data-driven machine learning approach to autonomous control of a molten salt
loop. To address interpretability, we utilize a data-driven algorithm to
identify system dynamics in state-space representation. To address
adaptability, a control algorithm will be utilized to modify actuator setpoints
while enforcing constant and time-dependent constraints. Robustness is not
addressed in this work and is left to future work. To demonstrate the
approach, we designed a numerical experiment requiring intervention to enforce
constraints during a load-follow-type transient.
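The abstract names two concrete ingredients: data-driven identification of state-space dynamics and modification of actuator setpoints under constraints. The sketch below illustrates both under assumed details (a DMDc-style least-squares fit and simple bound projection); the paper's actual identification and constraint-enforcement algorithms may differ.

```python
import numpy as np

def identify_state_space(X, U):
    """Fit x[k+1] ~= A x[k] + B u[k] by least squares (DMDc-style).

    X: (n_states, T) snapshot matrix; U: (n_inputs, T-1) input matrix.
    """
    X0, X1 = X[:, :-1], X[:, 1:]
    G = np.vstack([X0, U])            # stacked state/input regressors
    AB = X1 @ np.linalg.pinv(G)       # [A | B] via the pseudoinverse
    n = X.shape[0]
    return AB[:, :n], AB[:, n:]

def enforce_setpoint_constraints(u, k, u_min, u_max):
    """Project a requested setpoint onto constant or time-dependent bounds."""
    lo = u_min(k) if callable(u_min) else u_min
    hi = u_max(k) if callable(u_max) else u_max
    return np.clip(u, lo, hi)

# Identify dynamics from synthetic data, then constrain a setpoint request.
rng = np.random.default_rng(0)
A_true = np.array([[0.95, 0.10], [0.00, 0.90]])
B_true = np.array([[0.0], [0.5]])
T = 200
X, U = np.zeros((2, T)), rng.normal(size=(1, T - 1))
for k in range(T - 1):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]
A_hat, B_hat = identify_state_space(X, U)
u_cmd = enforce_setpoint_constraints(
    np.array([1.4]), k=50, u_min=-1.0, u_max=lambda k: 1.0 - 1e-3 * k)
```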
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
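A toy, closed-form illustration of the analytic-policy-gradient idea: when the simulator is differentiable, the rollout loss can be differentiated exactly with respect to the policy parameters. The scalar linear system and gain parametrization below are illustrative assumptions, not the paper's simulator or policy class.

```python
def rollout_loss_and_grad(k_gain, x0=1.0, a=1.0, b=0.1, horizon=20):
    """Analytic policy gradient for x[t+1] = a*x[t] + b*u[t], u[t] = -k*x[t].

    Then x[t] = (a - b*k)**t * x0 and loss = sum_t x[t]**2, so the exact
    gradient d(loss)/dk follows from the chain rule through the dynamics.
    """
    m = a - b * k_gain
    loss, grad = 0.0, 0.0
    for t in range(1, horizon + 1):
        xt = m ** t * x0
        loss += xt ** 2
        grad += 2.0 * xt * (t * m ** (t - 1)) * (-b) * x0
    return loss, grad

# Gradient descent on the feedback gain using the exact rollout gradient.
k_gain = 0.0
for _ in range(200):
    loss, g = rollout_loss_and_grad(k_gain)
    k_gain -= 0.05 * g
```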
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Scenario-based Thermal Management Parametrization Through Deep Reinforcement Learning
This paper introduces a learning-based tuning approach for thermal management functions.
Our deep reinforcement learning agent processes the tuning task context and incorporates an image-based interpretation of embedded parameter sets.
We demonstrate its applicability to a valve controller parametrization task and verify it in real-world vehicle testing.
arXiv Detail & Related papers (2024-08-04T13:19:45Z)
- Distributed Robust Learning based Formation Control of Mobile Robots based on Bioinspired Neural Dynamics
We first introduce a distributed estimator using a variable structure and cascaded design technique, eliminating the need for derivative information to improve real-time performance.
Then, a kinematic tracking control method is developed utilizing a bioinspired neural dynamic-based approach aimed at providing smooth control inputs and effectively resolving the speed jump issue.
To address the challenges for robots operating with completely unknown dynamics and disturbances, a learning-based robust dynamic controller is developed.
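The "bioinspired neural dynamics" used to smooth control inputs in this line of work is typically a shunting (Grossberg-type) model, which bounds its output and filters step changes. The sketch below assumes that formulation with illustrative parameters; the paper's exact equations may differ.

```python
def shunting_step(x, e, dt=0.01, A=10.0, B=1.0, D=1.0):
    """One Euler step of dx/dt = -A*x + (B - x)*[e]+ - (D + x)*[-e]+.

    The state x remains bounded in [-D, B] and responds smoothly to step
    changes in the input e (e.g., a tracking error), avoiding speed jumps.
    """
    dx = -A * x + (B - x) * max(e, 0.0) - (D + x) * max(-e, 0.0)
    return x + dt * dx

# A sudden error jump at step 100 yields a smooth, bounded command.
x, commands = 0.0, []
for k in range(300):
    e = 5.0 if k >= 100 else 0.0
    x = shunting_step(x, e)
    commands.append(x)
```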
arXiv Detail & Related papers (2024-03-23T04:36:12Z)
- A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants
Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks.
We propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control.
Our approach achieves the smallest distance of violation and violation rate in a load-follow maneuver for an advanced Nuclear Power Plant design.
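One common way to encode a chance constraint in policy optimization is a Lagrangian penalty on the empirical violation probability. The sketch below shows that pattern with assumed names and thresholds; it is not the paper's exact PPO-based formulation.

```python
import numpy as np

def chance_constrained_objective(returns, violations, lam, alpha=0.05):
    """Penalized objective: E[return] - lam * max(0, P(violation) - alpha)."""
    p_viol = float(np.mean(violations))
    return np.mean(returns) - lam * max(0.0, p_viol - alpha), p_viol

def update_multiplier(lam, p_viol, alpha=0.05, lr=0.1):
    """Dual ascent: increase lam while the chance constraint is violated."""
    return max(0.0, lam + lr * (p_viol - alpha))

# Synthetic rollout statistics: 12% of trajectories violate a limit.
rng = np.random.default_rng(1)
returns = rng.normal(10.0, 2.0, size=64)
violations = rng.random(64) < 0.12
objective, p = chance_constrained_objective(returns, violations, lam=5.0)
lam = update_multiplier(5.0, p)
```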
arXiv Detail & Related papers (2024-01-23T17:52:49Z)
- Active Uncertainty Reduction for Safe and Efficient Interaction Planning: A Shielding-Aware Dual Control Approach
We present a novel algorithmic approach to enable active uncertainty reduction for interactive motion planning based on the implicit dual control paradigm.
Our approach relies on sampling-based approximation of dynamic programming, leading to a model predictive control problem that can be readily solved by real-time gradient-based optimization methods.
arXiv Detail & Related papers (2023-02-01T01:34:48Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
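The state-conservative idea can be sketched as evaluating the worst case over small state perturbations around the observed state. The random-sampling search below is a crude stand-in for the paper's method, with assumed names and parameters.

```python
import numpy as np

def state_conservative_q(q_fn, state, action, eps=0.05, n_samples=16, seed=0):
    """Worst-case Q over sampled perturbations with |delta_i| <= eps."""
    rng = np.random.default_rng(seed)
    deltas = rng.uniform(-eps, eps, size=(n_samples,) + state.shape)
    return min(q_fn(state + d, action) for d in deltas)

# Toy critic preferring states near the origin and small actions.
q_fn = lambda s, a: -np.sum(s ** 2) - 0.1 * np.sum(a ** 2)
worst_q = state_conservative_q(q_fn, np.array([0.2, -0.1]), np.array([0.05]))
```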
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- Robust Value Iteration for Continuous Control Tasks
When transferring a control policy from simulation to a physical system, the policy needs to be robust to variations in the dynamics to perform well.
We present Robust Fitted Value Iteration, which uses dynamic programming to compute the optimal value function on the compact state domain.
We show that robust value iteration is more robust than a deep reinforcement learning algorithm and the non-robust version of the algorithm.
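The core of a robust value-iteration scheme is a min-max Bellman backup: each update takes the worst case over a set of admissible dynamics perturbations. The grid, dynamics, and perturbation set below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

states = np.linspace(-1.0, 1.0, 21)
actions = np.array([-0.1, 0.0, 0.1])
perturbs = np.array([-0.02, 0.0, 0.02])   # admissible dynamics offsets
gamma, V = 0.95, np.zeros_like(states)

def nearest(x):
    """Index of the grid point closest to x."""
    return int(np.abs(states - x).argmin())

for _ in range(100):
    V_new = np.empty_like(V)
    for i, s in enumerate(states):
        q_values = []
        for a in actions:
            # Robust backup: worst case over the perturbation set.
            q = min(-s ** 2 + gamma * V[nearest(np.clip(s + a + w, -1, 1))]
                    for w in perturbs)
            q_values.append(q)
        V_new[i] = max(q_values)
    V = V_new
```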
arXiv Detail & Related papers (2021-05-25T19:48:35Z)
- Strictly Batch Imitation Learning by Energy-based Distribution Matching
Consider learning a policy purely on the basis of demonstrated behavior -- that is, with no access to reinforcement signals, no knowledge of transition dynamics, and no further interaction with the environment.
One solution is simply to retrofit existing algorithms for apprenticeship learning to work in the offline setting.
But such an approach leans heavily on off-policy evaluation or offline model estimation, and can be indirect and inefficient.
We argue that a good solution should be able to explicitly parameterize a policy, implicitly learn from rollout dynamics, and operate in an entirely offline fashion.
arXiv Detail & Related papers (2020-06-25T03:27:59Z)
- Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and the inability to account for input constraints.
In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques.
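A common architecture matching this description keeps the nominal input-output linearizing controller and adds a learned correction, with saturation to respect input constraints. The split below is an assumed illustration, not the paper's exact design.

```python
import numpy as np

def io_linearizing_control(x, x_ref, model):
    """Nominal u = (v - f(x)) / g(x) with v a PD law on the output error."""
    v = -model["kp"] * (x[0] - x_ref) - model["kd"] * x[1]
    return (v - model["f"](x)) / model["g"](x)

def control_with_residual(x, x_ref, model, residual_policy, u_max=2.0):
    """Nominal controller plus a learned correction, saturated to limits."""
    u = io_linearizing_control(x, x_ref, model) + residual_policy(x)
    return np.clip(u, -u_max, u_max)

model = {"kp": 4.0, "kd": 2.0,
         "f": lambda x: -0.5 * x[1],   # imperfect drift estimate
         "g": lambda x: 1.0}
residual_policy = lambda x: 0.0        # placeholder for the RL policy
u = control_with_residual(np.array([0.3, -0.1]), 0.0, model, residual_policy)
```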
arXiv Detail & Related papers (2020-04-15T18:15:49Z)
- Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system.
Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed.
arXiv Detail & Related papers (2020-04-13T17:49:29Z)
- Online Constrained Model-based Reinforcement Learning
A key requirement is the ability to handle continuous state and action spaces while remaining within a limited time and resource budget.
We propose a model-based approach that combines Gaussian Process regression and Receding Horizon Control.
We test our approach on a cart pole swing-up environment and demonstrate the benefits of online learning on an autonomous racing task.
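A minimal sketch of the named combination: a Gaussian Process dynamics model (scikit-learn here) queried inside a receding-horizon, random-shooting controller. The kernel, horizon, and cost are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Fit a GP one-step dynamics model x[k+1] = f(x[k], u[k]) from logged data.
rng = np.random.default_rng(3)
XU = rng.uniform(-1, 1, size=(100, 2))             # (state, action) pairs
X_next = 0.9 * XU[:, 0] + 0.2 * XU[:, 1] + 0.01 * rng.normal(size=100)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(XU, X_next)

def receding_horizon_action(x0, horizon=5, n_candidates=64):
    """Return the first action of the best random candidate sequence."""
    best_cost, best_u0 = np.inf, 0.0
    for _ in range(n_candidates):
        u_seq = rng.uniform(-1, 1, size=horizon)
        x, cost = x0, 0.0
        for u in u_seq:
            x = gp.predict(np.array([[x, u]]))[0]  # GP one-step prediction
            cost += x ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_cost, best_u0 = cost, u_seq[0]
    return best_u0

u0 = receding_horizon_action(0.8)
```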
arXiv Detail & Related papers (2020-04-07T15:51:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.