Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
- URL: http://arxiv.org/abs/2508.16474v1
- Date: Fri, 22 Aug 2025 15:42:03 GMT
- Title: Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
- Authors: Austin Braniff, Yuhe Tian
- Abstract summary: This work presents a novel reinforcement learning (RL) algorithm based on Y-wise Affine Neural Networks (YANNs). YANNs provide an interpretable neural network which can represent known piecewise affine functions of arbitrary input and output dimensions. The YANN-RL algorithm is demonstrated on a clipped pendulum and a safety-critical chemical-reactive system.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a novel reinforcement learning (RL) algorithm based on Y-wise Affine Neural Networks (YANNs). YANNs provide an interpretable neural network which can exactly represent known piecewise affine functions of arbitrary input and output dimensions defined on any number of polytopic subdomains. One representative application of YANNs is to reformulate explicit solutions of multi-parametric linear model predictive control. Building on this, we propose the use of YANNs to initialize RL actor and critic networks, which enables the resulting YANN-RL control algorithm to start with the confidence of linear optimal control. The YANN-actor is initialized by representing the multi-parametric control solutions obtained via offline computation using an approximated linear system model. The YANN-critic represents the explicit form of the state-action value function for the linear system and the reward function as the objective in an optimal control problem (OCP). Additional network layers are injected to extend YANNs for nonlinear expressions, which can be trained online by directly interacting with the true complex nonlinear system. In this way, both the policy and state-value functions exactly represent a linear OCP initially and are able to eventually learn the solution of a general nonlinear OCP. Continuous policy improvement is also implemented to provide heuristic confidence that the linear OCP solution serves as an effective lower bound on the performance of the RL policy. The YANN-RL algorithm is demonstrated on a clipped pendulum and a safety-critical chemical-reactive system. Our results show that YANN-RL significantly outperforms a modern RL algorithm based on deep deterministic policy gradient, especially when considering safety constraints.
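The abstract describes the YANN-actor as starting from a piecewise affine control law defined on polytopic subdomains. As a rough illustration of what such a law looks like when evaluated directly, here is a minimal sketch in plain Python. It is not the YANN architecture or the authors' code: the `Region` class, the `pwa_control` function, and the 1-D saturated controller in `REGIONS` are hypothetical names and values chosen only for this example.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain-Python matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


@dataclass
class Region:
    """One polytopic subdomain {x : A x <= b} with its affine law u = F x + g."""
    A: Matrix
    b: Vector
    F: Matrix
    g: Vector

    def contains(self, x: Vector, tol: float = 1e-9) -> bool:
        # x is inside the polytope when every half-space inequality holds.
        return all(lhs <= rhs + tol for lhs, rhs in zip(matvec(self.A, x), self.b))

    def control(self, x: Vector) -> Vector:
        return [fx + gi for fx, gi in zip(matvec(self.F, x), self.g)]


def pwa_control(regions: List[Region], x: Vector) -> Vector:
    """Evaluate the piecewise affine law by locating the region that contains x."""
    for region in regions:
        if region.contains(x):
            return region.control(x)
    raise ValueError("x lies outside every polytopic region")


# Hypothetical 1-D example (an assumption, not from the paper):
# the linear feedback u = -x saturated to the interval [-1, 1].
REGIONS = [
    Region(A=[[1.0]], b=[-1.0], F=[[0.0]], g=[1.0]),                # x <= -1
    Region(A=[[1.0], [-1.0]], b=[1.0, 1.0], F=[[-1.0]], g=[0.0]),   # -1 <= x <= 1
    Region(A=[[-1.0]], b=[-1.0], F=[[0.0]], g=[-1.0]),              # x >= 1
]

if __name__ == "__main__":
    for state in ([-3.0], [0.5], [2.0]):
        print(state, "->", pwa_control(REGIONS, state))
```

The sequential region search is the simplest possible lookup. Per the abstract, a YANN instead encodes the same function as a neural network, so that further layers can be added and trained online.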
Related papers
- YANNs: Y-wise Affine Neural Networks for Exact and Efficient Representations of Piecewise Linear Functions [0.0]
Y-wise Affine Neural Networks (YANNs) are a fully-explainable network architecture that represents piecewise affine functions with polytopic subdomains. YANNs maintain all mathematical properties of the original formulations. They theoretically compute optimal control laws as a piecewise affine function of states, outputs, setpoints, and disturbances.
arXiv Detail & Related papers (2025-05-11T16:55:38Z)
- Graph Neural Network-Based Distributed Optimal Control for Linear Networked Systems: An Online Distributed Training Approach [2.899475960472822]
We consider the distributed optimal control problem for discrete-time linear networked systems using graph recurrent neural networks (GRNNs). We first propose a GRNN-based distributed optimal control method and cast the problem as a self-supervised learning problem. A distributed online training scheme is then designed, with computation inspired by the (consensus-based) gradient. Furthermore, closed-loop stability of the linear networked system under the proposed GRNN-based controller is established by assuming that the nonlinear activation function of the controller is both locally sector-bounded and slope-restricted.
arXiv Detail & Related papers (2025-04-08T21:18:43Z)
- A Guaranteed-Stable Neural Network Approach for Optimal Control of Nonlinear Systems [3.5000297213981653]
A promising approach to optimal control of nonlinear systems involves iteratively linearizing the system and solving an optimization problem at each time instant to determine the optimal control input. Since this approach relies on online optimization, it can be computationally expensive, and thus unrealistic for systems with limited computing resources. One potential solution to this issue is to incorporate a Neural Network (NN) into the control loop.
arXiv Detail & Related papers (2025-01-28T22:55:47Z)
- Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling [5.254482407158516]
In some applications such as battery degradation neural network-based microgrid day-ahead energy scheduling, the input features of the trained learning model are variables to be solved in optimization models. The use of nonlinear activation functions in the neural network will make such problems extremely hard to solve if not unsolvable. Four linearization methods tailored for the ReLU activation function are developed, analyzed and compared in this paper.
arXiv Detail & Related papers (2023-10-03T02:47:38Z)
- Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning [53.97335841137496]
We propose an oracle-efficient algorithm, dubbed Pessimistic Nonlinear Least-Squares Value Iteration (PNLSVI), for offline RL with non-linear function approximation.
Our algorithm enjoys a regret bound that has a tight dependency on the function class complexity and achieves minimax optimal instance-dependent regret when specialized to linear function approximation.
arXiv Detail & Related papers (2023-10-02T17:42:01Z)
- Offline Policy Optimization in RL with Variance Regularizaton [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections.
We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer.
The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
- Policy Gradient for Reinforcement Learning with General Utilities [50.65940899590487]
In Reinforcement Learning (RL), the goal of agents is to discover an optimal policy that maximizes the expected cumulative rewards.
Many supervised and unsupervised RL problems are not covered in the Linear RL framework.
We derive the policy gradient theorem for RL with general utilities.
arXiv Detail & Related papers (2022-10-03T14:57:46Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting [60.98700344526674]
Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning.
In this paper, we investigate a new sampling protocol, which draws samples in an online/exploratory fashion but allows one to backtrack and revisit previous states in a controlled and infrequent manner.
We develop an algorithm tailored to this setting, achieving a sample complexity that scales polynomially with the feature dimension, the horizon, and the inverse sub-optimality gap, but not the size of the state/action space.
arXiv Detail & Related papers (2021-05-17T17:22:07Z)
- A Heuristically Assisted Deep Reinforcement Learning Approach for Network Slice Placement [0.7885276250519428]
We introduce a hybrid placement solution based on Deep Reinforcement Learning (DRL) and a dedicated optimization based on the Power of Two Choices principle.
The proposed Heuristically-Assisted DRL (HA-DRL) accelerates the learning process and improves resource usage compared with other state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-14T10:04:17Z)
- Resource Allocation via Graph Neural Networks in Free Space Optical Fronthaul Networks [119.81868223344173]
This paper investigates the optimal resource allocation in free space optical (FSO) fronthaul networks.
We consider the graph neural network (GNN) for the policy parameterization to exploit the FSO network structure.
The primal-dual learning algorithm is developed to train the GNN in a model-free manner, where the knowledge of system models is not required.
arXiv Detail & Related papers (2020-06-26T14:20:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.