SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
- URL: http://arxiv.org/abs/2403.09110v1
- Date: Thu, 14 Mar 2024 05:17:39 GMT
- Title: SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
- Authors: Nicholas Zolman, Urban Fasel, J. Nathan Kutz, Steven L. Brunton
- Abstract summary: We introduce SINDy-RL, a framework for combining SINDy and deep reinforcement learning.
SINDy-RL achieves comparable performance to state-of-the-art DRL algorithms.
We demonstrate the effectiveness of our approaches on benchmark control environments and challenging fluids problems.
- Score: 5.59265003686955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (DRL) has shown significant promise for uncovering sophisticated control policies that interact in environments with complicated dynamics, such as stabilizing the magnetohydrodynamics of a tokamak fusion reactor or minimizing the drag force exerted on an object in a fluid flow. However, these algorithms require an abundance of training examples and may become prohibitively expensive for many applications. In addition, the reliance on deep neural networks often results in an uninterpretable, black-box policy that may be too computationally expensive to use with certain embedded systems. Recent advances in sparse dictionary learning, such as the sparse identification of nonlinear dynamics (SINDy), have shown promise for creating efficient and interpretable data-driven models in the low-data regime. In this work we introduce SINDy-RL, a unifying framework for combining SINDy and DRL to create efficient, interpretable, and trustworthy representations of the dynamics model, reward function, and control policy. We demonstrate the effectiveness of our approaches on benchmark control environments and challenging fluids problems. SINDy-RL achieves comparable performance to state-of-the-art DRL algorithms using significantly fewer interactions in the environment and results in an interpretable control policy orders of magnitude smaller than a deep neural network policy.
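As a concrete illustration of the sparse dictionary learning the abstract refers to, the sketch below implements sequentially thresholded least squares (STLSQ), the regression at the heart of SINDy, on simulated Lorenz data. This is a minimal, self-contained example, not the authors' code: the Lorenz system, the degree-2 polynomial library, and the 0.1 threshold are illustrative assumptions chosen for demonstration.
```python
# Illustrative sketch of SINDy-style sparse dictionary regression (STLSQ).
# NOT the SINDy-RL implementation; system, library, and threshold are assumptions.
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

# Collect a short trajectory with forward-Euler integration (small step).
dt, n_steps = 1e-3, 20000
X, dX = np.empty((n_steps, 3)), np.empty((n_steps, 3))
state = np.array([-8.0, 8.0, 27.0])
for k in range(n_steps):
    X[k] = state
    dX[k] = lorenz(state)            # exact derivatives keep the demo clean
    state = state + dt * dX[k]

def poly_library(X):
    """Candidate dictionary: [1, x, y, z, x^2, xy, xz, y^2, yz, z^2]."""
    x, y, z = X.T
    ones = np.ones_like(x)
    return np.column_stack([ones, x, y, z, x*x, x*y, x*z, y*y, y*z, z*z])

def stlsq(Theta, dX, threshold=0.1, n_iters=10):
    """Sequentially thresholded least squares (Brunton et al., 2016)."""
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iters):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for j in range(dX.shape[1]):          # refit only the surviving terms
            big = ~small[:, j]
            if big.any():
                Xi[big, j] = np.linalg.lstsq(Theta[:, big], dX[:, j], rcond=None)[0]
    return Xi

Xi = stlsq(poly_library(X), dX)
print(np.round(Xi, 3))   # sparse coefficients approximately recover the Lorenz equations
```
Because each learned model is a short list of dictionary coefficients rather than a deep network, the fitted dynamics, reward, and policy representations stay human-readable and cheap to evaluate, which is the basis of the interpretability and efficiency claims above; during most of the DRL training, such a surrogate stands in for the expensive environment.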
Related papers
- Interpretable and Efficient Data-driven Discovery and Control of Distributed Systems [1.5195865840919498]
Reinforcement Learning (RL) has emerged as a promising control paradigm for systems with high-dimensional, nonlinear dynamics.
We propose a data-efficient, interpretable, and scalable framework for PDE control.
arXiv Detail & Related papers (2024-11-06T18:26:19Z) - Learning from Demonstration with Implicit Nonlinear Dynamics Models [16.26835655544884]
We develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics.
We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting dataset.
arXiv Detail & Related papers (2024-09-27T14:12:49Z) - Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression [53.33734159983431]
This paper introduces a novel approach to distill neural RL policies into more interpretable forms.
We train expert neural network policies using RL and distill them into (i) GBMs, (ii) EBMs, and (iii) symbolic policies.
arXiv Detail & Related papers (2024-03-21T11:54:45Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning [0.0]
"DManD-RL" (data-driven manifold dynamics-RL) generates a data-driven low-dimensional model of our system.
We train an RL control agent, yielding a 440-fold speedup over training on a numerical simulation.
The agent learns a policy that laminarizes 84% of unseen DNS test trajectories within 900 time units.
arXiv Detail & Related papers (2023-01-28T05:47:10Z) - Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning [9.936452412191326]
This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems.
Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and long training times.
We propose a novel model-based-DRL framework where a deep neural network (DNN)-based dynamic surrogate model is utilized with the policy learning framework.
arXiv Detail & Related papers (2022-12-06T02:50:53Z) - Data-driven control of spatiotemporal chaos with reduced-order neural ODE-based models and reinforcement learning [0.0]
Deep learning is capable of discovering complex control strategies for high-dimensional systems, making it promising for flow control applications.
A major challenge associated with RL is that substantial training data must be generated by repeatedly interacting with the target system.
We use a data-driven reduced-order model (ROM) in place of the true system during RL training to efficiently estimate the optimal policy.
We show that the ROM-based control strategy translates well to the true KSE and highlight that the RL agent discovers and stabilizes an underlying forced equilibrium solution of the KSE system.
arXiv Detail & Related papers (2022-05-01T23:25:44Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to a discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z) - Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)