Koopman-Assisted Reinforcement Learning
- URL: http://arxiv.org/abs/2403.02290v1
- Date: Mon, 4 Mar 2024 18:19:48 GMT
- Title: Koopman-Assisted Reinforcement Learning
- Authors: Preston Rozwood, Edward Mehrez, Ludger Paehler, Wen Sun, Steven L.
Brunton
- Abstract summary: The Bellman equation and its continuous form, the Hamilton-Jacobi-Bellman (HJB) equation, are ubiquitous in reinforcement learning (RL) and control theory.
This paper explores the connection between the data-driven Koopman operator and Markov Decision Processes (MDPs).
We develop two new RL algorithms to address these limitations.
- Score: 8.812992091278668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Bellman equation and its continuous form, the Hamilton-Jacobi-Bellman
(HJB) equation, are ubiquitous in reinforcement learning (RL) and control
theory. However, these equations quickly become intractable for systems with
high-dimensional states and nonlinearity. This paper explores the connection
between the data-driven Koopman operator and Markov Decision Processes (MDPs),
resulting in the development of two new RL algorithms to address these
limitations. We leverage Koopman operator techniques to lift a nonlinear system
into new coordinates where the dynamics become approximately linear, and where
HJB-based methods are more tractable. In particular, the Koopman operator is
able to capture the expectation of the time evolution of the value function of
a given system via linear dynamics in the lifted coordinates. By parameterizing
the Koopman operator with the control actions, we construct a "Koopman tensor" that facilitates the estimation of the optimal value function. Then, a
transformation of Bellman's framework in terms of the Koopman tensor enables us
to reformulate two max-entropy RL algorithms: soft value iteration and soft
actor-critic (SAC). This highly flexible framework can be used for
deterministic or stochastic systems as well as for discrete or continuous-time
dynamics. Finally, we show that these Koopman Assisted Reinforcement Learning
(KARL) algorithms attain state-of-the-art (SOTA) performance with respect to
traditional neural network-based SAC and linear quadratic regulator (LQR)
baselines on four controlled dynamical systems: a linear state-space system,
the Lorenz system, fluid flow past a cylinder, and a double-well potential with
non-isotropic stochastic forcing.
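The abstract's central construction, an action-parameterized Koopman operator whose contraction with an action dictionary yields approximately linear lifted dynamics, can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the toy system, the monomial dictionaries `phi` and `psi`, and all variable names are assumptions.

```python
import numpy as np

# Hedged sketch of the "Koopman tensor" idea: fix an action u, and the
# lifted state phi(x) evolves (in expectation) linearly under a matrix
# obtained by contracting a third-order tensor K with psi(u).

def phi(x):
    """State dictionary: monomials up to degree 2 (an assumed choice)."""
    return np.array([1.0, x[0], x[1], x[0]**2, x[0]*x[1], x[1]**2])

def psi(u):
    """Action dictionary: monomials in the scalar control (assumed)."""
    return np.array([1.0, u, u**2])

def step(x, u):
    """Toy controlled nonlinear system used only to generate data."""
    return np.array([0.9*x[0] + 0.1*x[1], -0.1*x[0]**2 + 0.8*x[1] + 0.2*u])

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
U = rng.uniform(-1, 1, size=2000)
Xn = np.array([step(x, u) for x, u in zip(X, U)])

# Least-squares fit: phi(x') ≈ (sum_j psi_j(u) K[:, :, j]) phi(x),
# which is linear in the Kronecker product kron(psi(u), phi(x)).
d_phi, d_psi = 6, 3
Z = np.array([np.kron(psi(u), phi(x)) for x, u in zip(X, U)])  # (N, d_psi*d_phi)
Y = np.array([phi(xn) for xn in Xn])                           # (N, d_phi)
M, *_ = np.linalg.lstsq(Z, Y, rcond=None)                      # (d_psi*d_phi, d_phi)
K = M.T.reshape(d_phi, d_psi, d_phi).transpose(0, 2, 1)        # (d_phi, d_phi, d_psi)

def K_u(u):
    """Action-conditioned Koopman matrix: contract the tensor with psi(u)."""
    return K @ psi(u)  # (d_phi, d_phi)

# One-step prediction of the lifted state under a chosen action.
x, u = np.array([0.3, -0.2]), 0.5
pred = K_u(u) @ phi(x)
true = phi(step(x, u))
```

Because `K_u(u)` acts linearly on the lifted state, a value function parameterized as `w @ phi(x)` propagates through the dynamics by a matrix-vector product, which is what makes the soft value iteration and soft actor-critic reformulations described in the abstract tractable.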
Related papers
- Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations [0.0]
Variational Autoencoders (VAEs) are a powerful framework for learning compact latent representations.
NeuralODEs excel in learning transient system dynamics.
This work combines the strengths of both to create fast surrogate models with adjustable complexity.
arXiv Detail & Related papers (2024-10-14T05:45:52Z)
- Deep Learning for Structure-Preserving Universal Stable Koopman-Inspired Embeddings for Nonlinear Canonical Hamiltonian Dynamics [9.599029891108229]
We focus on the identification of global linearized embeddings for canonical nonlinear Hamiltonian systems through a symplectic transformation.
To overcome the shortcomings of Koopman operators for systems with continuous spectra, we apply the lifting principle and learn global cubicized embeddings.
We demonstrate the capabilities of deep learning in acquiring compact symplectic coordinate transformations and the corresponding simple dynamical models.
arXiv Detail & Related papers (2023-08-26T09:58:09Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Towards Data-driven LQR with KoopmanizingFlows [8.133902705930327]
We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics.
We learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates.
arXiv Detail & Related papers (2022-01-27T17:02:03Z)
- Supervised DKRC with Images for Offline System Identification [77.34726150561087]
Modern dynamical systems are becoming increasingly non-linear and complex.
There is a need for a framework to model these systems in a compact and comprehensive representation for prediction and control.
Our approach learns these basis functions using a supervised learning approach.
arXiv Detail & Related papers (2021-09-06T04:39:06Z)
- DySMHO: Data-Driven Discovery of Governing Equations for Dynamical Systems via Moving Horizon Optimization [77.34726150561087]
We introduce Discovery of Dynamical Systems via Moving Horizon Optimization (DySMHO), a scalable machine learning framework.
DySMHO sequentially learns the underlying governing equations from a large dictionary of basis functions.
Canonical nonlinear dynamical system examples are used to demonstrate that DySMHO can accurately recover the governing laws.
arXiv Detail & Related papers (2021-07-30T20:35:03Z)
- Estimating Koopman operators for nonlinear dynamical systems: a nonparametric approach [77.77696851397539]
The Koopman operator is a mathematical tool that allows for a linear description of non-linear systems.
In this paper, we capture their core essence as a dual version of the same framework, incorporating them into the kernel framework.
We establish a strong link between kernel methods and Koopman operators, leading to the estimation of the latter through Kernel functions.
arXiv Detail & Related papers (2021-03-25T11:08:26Z)
- CKNet: A Convolutional Neural Network Based on Koopman Operator for Modeling Latent Dynamics from Pixels [5.286010070038216]
We present a convolutional neural network (CNN) based on the Koopman operator (CKNet) to identify the latent dynamics from raw pixels.
Experiments show that the identified 32-dimensional dynamics can produce valid predictions over 120 steps and generate clear images.
arXiv Detail & Related papers (2021-02-19T23:29:08Z)
- Learning the Linear Quadratic Regulator from Nonlinear Observations [135.66883119468707]
We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR.
In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs.
Our results constitute the first provable sample complexity guarantee for continuous control with an unknown nonlinearity in the system model and general function approximation.
arXiv Detail & Related papers (2020-10-08T07:02:47Z)
- Forecasting Sequential Data using Consistent Koopman Autoencoders [52.209416711500005]
A new class of physics-based methods related to Koopman theory has been introduced, offering an alternative for processing nonlinear dynamical systems.
We propose a novel Consistent Koopman Autoencoder model which, unlike the majority of existing work, leverages the forward and backward dynamics.
Key to our approach is a new analysis which explores the interplay between consistent dynamics and their associated Koopman operators.
arXiv Detail & Related papers (2020-03-04T18:24:30Z)
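The consistency idea in the last entry above, coupling forward and backward dynamics, can be illustrated in a stripped-down form: fit a forward operator and a backward operator from the same trajectory and measure how far their product is from the identity. A Consistent Koopman Autoencoder turns that mismatch into a training penalty; the linear toy system and all names below are assumptions for illustration only.

```python
import numpy as np

# Hedged sketch: forward operator C (x_{t+1} ≈ x_t C, row-vector convention,
# so C ≈ A^T) and backward operator D (x_t ≈ x_{t+1} D), fitted independently
# by least squares. Consistent dynamics should satisfy C D ≈ I.

rng = np.random.default_rng(1)
A = np.array([[0.9, -0.2], [0.1, 0.95]])  # ground-truth stable linear dynamics

X = [rng.standard_normal(2)]
for _ in range(500):
    X.append(A @ X[-1] + 1e-3 * rng.standard_normal(2))  # small process noise
X = np.array(X)

past, future = X[:-1], X[1:]
C, *_ = np.linalg.lstsq(past, future, rcond=None)   # forward:  future ≈ past @ C
D, *_ = np.linalg.lstsq(future, past, rcond=None)   # backward: past ≈ future @ D

# How inconsistent are the two fitted operators?
consistency_gap = np.linalg.norm(C @ D - np.eye(2))
```

In the autoencoder setting this gap is computed on lifted (latent) coordinates rather than raw states, and minimizing it alongside the prediction loss is what ties the forward and backward Koopman approximations together.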
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.