Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control
- URL: http://arxiv.org/abs/2408.10970v1
- Date: Tue, 20 Aug 2024 16:02:54 GMT
- Title: Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control
- Authors: Poppy Collis, Ryan Singh, Paul F Kinghorn, Christopher L Buckley,
- Abstract summary: A class of hybrid state-space model known as recurrent switching linear dynamical systems (rSLDS) discovers meaningful behavioural units.
We propose that the rich representations formed by an rSLDS can provide useful abstractions for planning and control.
We present a novel hierarchical model-based algorithm inspired by Active Inference in which a discrete MDP sits above a low-level linear-quadratic controller.
- Score: 0.8749675983608172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work has demonstrated that a class of hybrid state-space model known as recurrent switching linear dynamical systems (rSLDS) discover meaningful behavioural units via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). Furthermore, they model how the underlying continuous states drive these discrete mode switches. We propose that the rich representations formed by an rSLDS can provide useful abstractions for planning and control. We present a novel hierarchical model-based algorithm inspired by Active Inference in which a discrete MDP sits above a low-level linear-quadratic controller. The recurrent transition dynamics learned by the rSLDS allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space allowing us to exploit information-theoretic exploration bonuses and (3) `cache' the approximate solutions to low-level problems in the discrete planner. We successfully apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and non-trivial planning through the delineation of abstract sub-goals.
Related papers
- Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization [72.81319836138347]
Online continual learning (OCL) seeks to learn new tasks from data streams that appear only once, while retaining knowledge of previously learned tasks.
Most existing methods rely on replay, focusing on enhancing memory retention through regularization or distillation.
We introduce a plug-and-play module, S6MOD, which can be integrated into most existing methods and directly improve adaptability.
arXiv Detail & Related papers (2024-12-24T05:25:21Z) - Learning in Hybrid Active Inference Models [0.8749675983608172]
We present a novel hierarchical hybrid active inference agent in which a high-level discrete active inference planner sits above a low-level continuous active inference controller.
We make use of recent work in recurrent switching linear dynamical systems which implement end-to-end learning of meaningful discrete representations.
We apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and successful planning.
arXiv Detail & Related papers (2024-09-02T08:41:45Z) - Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL [57.202733701029594]
We propose Decision Mamba, a novel multi-grained state space model (SSM) with a self-evolving policy learning strategy.
To address these challenges, we propose Decision Mamba, a novel multi-grained state space model (SSM) with a self-evolving policy learning strategy.
To mitigate the overfitting issue on noisy trajectories, a self-evolving policy is proposed by using progressive regularization.
arXiv Detail & Related papers (2024-06-08T10:12:00Z) - Tensor Decompositions Meet Control Theory: Learning General Mixtures of
Linear Dynamical Systems [19.47235707806519]
We give a new approach to learning mixtures of linear dynamical systems based on tensor decompositions.
Our algorithm succeeds without strong separation conditions on the components, and can be used to compete with the Bayes optimal clustering of the trajectories.
arXiv Detail & Related papers (2023-07-13T03:00:01Z) - Self-Supervised Reinforcement Learning that Transfers using Random
Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
arXiv Detail & Related papers (2023-05-26T20:37:06Z) - Formal Controller Synthesis for Markov Jump Linear Systems with
Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z) - Linearization and Identification of Multiple-Attractors Dynamical System
through Laplacian Eigenmaps [8.161497377142584]
We propose a Graph-based spectral clustering method that takes advantage of a velocity-augmented kernel to connect data-points belonging to the same dynamics.
We prove that there always exist a set of 2-dimensional embedding spaces in which the sub-dynamics are linear, and n-dimensional embedding where they are quasi-linear.
We learn a diffeomorphism from the Laplacian embedding space to the original space and show that the Laplacian embedding leads to good reconstruction accuracy and a faster training time.
arXiv Detail & Related papers (2022-02-18T12:43:25Z) - Deep Learning Approximation of Diffeomorphisms via Linear-Control
Systems [91.3755431537592]
We consider a control system of the form $dot x = sum_i=1lF_i(x)u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
arXiv Detail & Related papers (2021-10-24T08:57:46Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with
Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP)
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z) - Learning the Linear Quadratic Regulator from Nonlinear Observations [135.66883119468707]
We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR.
In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs.
Our results constitute the first provable sample complexity guarantee for continuous control with an unknown nonlinearity in the system model and general function approximation.
arXiv Detail & Related papers (2020-10-08T07:02:47Z) - Hierarchical Decomposition of Nonlinear Dynamics and Control for System
Identification and Policy Distillation [39.83837705993256]
Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies.
We take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components.
arXiv Detail & Related papers (2020-05-04T12:40:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.