Coordinating Planning and Tracking in Layered Control Policies via Actor-Critic Learning
- URL: http://arxiv.org/abs/2408.01639v1
- Date: Sat, 3 Aug 2024 02:53:24 GMT
- Title: Coordinating Planning and Tracking in Layered Control Policies via Actor-Critic Learning
- Authors: Fengjun Yang, Nikolai Matni,
- Abstract summary: We propose a reinforcement learning (RL)-based algorithm to jointly train (1) a trajectory planner and (2) a tracking controller in a layered control architecture.
Our algorithm arises naturally from a rewrite of the underlying optimal control problem that lends itself to an actor-critic learning approach.
- Score: 5.229380825007323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a reinforcement learning (RL)-based algorithm to jointly train (1) a trajectory planner and (2) a tracking controller in a layered control architecture. Our algorithm arises naturally from a rewrite of the underlying optimal control problem that lends itself to an actor-critic learning approach. By explicitly learning a \textit{dual} network to coordinate the interaction between the planning and tracking layers, we demonstrate the ability to achieve an effective consensus between the two components, leading to an interpretable policy. We theoretically prove that our algorithm converges to the optimal dual network in the Linear Quadratic Regulator (LQR) setting and empirically validate its applicability to nonlinear systems through simulation experiments on a unicycle model.
Related papers
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Towards a Theoretical Foundation of Policy Optimization for Learning
Control Policies [26.04704565406123]
Gradient-based methods have been widely used for system design and optimization in diverse application domains.
Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning.
This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis.
arXiv Detail & Related papers (2022-10-10T16:13:34Z) - Deep Learning for Wireless Networked Systems: a joint
Estimation-Control-Scheduling Approach [47.29474858956844]
Wireless networked control system (WNCS) connecting sensors, controllers, and actuators via wireless communications is a key enabling technology for highly scalable and low-cost deployment of control systems in the Industry 4.0 era.
Despite the tight interaction of control and communications in WNCSs, most existing works adopt separative design approaches.
We propose a novel deep reinforcement learning (DRL)-based algorithm for controller and optimization utilizing both model-free and model-based data.
arXiv Detail & Related papers (2022-10-03T01:29:40Z) - On the Convergence of Distributed Stochastic Bilevel Optimization
Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms restrict their single-machine setting so that they are incapable of handling distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradients.
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and
Heuristic Rule-based Methods for Object Manipulation [118.27432851053335]
This paper presents an overview and comparative analysis of our systems designed for the following two tracks in SAPIEN ManiSkill Challenge 2021: No Interaction Track.
The No Interaction track targets for learning policies from pre-collected demonstration trajectories.
In this track, we design a Heuristic Rule-based Method (HRM) to trigger high-quality object manipulation by decomposing the task into a series of sub-tasks.
For each sub-task, the simple rule-based controlling strategies are adopted to predict actions that can be applied to robotic arms.
arXiv Detail & Related papers (2022-06-13T16:20:42Z) - Learning Optimal Antenna Tilt Control Policies: A Contextual Linear
Bandit Approach [65.27783264330711]
Controlling antenna tilts in cellular networks is imperative to reach an efficient trade-off between network coverage and capacity.
We devise algorithms learning optimal tilt control policies from existing data.
We show that they can produce optimal tilt update policy using much fewer data samples than naive or existing rule-based learning algorithms.
arXiv Detail & Related papers (2022-01-06T18:24:30Z) - Compositional Reinforcement Learning from Logical Specifications [21.193231846438895]
Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy.
We develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning.
Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph.
arXiv Detail & Related papers (2021-06-25T22:54:28Z) - Integrated Decision and Control: Towards Interpretable and Efficient
Driving Intelligence [13.589285628074542]
We present an interpretable and efficient decision and control framework for automated vehicles.
It decomposes the driving task into multi-path planning and optimal tracking that are structured hierarchically.
Results show that our method has better online computing efficiency and driving performance including traffic efficiency and safety.
arXiv Detail & Related papers (2021-03-18T14:43:31Z) - Learning Event-triggered Control from Data through Joint Optimization [7.391641422048646]
We present a framework for model-free learning of event-triggered control strategies.
We propose a novel algorithm based on hierarchical reinforcement learning.
The resulting algorithm is shown to accomplish high-performance control in line with resource savings and scales seamlessly to nonlinear and high-dimensional systems.
arXiv Detail & Related papers (2020-08-11T14:15:38Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.