Meta-Learning Online Control for Linear Dynamical Systems
- URL: http://arxiv.org/abs/2208.10259v1
- Date: Thu, 18 Aug 2022 20:44:07 GMT
- Title: Meta-Learning Online Control for Linear Dynamical Systems
- Authors: Deepan Muthirayan, Dileep Kalathil, and Pramod P. Khargonekar
- Abstract summary: We propose a meta-learning online control algorithm for the control setting.
We characterize its performance by \textit{meta-regret}, the average cumulative regret across the tasks.
We show that when the number of tasks is sufficiently large, our proposed approach achieves a meta-regret that is smaller by a factor $D/D^{*}$ compared to an independent-learning online control algorithm.
- Score: 2.867517731896504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the problem of finding a meta-learning online
control algorithm that can learn across the tasks when faced with a sequence of
$N$ (similar) control tasks. Each task involves controlling a linear dynamical
system for a finite horizon of $T$ time steps. The cost function and system
noise at each time step are adversarial and unknown to the controller before
taking the control action. Meta-learning is a broad approach where the goal is
to prescribe an online policy for any new unseen task exploiting the
information from other tasks and the similarity between the tasks. We propose a
meta-learning online control algorithm for the control setting and characterize
its performance by \textit{meta-regret}, the average cumulative regret across
the tasks. We show that when the number of tasks is sufficiently large, our
proposed approach achieves a meta-regret that is smaller by a factor $D/D^{*}$
compared to an independent-learning online control algorithm which does not
perform learning across the tasks, where $D$ is a problem constant and $D^{*}$
is a scalar that decreases with increase in the similarity between tasks. Thus,
when the tasks in the sequence are similar, the regret of the proposed meta-learning
online control algorithm is significantly lower than that of naive approaches without
meta-learning. We also present experimental results to demonstrate the superior
performance achieved by our meta-learning algorithm.
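To make the metric concrete before the related work below, here is one natural formalization of meta-regret consistent with the abstract; the notation ($c_t^{(n)}$ for the per-step cost in task $n$, $\Pi$ for the comparator policy class, $(x_t^{\pi}, u_t^{\pi})$ for the trajectory under a comparator policy $\pi$) is ours and may differ from the paper's:

    \[
    \mathrm{MR}(N, T) \;=\; \frac{1}{N} \sum_{n=1}^{N} \Bigg[ \sum_{t=1}^{T} c_t^{(n)}\big(x_t^{(n)}, u_t^{(n)}\big) \;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t^{(n)}\big(x_t^{\pi}, u_t^{\pi}\big) \Bigg]
    \]

The bracketed term is the standard cumulative regret of task $n$ against the best comparator policy in hindsight; meta-regret averages it over the $N$ tasks, and the claim above is that learning across tasks shrinks this average by the factor $D/D^{*}$.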
Related papers
- ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning [49.447777286862994]
ConML is a universal meta-learning framework that can be applied to various meta-learning algorithms.
We demonstrate that ConML integrates seamlessly with optimization-based, metric-based, and amortization-based meta-learning algorithms.
arXiv Detail & Related papers (2024-10-08T12:22:10Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs equally as well, or better, than meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
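To make the comparison concrete, below is a toy sketch of the two pipelines on 1-D linear regression tasks; the task distribution, step sizes, and the Reptile-style meta-update are illustrative assumptions of ours, not the paper's method:

    import numpy as np

    # Toy contrast between the two pipelines: (a) multi-task pretraining followed
    # by fine-tuning, and (b) meta-pretraining followed by test-time adaptation.
    rng = np.random.default_rng(0)
    tasks = [rng.normal(loc=2.0, scale=0.3) for _ in range(20)]  # similar task weights

    def sgd_steps(w, w_task, steps=10, lr=0.1):
        """Adapt weight w to a task y = w_task * x with a few SGD steps."""
        for _ in range(steps):
            x = rng.normal()
            w -= lr * 2 * (w * x - w_task * x) * x  # gradient of (w*x - y)^2
        return w

    # (a) Multi-task pretraining: one weight trained on data from all tasks.
    w_pre = 0.0
    for w_task in tasks:
        w_pre = sgd_steps(w_pre, w_task, steps=5)

    # (b) Meta-pretraining (Reptile-style): move the init toward each adapted weight.
    w_meta = 0.0
    for w_task in tasks:
        w_adapted = sgd_steps(w_meta, w_task, steps=5)
        w_meta += 0.5 * (w_adapted - w_meta)  # outer meta-update

    # Both are then adapted on a new, unseen task and compared.
    new_task = rng.normal(loc=2.0, scale=0.3)
    print(sgd_steps(w_pre, new_task), sgd_steps(w_meta, new_task))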
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Skill-based Meta-Reinforcement Learning [65.31995608339962]
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
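A minimal runnable sketch of the E-step idea on a toy grid: breadth-first graph search proposes a sequence of waypoints toward a distant goal, which the M-step's goal-conditioned policy would then be trained to reach one by one. The grid world and BFS are illustrative assumptions, not the paper's environments:

    from collections import deque

    def bfs_waypoints(start, goal, neighbors):
        """Shortest sequence of states from start to goal (E-step: waypoint planning)."""
        frontier, parent = deque([start]), {start: None}
        while frontier:
            s = frontier.popleft()
            if s == goal:  # reconstruct the path back to the start
                path = []
                while s is not None:
                    path.append(s)
                    s = parent[s]
                return path[::-1]
            for n in neighbors(s):
                if n not in parent:  # parent doubles as the visited set
                    parent[n] = s
                    frontier.append(n)
        return None

    def neighbors(s):
        """4-connected moves on a 5x5 grid of states (row, col)."""
        r, c = s
        return [(r + dr, c + dc) for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0))
                if 0 <= r + dr < 5 and 0 <= c + dc < 5]

    waypoints = bfs_waypoints((0, 0), (4, 4), neighbors)
    print(waypoints)  # interior states are the intermediate goals for the M-step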
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Dynamic Regret Analysis for Online Meta-Learning [0.0]
The online meta-learning framework has arisen as a powerful tool for the continual lifelong learning setting.
This formulation involves two levels: an outer level, which learns meta-learners, and an inner level, which learns task-specific models.
We establish performance guarantees in terms of dynamic regret, which handles changing environments from a global perspective.
We carry out our analyses in a stochastic setting and prove, in expectation, a logarithmic local dynamic regret that depends explicitly on the total number of iterations.
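For reference, a standard definition of dynamic regret in our notation (the summary does not state the paper's exact comparator sequence, so this is the common textbook form): with loss function $f_t$ and learner iterate $x_t$ at round $t$,

    \[
    \mathrm{Regret}_{d}(T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(x_t^{*}),
    \qquad x_t^{*} \in \operatorname*{arg\,min}_{x} f_t(x),
    \]

so the comparator is allowed to change at every round, unlike the single fixed comparator in static regret.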
arXiv Detail & Related papers (2021-09-29T12:12:59Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- A Meta-Reinforcement Learning Approach to Process Control [3.9146761527401424]
Meta-learning aims to quickly adapt models, such as neural networks, to perform new tasks.
We construct a controller and meta-train it using a latent context variable produced by a separate embedding neural network.
In both cases, our meta-learning algorithm adapts very quickly to new tasks, outperforming a regular DRL controller trained from scratch.
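A minimal numpy sketch of the architecture described above: a separate embedding network compresses a window of recent (state, action, cost) triples into a latent context variable z, and the controller acts on the current state together with z. All dimensions and the random weights are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    W_embed = rng.normal(size=(8, 15))  # 5 triples of (state, action, cost), each scalar
    W_ctrl = rng.normal(size=(1, 9))    # (state dim 1) + (context dim 8) -> 1 action

    def embed_context(history):
        """Embedding network: 5 recent (state, action, cost) triples -> latent z."""
        flat = np.concatenate([np.asarray(h, dtype=float) for h in history])  # shape (15,)
        return np.tanh(W_embed @ flat)

    def controller(state, z):
        """Context-conditioned controller: action from current state and latent z."""
        return (W_ctrl @ np.concatenate([state, z])).item()

    history = [(0.1 * t, 0.0, 1.0) for t in range(5)]  # dummy interaction window
    z = embed_context(history)
    print(controller(np.array([0.5]), z))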
arXiv Detail & Related papers (2021-03-25T18:20:56Z)
- Variable-Shot Adaptation for Online Meta-Learning [123.47725004094472]
We study the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
We find that meta-learning solves the full task set with fewer overall labels and greater cumulative performance, compared to standard supervised methods.
These results suggest that meta-learning is an important ingredient for building learning systems that continuously learn and improve over a sequence of problems.
arXiv Detail & Related papers (2020-12-14T18:05:24Z)