Online Control-Informed Learning
- URL: http://arxiv.org/abs/2410.03924v1
- Date: Fri, 4 Oct 2024 21:03:16 GMT
- Title: Online Control-Informed Learning
- Authors: Zihao Liang, Tianyu Zhou, Zehui Lu, Shaoshuai Mou,
- Abstract summary: This paper proposes an Online Control-Informed Learning framework to solve a broad class of learning and control tasks in real time.
By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on extended Kalman filter (EKF)
The proposed method also improves robustness in learning by effectively managing noise in the data.
- Score: 4.907545537403502
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes an Online Control-Informed Learning (OCIL) framework, which synthesizes the well-established control theories to solve a broad class of learning and control tasks in real time. This novel integration effectively handles practical issues in machine learning such as noisy measurement data, online learning, and data efficiency. By considering any robot as a tunable optimal control system, we propose an online parameter estimator based on extended Kalman filter (EKF) to incrementally tune the system in real time, enabling it to complete designated learning or control tasks. The proposed method also improves robustness in learning by effectively managing noise in the data. Theoretical analysis is provided to demonstrate the convergence and regret of OCIL. Three learning modes of OCIL, i.e. Online Imitation Learning, Online System Identification, and Policy Tuning On-the-fly, are investigated via experiments, which validate their effectiveness.
Related papers
- Tracking Control for a Spherical Pendulum via Curriculum Reinforcement
Learning [27.73555826776087]
Reinforcement Learning (RL) allows learning non-trivial robot control laws purely from data.
In this paper, we pair a recent algorithm for automatically building curricula with RL on massively parallelized simulations.
We demonstrate the potential of curriculum RL to jointly learn state estimation and control for non-linear tracking tasks.
arXiv Detail & Related papers (2023-09-25T12:48:47Z) - Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph- kernel multi-arms bandit algorithm to learn online the optimal source placement in large scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Efficient Online Learning with Memory via Frank-Wolfe Optimization:
Algorithms with Bounded Dynamic Regret and Applications to Control [15.588080817106563]
We introduce the first projection-free meta-base learning algorithm with memory that minimizes dynamic regret.
We are motivated by artificial intelligence applications where autonomous agents need to adapt to time-varying environments.
arXiv Detail & Related papers (2023-01-02T01:12:29Z) - A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL analogous to the relatively well-understood for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z) - Fault-Tolerant Control of Degrading Systems with On-Policy Reinforcement
Learning [1.8799681615947088]
We propose a novel adaptive reinforcement learning control approach for fault tolerant systems.
Online and offline learning are combined to improve exploration and sample efficiency.
We conduct experiments on an aircraft fuel transfer system to demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-08-10T20:42:59Z) - Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z) - AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z) - Reinforcement Learning Control of Robotic Knee with Human in the Loop by
Flexible Policy Iteration [17.365135977882215]
This study fills important voids by introducing innovative features to the policy algorithm.
We show system level performances including convergence of the approximate value function, (sub)optimality of the solution, and stability of the system.
arXiv Detail & Related papers (2020-06-16T09:09:48Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.