Learning to Reach, Swim, Walk and Fly in One Trial: Data-Driven Control
with Scarce Data and Side Information
- URL: http://arxiv.org/abs/2106.10533v1
- Date: Sat, 19 Jun 2021 17:10:27 GMT
- Title: Learning to Reach, Swim, Walk and Fly in One Trial: Data-Driven Control
with Scarce Data and Side Information
- Authors: Franck Djeumou and Ufuk Topcu
- Abstract summary: We develop a learning-based control algorithm for unknown dynamical systems under very severe data limitations.
Despite the scarcity of data, we show that the algorithm can provide performance comparable to reinforcement learning algorithms trained over millions of environment interactions.
Experiments in a high-fidelity F-16 aircraft simulator and MuJoCo's environments such as the Reacher, Swimmer, and Cheetah illustrate the algorithm's effectiveness.
- Score: 24.330188770135273
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We develop a learning-based control algorithm for unknown dynamical systems
under very severe data limitations. Specifically, the algorithm has access to
streaming data only from a single and ongoing trial. Despite the scarcity of
data, we show -- through a series of examples -- that the algorithm can provide
performance comparable to reinforcement learning algorithms trained over
millions of environment interactions. It accomplishes such performance by
effectively leveraging various forms of side information on the dynamics to
reduce the sample complexity. Such side information typically comes from
elementary laws of physics and qualitative properties of the system. More
precisely, the algorithm approximately solves an optimal control problem
encoding the system's desired behavior. To this end, it constructs and refines
a differential inclusion that contains the unknown vector field of the
dynamics. The differential inclusion, used in an interval Taylor-based method,
makes it possible to over-approximate the set of states the system may reach.
Theoretically, we establish a bound on the suboptimality of the approximate
solution with respect to the case of known dynamics. We show that the longer
the trial or the more side information is available, the tighter the bound.
Empirically, experiments in a high-fidelity F-16 aircraft simulator and
MuJoCo's environments such as the Reacher, Swimmer, and Cheetah illustrate the
algorithm's effectiveness.
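The reachability idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar system, exact derivative samples, and a known Lipschitz bound on the vector field as the only side information. The names `Interval`, `vector_field_enclosure`, and `reach_step` are hypothetical, and floating-point outward rounding is ignored. The sketch builds a differential inclusion from the samples, then takes one first-order interval Taylor step.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; outward rounding is ignored in this sketch."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def hull(self, other: "Interval") -> "Interval":
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def contains(self, other: "Interval") -> bool:
        return self.lo <= other.lo and other.hi <= self.hi

    def width(self) -> float:
        return self.hi - self.lo


def vector_field_enclosure(x, samples, lipschitz):
    """Enclose the unknown f over the interval x.

    Assumed side information: f is Lipschitz with constant `lipschitz`, so
    each sample (x_i, f_i) gives f(x) in f_i + lipschitz * |x - x_i| * [-1, 1].
    Intersecting over all samples yields the differential inclusion.
    """
    lo, hi = -math.inf, math.inf
    for x_i, f_i in samples:
        dist = max(abs(x.lo - x_i), abs(x.hi - x_i))
        lo = max(lo, f_i - lipschitz * dist)
        hi = min(hi, f_i + lipschitz * dist)
    return Interval(lo, hi)


def reach_step(x, dt, samples, lipschitz, max_iter=50, eps=1e-6):
    """Over-approximate the states reachable from x after time dt."""
    # Find an a priori enclosure a with x + [0, dt] * F(a) inside a.
    a = x
    for _ in range(max_iter):
        f_a = vector_field_enclosure(a, samples, lipschitz)
        drift = Interval(min(0.0, dt * f_a.lo), max(0.0, dt * f_a.hi))
        candidate = x + drift
        if a.contains(candidate):
            break
        grown = a.hull(candidate)
        a = Interval(grown.lo - eps, grown.hi + eps)  # inflate slightly
    else:
        raise RuntimeError("no a priori enclosure found; try a smaller dt")
    # First-order interval Taylor step: x(t + dt) lies in x + dt * F(a).
    f_a = vector_field_enclosure(a, samples, lipschitz)
    return x + Interval(dt * f_a.lo, dt * f_a.hi)


if __name__ == "__main__":
    # Hypothetical data: samples of dx/dt = -x, which is 1-Lipschitz.
    x0 = Interval(1.0, 1.0)
    coarse = reach_step(x0, 0.05, [(1.0, -1.0)], lipschitz=1.0)
    fine = reach_step(x0, 0.05, [(1.0, -1.0), (0.9, -0.9)], lipschitz=1.0)
    print("one sample: ", coarse)
    print("two samples:", fine)
```

In this toy setting, adding a second sample shrinks the enclosure of the vector field and therefore the reachable-set over-approximation, which mirrors the abstract's claim that more data or more side information tightens the result.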
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Continual Learning for Multimodal Data Fusion of a Soft Gripper [1.0589208420411014]
A model trained on one data modality often fails when tested with a different modality.
We introduce a continual learning algorithm capable of incrementally learning different data modalities.
We evaluate the algorithm's effectiveness on a challenging custom multimodal dataset.
arXiv Detail & Related papers (2024-09-20T09:53:27Z) - Limits and Powers of Koopman Learning [0.0]
Dynamical systems provide a comprehensive way to study complex and changing behaviors across various sciences.
Koopman operators have emerged as a dominant approach because they allow the study of nonlinear dynamics using linear techniques.
This paper addresses a fundamental open question: When can we robustly learn the spectral properties of Koopman operators from trajectory data of dynamical systems, and when can we not?
arXiv Detail & Related papers (2024-07-08T18:24:48Z) - Optimistic Active Exploration of Dynamical Systems [52.91573056896633]
We develop an algorithm for active exploration called OPAX.
We show how OPAX can be reduced to an optimal control problem that can be solved at each episode.
Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks.
arXiv Detail & Related papers (2023-06-21T16:26:59Z) - Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z) - FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems [6.612035830987298]
We introduce FLEX, an exploration algorithm for nonlinear dynamics based on optimal experimental design.
Our policy maximizes the information of the next step and results in an adaptive exploration algorithm.
The performance achieved by FLEX is competitive and its computational cost is low.
arXiv Detail & Related papers (2023-04-26T10:20:55Z) - Physics-Informed Kernel Embeddings: Integrating Prior System Knowledge
with Data-Driven Control [22.549914935697366]
We present a method to incorporate a priori knowledge into data-driven control algorithms using kernel embeddings.
Our proposed approach incorporates prior knowledge of the system dynamics as a bias term in the kernel learning problem.
We demonstrate the improved sample efficiency and out-of-sample generalization of our approach over a purely data-driven baseline.
arXiv Detail & Related papers (2023-01-09T18:35:32Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - A Bayesian Detect to Track System for Robust Visual Object Tracking and
Semi-Supervised Model Learning [1.7268829007643391]
We address problems in a Bayesian tracking and detection framework parameterized by neural network outputs.
We propose a particle filter-based approximate sampling algorithm for tracking object state estimation.
Based on our particle filter inference algorithm, a semi-supervised learning algorithm is utilized for learning the tracking network on intermittently labeled frames.
arXiv Detail & Related papers (2022-05-05T00:18:57Z) - Feeling of Presence Maximization: mmWave-Enabled Virtual Reality Meets
Deep Reinforcement Learning [76.46530937296066]
This paper investigates the problem of providing ultra-reliable and energy-efficient virtual reality (VR) experiences for wireless mobile users.
To ensure reliable ultra-high-definition (UHD) video frame delivery to mobile users, a coordinated multipoint (CoMP) transmission technique and millimeter wave (mmWave) communications are exploited.
arXiv Detail & Related papers (2021-06-03T08:35:10Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.