Sample-efficient Model-based Reinforcement Learning for Quantum Control
- URL: http://arxiv.org/abs/2304.09718v2
- Date: Mon, 2 Oct 2023 16:50:53 GMT
- Title: Sample-efficient Model-based Reinforcement Learning for Quantum Control
- Authors: Irtaza Khalid, Carrie A. Weidner, Edmond A. Jonckheere, Sophie G. Shermer, Frank C. Langbein
- Abstract summary: We propose a model-based reinforcement learning (RL) approach for noisy time-dependent gate optimization.
We show an order of magnitude advantage in the sample complexity of our method over standard model-free RL.
Our algorithm is well suited for controlling partially characterised one- and two-qubit systems.
- Score: 0.2999888908665658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a model-based reinforcement learning (RL) approach for noisy
time-dependent gate optimization with improved sample complexity over
model-free RL. Sample complexity is the number of controller interactions with
the physical system. Leveraging an inductive bias, inspired by recent advances
in neural ordinary differential equations (ODEs), we use an auto-differentiable
ODE parametrised by a learnable Hamiltonian ansatz to represent the model
approximating the environment whose time-dependent part, including the control,
is fully known. Control alongside Hamiltonian learning of continuous
time-independent parameters is addressed through interactions with the system.
We demonstrate an order of magnitude advantage in the sample complexity of our
method over standard model-free RL in preparing some standard unitary gates
with closed and open system dynamics, in realistic numerical experiments
incorporating single shot measurements, arbitrary Hilbert space truncations and
uncertainty in Hamiltonian parameters. Also, the learned Hamiltonian can be
leveraged by existing control methods like GRAPE for further gradient-based
optimization, with the controllers found by RL as initializations. Our
algorithm, which we apply to nitrogen-vacancy (NV) centers and transmons in
this paper, is well suited for controlling partially characterised one- and
two-qubit systems.
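To make the model-learning idea concrete, here is a minimal, hedged Python sketch: a single qubit with Hamiltonian H = theta*sigma_z + u(t)*sigma_x, where the time-dependent control term is fully known and only the time-independent parameter theta is learned from measured populations. The toy system, pulse shapes, and finite-difference optimizer are illustrative assumptions; the paper instead auto-differentiates through a neural-ODE solver and additionally handles single-shot noise, Hilbert space truncation, and open-system dynamics.

```python
# Minimal sketch, not the authors' code: learn a time-independent
# Hamiltonian parameter theta for a qubit whose control term u(t)*sx is
# fully known. The toy system H = theta*sz + u(t)*sx is an assumption.
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def evolve(theta, controls, dt=0.1):
    """Piecewise-constant propagator for H(t) = theta*sz + u_k*sx."""
    U = np.eye(2, dtype=complex)
    for u in controls:
        U = expm(-1j * dt * (theta * sz + u * sx)) @ U
    return U

# Surrogate "experimental" data: excited-state populations of the true system.
rng = np.random.default_rng(0)
theta_true = 0.7
pulses = [rng.uniform(-1, 1, size=8) for _ in range(20)]
psi0 = np.array([1, 0], dtype=complex)
data = [abs((evolve(theta_true, u) @ psi0)[1]) ** 2 for u in pulses]

def loss(theta):
    preds = [abs((evolve(theta, u) @ psi0)[1]) ** 2 for u in pulses]
    return float(np.mean([(p - d) ** 2 for p, d in zip(preds, data)]))

# Finite-difference gradient descent stands in for the auto-differentiation
# through the ODE solver used in the paper.
theta, lr, eps = 0.2, 0.5, 1e-5
for _ in range(500):
    g = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)
    theta -= lr * g
print(f"learned theta = {theta:.3f} (true value {theta_true})")
```

Once theta is fitted, the same differentiable propagator can be handed to a gradient-based pulse optimizer such as GRAPE, with RL-found controllers as initializations, mirroring the workflow described in the abstract.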
Related papers
- Path-minimizing Latent ODEs for improved extrapolation and inference [0.0]
Latent ODE models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics.
In this paper we exploit this dichotomy by encouraging time-independent latent representations.
By replacing the common variational penalty in latent space with an $\ell$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations.
This results in faster training, smaller models, and more accurate long-time extrapolation compared to baseline latent ODE models with GRU, RNN, and LSTM encoder/decoders.
arXiv Detail & Related papers (2024-10-11T15:50:01Z)
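As a rough illustration of the path-length penalty mentioned above, the sketch below computes the arc length of a latent trajectory with a finite-difference approximation; the toy trajectories are illustrative stand-ins, not the paper's model.

```python
# Hedged sketch of the path-length idea: penalize the arc length of each
# latent trajectory instead of a variational KL term, encouraging
# time-independent latent representations.
import numpy as np

def path_length_penalty(z_traj):
    """Arc length of a latent trajectory z_traj of shape (T, d)."""
    return np.linalg.norm(np.diff(z_traj, axis=0), axis=1).sum()

# Toy usage: a wandering latent path is penalized, a near-constant one is not.
rng = np.random.default_rng(1)
z_moving = np.cumsum(rng.normal(0, 0.1, (50, 3)), axis=0)
z_static = np.zeros((50, 3))
print(path_length_penalty(z_moving), path_length_penalty(z_static))
```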
- Random Features Approximation for Control-Affine Systems [6.067043299145924]
We propose two novel classes of nonlinear feature representations which capture control-affine structure.
Our methods make use of random features (RF) approximations, inheriting the expressiveness of kernel methods at a lower computational cost.
arXiv Detail & Related papers (2024-06-10T17:54:57Z)
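A hedged sketch of how random features can preserve control-affine structure: the model below fits xdot from stacked features so the learned dynamics stay affine in the control u. The scalar system, feature count, and kernel choice are assumptions for illustration.

```python
# Illustrative sketch, not the paper's method: random Fourier features
# phi(x) approximate a kernel, and fitting xdot ~ A.phi(x) + (B.phi(x))*u
# keeps the learned dynamics affine in u.
import numpy as np

rng = np.random.default_rng(2)
D, d = 100, 1                                   # feature count, state dim
W, b = rng.normal(0, 1, (D, d)), rng.uniform(0, 2 * np.pi, D)
phi = lambda x: np.sqrt(2.0 / D) * np.cos(x @ W.T + b)

# Data from a scalar control-affine system: xdot = -x^3 + (1 + x^2) * u
x = rng.uniform(-2, 2, (500, 1))
u = rng.uniform(-1, 1, (500, 1))
xdot = -x ** 3 + (1 + x ** 2) * u

# Stacked features [phi(x), phi(x)*u] preserve the affine structure.
Phi = np.hstack([phi(x), phi(x) * u])
theta, *_ = np.linalg.lstsq(Phi, xdot, rcond=None)
print("train RMSE:", float(np.sqrt(np.mean((Phi @ theta - xdot) ** 2))))
```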
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
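The following is only a loose, hedged sketch of the on-the-fly flavor of the method: a bootstrap particle filter whose proposal noise scale is nudged online using the effective sample size. The paper's actual algorithm adapts a variational proposal by stochastic gradients; the heuristic here is a crude stand-in.

```python
# Hedged sketch: bootstrap particle filter on a linear-Gaussian state-space
# model, with the proposal noise scale adapted on the fly via the effective
# sample size (ESS). All model details are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
T, N = 200, 256
xs_true, ys = [0.0], []
for t in range(T):                        # simulate x_t = 0.9 x_{t-1} + noise
    xs_true.append(0.9 * xs_true[-1] + rng.normal(0, 0.5))
    ys.append(xs_true[-1] + rng.normal(0, 0.3))

parts = rng.normal(0, 1, N)
scale = 1.0                               # adaptive proposal noise scale
for y in ys:
    parts = 0.9 * parts + rng.normal(0, scale, N)         # propose
    w = np.exp(-0.5 * ((y - parts) / 0.3) ** 2)
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    scale *= 0.99 if ess > N / 2 else 1.01                # on-the-fly tweak
    parts = parts[rng.choice(N, N, p=w)]                  # resample
print("final state estimate:", parts.mean(), " truth:", xs_true[-1])
```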
- Sample Complexity of Kernel-Based Q-Learning [11.32718794195643]
We propose a non-parametric Q-learning algorithm which finds an $\epsilon$-optimal policy in an arbitrarily large-scale discounted MDP.
To the best of our knowledge, this is the first result showing a finite sample complexity under such a general model.
arXiv Detail & Related papers (2023-02-01T19:46:25Z)
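A hedged toy rendering of non-parametric Q-learning: Q-values are kernel-smoothed averages of bootstrapped targets over visited state-action pairs. The 1-D chain environment, Gaussian kernel, bandwidth, and exploration schedule are illustrative assumptions, not the paper's construction.

```python
# Hedged toy sketch: kernel-smoothed Q-learning on a 1-D chain.
import numpy as np

rng = np.random.default_rng(3)
gamma, h = 0.9, 0.3                         # discount factor, bandwidth
K = lambda a, b: np.exp(-((a - b) ** 2) / (2 * h * h))

S, A, Y = [], [], []                        # visited states, actions, targets

def Q(s, a):
    """Nadaraya-Watson estimate of Q(s, a) from stored samples."""
    if not S:
        return 0.0
    w = K(np.array(S), s) * (np.array(A) == a)
    return float(w @ np.array(Y) / (w.sum() + 1e-8))

s = 0.0
for t in range(2000):
    a = int(rng.integers(2)) if rng.random() < 0.2 else int(Q(s, 1) > Q(s, 0))
    s2 = float(np.clip(s + (0.1 if a else -0.1) + rng.normal(0, 0.02), -1, 1))
    r = 1.0 if s2 > 0.9 else 0.0
    S.append(s); A.append(a); Y.append(r + gamma * max(Q(s2, 0), Q(s2, 1)))
    s = s2
print("Q(0.8, right) =", Q(0.8, 1), " Q(0.8, left) =", Q(0.8, 0))
```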
- Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x)\,u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
arXiv Detail & Related papers (2021-10-24T08:57:46Z)
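A small sketch of the flow idea, under assumed vector fields: integrating the linear-control system with piecewise-constant controls transports an entire ensemble of points, and shaping the controls shapes the induced diffeomorphism.

```python
# Illustrative sketch: Euler-integrate xdot = sum_i u_i(t) F_i(x) for a whole
# ensemble. The two planar vector fields below are arbitrary choices.
import numpy as np

F = [lambda x: np.stack([-x[:, 1], x[:, 0]], axis=1),   # rotation field
     lambda x: x]                                        # dilation field

def flow(X, controls, dt=0.05):
    """Transport the ensemble X of shape (n, 2) under piecewise-constant u."""
    for u in controls:                                   # u = (u_1, u_2)
        X = X + dt * sum(ui * Fi(X) for ui, Fi in zip(u, F))
    return X

X0 = np.random.default_rng(4).normal(0, 1, (200, 2))
X1 = flow(X0, controls=[(1.0, 0.2)] * 40)                # rotate and expand
print(X0.mean(axis=0), X1.mean(axis=0))
```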
- Continuous-Time Model-Based Reinforcement Learning [4.427447378048202]
We propose a continuous-time MBRL framework based on a novel actor-critic method.
We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems.
arXiv Detail & Related papers (2021-02-09T11:30:19Z)
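As a hedged illustration of planning in continuous time, the sketch below rolls out a (here assumed-known) ODE model with scipy's solver, accumulating the cost integral as an extra state, and improves a one-parameter policy on the model's predicted return. The actor-critic machinery of the paper is replaced by a simple grid search.

```python
# Illustrative sketch only: continuous-time model rollout for policy search
# on the toy system xdot = x + u with quadratic cost x^2 + u^2.
import numpy as np
from scipy.integrate import solve_ivp

def rollout_return(k):
    """Predicted return of the linear policy u = -k*x on the ODE model."""
    # s[1] accumulates the negative quadratic cost alongside the state s[0].
    dyn = lambda t, s: [s[0] - k * s[0], -(s[0] ** 2 + (k * s[0]) ** 2)]
    sol = solve_ivp(dyn, (0, 5), [1.0, 0.0], rtol=1e-6)
    return sol.y[1, -1]

ks = np.linspace(0.5, 4.0, 50)
best = ks[np.argmax([rollout_return(k) for k in ks])]
print("best gain on the model:", best)   # LQR optimum is 1 + sqrt(2)
```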
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
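A minimal sketch of the compound-kernel idea, under illustrative assumptions: for control-affine dynamics xdot = f(x) + g(x)u, a kernel of the form k((x,u),(x',u')) = k_f(x,x') + u k_g(x,x') u' keeps GP regression consistent with the affine-in-u structure. The SOCP controller synthesis itself is not reproduced here.

```python
# Hedged sketch of a compound kernel for a scalar control-affine system;
# squared-exponential base kernels and all hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(7)
se = lambda a, b, l: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / l ** 2)

def k_compound(x1, u1, x2, u2):
    # k_f(x, x') + u * k_g(x, x') * u' : a sum of PSD kernels, hence PSD.
    return se(x1, x2, 0.5) + np.outer(u1, u2) * se(x1, x2, 0.5)

x = rng.uniform(-2, 2, 60); u = rng.uniform(-1, 1, 60)
y = np.sin(x) + (1 + 0.5 * x ** 2) * u + rng.normal(0, 0.05, 60)  # xdot data

K = k_compound(x, u, x, u) + 0.05 ** 2 * np.eye(60)   # add noise variance
alpha = np.linalg.solve(K, y)

xq, uq = np.array([0.3]), np.array([0.7])
pred = k_compound(xq, uq, x, u) @ alpha
print("predicted xdot:", pred[0], " truth:", np.sin(0.3) + 1.045 * 0.7)
```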
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can meet the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
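The control-variate principle behind the paper, in a hedged sketch: if g approximates the integrand f and has a known integral G, then estimating G + E[f - g] lowers Monte Carlo variance. For brevity g is a quadratic least-squares fit rather than a neural network, and the integrand is an arbitrary choice.

```python
# Illustrative sketch of a control variate; not the paper's neural method.
import numpy as np

f = lambda x: np.exp(-x) * np.sin(4 * x)       # integrand on [0, 1]
rng = np.random.default_rng(5)

# Fit g(x) = a*x^2 + b*x + c on pilot samples; its integral is closed-form.
xp = rng.uniform(0, 1, 1000)
A = np.stack([xp ** 2, xp, np.ones_like(xp)], axis=1)
a, b, c = np.linalg.lstsq(A, f(xp), rcond=None)[0]
g = lambda x: a * x ** 2 + b * x + c
G = a / 3 + b / 2 + c                           # integral of g over [0, 1]

# Compare estimator spread over many independent small batches.
plain, cv = [], []
for _ in range(500):
    x = rng.uniform(0, 1, 64)
    plain.append(f(x).mean())
    cv.append(G + (f(x) - g(x)).mean())
print("std plain MC:", np.std(plain), " std with control variate:", np.std(cv))
```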
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
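A heavily simplified, hedged sketch of optimism in the face of uncertainty, for a fully observed scalar system x' = a*x + u + w (so LQR rather than LQG, with no Kalman filtering): estimate a by least squares, then act according to the cheapest-to-control model inside a shrinking confidence interval. LqgOpt's actual machinery is far richer; this shows the principle only.

```python
# Illustrative sketch; all constants and the confidence radius are assumptions.
import numpy as np

rng = np.random.default_rng(8)

def riccati_cost(a):
    """Optimal cost-to-go coefficient for x' = a*x + u, cost x^2 + u^2."""
    p = 1.0
    for _ in range(500):                        # fixed-point iteration
        p = 1 + a * a * p - (a * p) ** 2 / (1 + p)
    return p

a_true, x, X, Y = 1.3, 0.0, [], []
for t in range(300):
    a_hat = (np.dot(X, Y) / (np.dot(X, X) + 1e-6)) if X else 0.0
    beta = 2.0 / np.sqrt(len(X) + 1)            # shrinking confidence radius
    a_opt = min(np.linspace(a_hat - beta, a_hat + beta, 21), key=riccati_cost)
    p = riccati_cost(a_opt)
    u = -(a_opt * p) / (1 + p) * x              # LQR gain for optimistic model
    x_next = a_true * x + u + rng.normal(0, 0.1)
    X.append(x); Y.append(x_next - u)           # regress x' - u on x
    x = x_next
print("final estimate of a:", np.dot(X, Y) / np.dot(X, X))
```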
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this kernel-to-rich transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
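A hedged sketch of the kernel-to-rich transition in the simplest model where it appears, a diagonal linear network w = u*u - v*v trained by gradient descent on an underdetermined regression problem: a large initialization scale alpha stays close to a minimum-norm, kernel-like solution, while a small alpha finds sparser, rich-regime solutions. Hyperparameters are illustrative.

```python
# Illustrative sketch of the initialization-scale effect, not the paper's
# exact experimental setup.
import numpy as np

rng = np.random.default_rng(9)
n, d = 10, 40
A = rng.normal(0, 1, (n, d))
w_star = np.zeros(d); w_star[:3] = [2.0, -1.5, 1.0]     # sparse ground truth
y = A @ w_star

def train(alpha, lr=1e-3, steps=100_000):
    """Gradient descent on 0.5*||A(u*u - v*v) - y||^2 from scale-alpha init."""
    u = np.full(d, alpha); v = np.full(d, alpha)
    for _ in range(steps):
        g = A.T @ (A @ (u * u - v * v) - y)
        u, v = u - lr * 2 * u * g, v + lr * 2 * v * g
    return u * u - v * v

for alpha in (1.0, 0.01):
    w = train(alpha)
    print(f"alpha={alpha}: l1 norm={np.abs(w).sum():.2f}, "
          f"coords with |w|>0.1: {int(np.sum(np.abs(w) > 0.1))}")
```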
This list is automatically generated from the titles and abstracts of the papers in this site.
This site makes no guarantee as to the quality of the information presented and is not responsible for any consequences arising from its use.