Avoiding local minima in variational quantum eigensolvers with the
natural gradient optimizer
- URL: http://arxiv.org/abs/2004.14666v2
- Date: Tue, 19 May 2020 09:29:32 GMT
- Title: Avoiding local minima in variational quantum eigensolvers with the
natural gradient optimizer
- Authors: David Wierichs, Christian Gogolin, Michael Kastoryano
- Abstract summary: We compare the BFGS, ADAM, and Natural Gradient Descent (NatGrad) optimizers in the context of Variational Quantum Eigensolvers (VQEs).
We analyze their performance on the QAOA ansatz for the Transverse Field Ising Model (TFIM) as well as on overparametrized circuits with the ability to break the symmetry of the Hamiltonian.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We compare the BFGS optimizer, ADAM and Natural Gradient Descent (NatGrad) in
the context of Variational Quantum Eigensolvers (VQEs). We systematically
analyze their performance on the QAOA ansatz for the Transverse Field Ising
Model (TFIM) as well as on overparametrized circuits with the ability to break
the symmetry of the Hamiltonian. The BFGS algorithm is frequently unable to
find a global minimum for systems beyond about 20 spins and ADAM easily gets
trapped in local minima. On the other hand, NatGrad shows stable performance on
all considered system sizes, albeit at a significantly higher cost per epoch.
In sharp contrast to most classical gradient based learning, the performance of
all optimizers is found to decrease upon seemingly benign overparametrization
of the ansatz class, with BFGS and ADAM failing more often and more severely
than NatGrad. Additional tests for the Heisenberg XXZ model corroborate the
accuracy problems of BFGS in high dimensions, but they reveal some shortcomings
of NatGrad as well. Our results suggest that great care needs to be taken in
the choice of gradient based optimizers and the parametrization for VQEs.
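As an illustration of the natural gradient optimizer discussed above, the sketch below implements the update theta <- theta - eta * (F + lam*I)^(-1) * grad E(theta), where F is the Fubini-Study metric (the real part of the quantum geometric tensor). The two-qubit Hamiltonian, the two-parameter RY/CNOT ansatz, the step size, and the regularization constant are assumptions chosen for brevity; they are not the QAOA circuits, system sizes, or hyperparameters studied in the paper.

```python
import numpy as np

# Toy two-qubit transverse-field Ising Hamiltonian H = -Z0 Z1 - g * (X0 + X1)
# (an assumed example, much smaller than the systems studied in the paper).
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
g = 1.0
H = -np.kron(Z, Z) - g * (np.kron(X, I2) + np.kron(I2, X))

def ansatz_state(theta):
    """Hypothetical two-parameter ansatz: RY(theta[0]) x RY(theta[1]) on |00>, then a CNOT."""
    def ry(a):
        return np.array([[np.cos(a / 2), -np.sin(a / 2)],
                         [np.sin(a / 2),  np.cos(a / 2)]])
    cnot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
    psi0 = np.zeros(4)
    psi0[0] = 1.0
    return cnot @ np.kron(ry(theta[0]), ry(theta[1])) @ psi0

def energy(theta):
    psi = ansatz_state(theta)
    return float(np.real(np.vdot(psi, H @ psi)))

def energy_gradient(theta, eps=1e-6):
    """Central-difference gradient of the energy (a parameter-shift rule would also work)."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        grad[i] = (energy(tp) - energy(tm)) / (2 * eps)
    return grad

def fubini_study_metric(theta, eps=1e-6):
    """F_ij = Re<d_i psi|d_j psi> - Re(<d_i psi|psi><psi|d_j psi>), via finite differences."""
    psi = ansatz_state(theta)
    d = []
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        d.append((ansatz_state(tp) - ansatz_state(tm)) / (2 * eps))
    n = len(theta)
    F = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            F[i, j] = np.real(np.vdot(d[i], d[j])) \
                      - np.real(np.vdot(d[i], psi) * np.vdot(psi, d[j]))
    return F

# Natural gradient descent: precondition the gradient with the regularized inverse metric.
theta = np.array([0.4, -0.3])
eta, lam = 0.1, 1e-4  # assumed step size and Tikhonov regularization constant
for step in range(200):
    F = fubini_study_metric(theta)
    grad = energy_gradient(theta)
    theta = theta - eta * np.linalg.solve(F + lam * np.eye(len(theta)), grad)

print("VQE energy after NatGrad:", energy(theta))
print("exact ground-state energy:", np.linalg.eigvalsh(H)[0])
```

The higher cost per epoch noted in the abstract corresponds here to evaluating the metric, which requires additional state (or, on hardware, circuit) evaluations on top of the energy gradient.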
Related papers
- Achieving Constraints in Neural Networks: A Stochastic Augmented
Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ a Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
arXiv Detail & Related papers (2023-10-25T13:55:35Z) - Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning [9.202586157819693]
Stochastic proximal gradient methods for minimizing nonconvex composite objective functions typically rely on the Lipschitz smoothness of the differentiable part.
We propose a family of stochastic Bregman proximal gradient (SBPG) methods that require only smooth adaptivity.
MSBPG, a momentum-based variant, improves convergence by relaxing the minibatch size requirement.
arXiv Detail & Related papers (2023-06-26T08:54:46Z) - An Empirical Comparison of Optimizers for Quantum Machine Learning with
SPSA-based Gradients [1.2532932830320982]
We introduce a novel approach that uses approximated gradients from SPSA in combination with state-of-the-art classical optimizers.
We demonstrate numerically that this outperforms both standard SPSA and the parameter-shift rule in terms of convergence rate and absolute error in simple regression tasks. (A minimal sketch of the SPSA gradient estimate follows this list.)
arXiv Detail & Related papers (2023-04-27T15:19:49Z) - Using Differential Evolution to avoid local minima in Variational
Quantum Algorithms [0.0]
Variational Quantum Algorithms (VQAs) are among the most promising NISQ-era algorithms for harnessing quantum computing.
Our goal in this paper is to study alternative optimization methods that can avoid or reduce the effect of local minima and barren plateau problems. (A minimal differential-evolution sketch follows this list.)
arXiv Detail & Related papers (2023-03-21T20:31:06Z) - SketchySGD: Reliable Stochastic Optimization via Randomized Curvature
Estimates [19.420605210427635]
SketchySGD improves upon existing gradient methods in machine learning by using randomized low-rank approximations to the subsampled Hessian.
We show theoretically that SketchySGD with a fixed stepsize converges linearly to a small ball around the optimum.
In the ill-conditioned setting we show SketchySGD converges at a faster rate than SGD for least-squares problems.
arXiv Detail & Related papers (2022-11-16T01:05:41Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms for solving nonconvex optimization problems.
In this paper, we study two variants of this kind, namely Variance Reduced Langevin Dynamics and Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Predicting parameters for the Quantum Approximate Optimization Algorithm
for MAX-CUT from the infinite-size limit [0.05076419064097732]
We present an explicit algorithm to evaluate the performance of QAOA on MAX-CUT applied to random Erdos-Renyi graphs of expected degree $d$.
This analysis yields an explicit mapping between QAOA parameters for MAX-CUT on Erdos-Renyi graphs, and the Sherrington-Kirkpatrick model.
arXiv Detail & Related papers (2021-10-20T17:58:53Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch size and the step size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels.
arXiv Detail & Related papers (2020-06-25T17:13:23Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
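As referenced in the SPSA-based-gradients entry above, the core of that approach is the simultaneous perturbation (SPSA) gradient estimate, which approximates the full gradient from only two objective evaluations and can then be handed to a classical optimizer. The noisy quadratic toy loss, perturbation size, and step size below are assumptions for illustration, not the experimental setup of that paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_loss(theta):
    """Hypothetical noisy objective standing in for a measured circuit expectation value."""
    return float(np.sum((theta - 1.0) ** 2) + 0.01 * rng.normal())

def spsa_gradient(loss, theta, c=0.1):
    """SPSA estimate of the full gradient from only two loss evaluations:
        g_i ~= (loss(theta + c*delta) - loss(theta - c*delta)) / (2 * c * delta_i),
    where delta is a random +/-1 perturbation applied to all parameters at once."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c * delta) - loss(theta - c * delta)
    return diff / (2.0 * c * delta)

# The related paper feeds such estimates into classical optimizers; plain
# gradient descent is used here for brevity.
theta = np.zeros(5)
lr = 0.1  # assumed step size
for step in range(500):
    theta -= lr * spsa_gradient(toy_loss, theta)

# Parameters should end up near the minimizer at 1.0 (up to SPSA/measurement noise).
print("theta after SPSA-based descent:", np.round(theta, 2))
```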
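As referenced in the differential-evolution entry above, the sketch below shows a basic gradient-free DE/rand/1/bin loop on an assumed rugged toy objective. The population size, mutation factor, crossover rate, and bounds are generic defaults, not the settings used in that paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_energy(theta):
    """Hypothetical rugged landscape with many local minima in each coordinate."""
    return float(np.sum(np.sin(3.0 * theta) + 0.1 * theta ** 2))

def differential_evolution(objective, dim, pop_size=20, F=0.8, CR=0.9,
                           generations=200, bounds=(-5.0, 5.0)):
    """Basic DE/rand/1/bin: mutation with a random difference vector, binomial
    crossover, greedy selection. Being gradient-free and population-based, it
    searches the whole box rather than descending into the nearest local minimum."""
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fitness = np.array([objective(ind) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # three distinct population members, all different from i
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 size=3, replace=False)
            mutant = np.clip(pop[a] + F * (pop[b] - pop[c]), lo, hi)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True      # keep at least one mutant coordinate
            trial = np.where(cross, mutant, pop[i])
            f_trial = objective(trial)
            if f_trial <= fitness[i]:            # greedy selection
                pop[i], fitness[i] = trial, f_trial
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]

best_theta, best_value = differential_evolution(toy_energy, dim=4)
print("best objective value found:", round(best_value, 3))
```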
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.