Avoiding local minima in variational quantum eigensolvers with the
natural gradient optimizer
- URL: http://arxiv.org/abs/2004.14666v2
- Date: Tue, 19 May 2020 09:29:32 GMT
- Title: Avoiding local minima in variational quantum eigensolvers with the
natural gradient optimizer
- Authors: David Wierichs, Christian Gogolin, Michael Kastoryano
- Abstract summary: We compare the BFGS, ADAM, and Natural Gradient Descent (NatGrad) optimizers in the context of Variational Quantum Eigensolvers (VQEs).
We analyze their performance on the QAOA ansatz for the Transverse Field Ising Model (TFIM) as well as on overparametrized circuits with the ability to break the symmetry of the Hamiltonian.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We compare the BFGS optimizer, ADAM and Natural Gradient Descent (NatGrad) in
the context of Variational Quantum Eigensolvers (VQEs). We systematically
analyze their performance on the QAOA ansatz for the Transverse Field Ising
Model (TFIM) as well as on overparametrized circuits with the ability to break
the symmetry of the Hamiltonian. The BFGS algorithm is frequently unable to
find a global minimum for systems beyond about 20 spins and ADAM easily gets
trapped in local minima. On the other hand, NatGrad shows stable performance on
all considered system sizes, albeit at a significantly higher cost per epoch.
In sharp contrast to most classical gradient based learning, the performance of
all optimizers is found to decrease upon seemingly benign overparametrization
of the ansatz class, with BFGS and ADAM failing more often and more severely
than NatGrad. Additional tests for the Heisenberg XXZ model corroborate the
accuracy problems of BFGS in high dimensions, but they reveal some shortcomings
of NatGrad as well. Our results suggest that great care needs to be taken in
the choice of gradient based optimizers and the parametrization for VQEs.
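As an illustration of the natural gradient optimizer discussed above, the sketch below implements the update theta <- theta - eta * (F + lam*I)^(-1) * grad E(theta), where F is the Fubini-Study metric (the real part of the quantum geometric tensor). The two-qubit Hamiltonian, the two-parameter RY/CNOT ansatz, the step size, and the regularization constant are assumptions chosen for brevity; they are not the QAOA circuits, system sizes, or hyperparameters studied in the paper.

```python
import numpy as np

# Toy two-qubit transverse-field Ising Hamiltonian H = -Z0 Z1 - g * (X0 + X1)
# (an assumed example, much smaller than the systems studied in the paper).
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
g = 1.0
H = -np.kron(Z, Z) - g * (np.kron(X, I2) + np.kron(I2, X))

def ansatz_state(theta):
    """Hypothetical two-parameter ansatz: RY(theta[0]) x RY(theta[1]) on |00>, then a CNOT."""
    def ry(a):
        return np.array([[np.cos(a / 2), -np.sin(a / 2)],
                         [np.sin(a / 2),  np.cos(a / 2)]])
    cnot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)
    psi0 = np.zeros(4)
    psi0[0] = 1.0
    return cnot @ np.kron(ry(theta[0]), ry(theta[1])) @ psi0

def energy(theta):
    psi = ansatz_state(theta)
    return float(np.real(np.vdot(psi, H @ psi)))

def energy_gradient(theta, eps=1e-6):
    """Central-difference gradient of the energy (a parameter-shift rule would also work)."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        grad[i] = (energy(tp) - energy(tm)) / (2 * eps)
    return grad

def fubini_study_metric(theta, eps=1e-6):
    """F_ij = Re<d_i psi|d_j psi> - Re(<d_i psi|psi><psi|d_j psi>), via finite differences."""
    psi = ansatz_state(theta)
    d = []
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        d.append((ansatz_state(tp) - ansatz_state(tm)) / (2 * eps))
    n = len(theta)
    F = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            F[i, j] = np.real(np.vdot(d[i], d[j])) \
                      - np.real(np.vdot(d[i], psi) * np.vdot(psi, d[j]))
    return F

# Natural gradient descent: precondition the gradient with the regularized inverse metric.
theta = np.array([0.4, -0.3])
eta, lam = 0.1, 1e-4  # assumed step size and Tikhonov regularization constant
for step in range(200):
    F = fubini_study_metric(theta)
    grad = energy_gradient(theta)
    theta = theta - eta * np.linalg.solve(F + lam * np.eye(len(theta)), grad)

print("VQE energy after NatGrad:", energy(theta))
print("exact ground-state energy:", np.linalg.eigvalsh(H)[0])
```

The higher cost per epoch noted in the abstract corresponds here to evaluating the metric, which requires additional state (or, on hardware, circuit) evaluations on top of the energy gradient.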
Related papers
- Achieving Constraints in Neural Networks: A Stochastic Augmented
Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ a Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
arXiv Detail & Related papers (2023-10-25T13:55:35Z) - Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning [9.202586157819693]
Stochastic proximal gradient methods for minimizing nonconvex composite objective functions typically rely on the Lipschitz smoothness of the differentiable part.
We propose a family of stochastic Bregman proximal gradient (SBPG) methods that require only smooth adaptivity.
MSBPG, a momentum-based variant, improves convergence by relaxing the minibatch size requirement.
arXiv Detail & Related papers (2023-06-26T08:54:46Z) - An Empirical Comparison of Optimizers for Quantum Machine Learning with
SPSA-based Gradients [1.2532932830320982]
We introduce a novel approach that uses approximated gradients from SPSA in combination with state-of-the-art classical optimizers.
We demonstrate numerically that this outperforms both standard SPSA and the parameter-shift rule in terms of convergence rate and absolute error in simple regression tasks. (A minimal sketch of the SPSA gradient estimate follows this list.)
arXiv Detail & Related papers (2023-04-27T15:19:49Z) - Using Differential Evolution to avoid local minima in Variational
Quantum Algorithms [0.0]
Variational Quantum Algorithms (VQAs) are among the most promising NISQ-era algorithms for harnessing quantum computing.
Our goal in this paper is to study alternative optimization methods that can avoid or reduce the effect of local minima and barren plateau problems. (A minimal differential-evolution sketch follows this list.)
arXiv Detail & Related papers (2023-03-21T20:31:06Z) - SketchySGD: Reliable Stochastic Optimization via Randomized Curvature
Estimates [19.420605210427635]
SketchySGD improves upon existing gradient methods in machine learning by using randomized low-rank approximations to the subsampled Hessian.
We show theoretically that SketchySGD with a fixed stepsize converges linearly to a small ball around the optimum.
In the ill-conditioned setting we show SketchySGD converges at a faster rate than SGD for least-squares problems.
arXiv Detail & Related papers (2022-11-16T01:05:41Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms for solving nonconvex optimization problems.
In this paper, we study two variants of this kind, namely Variance Reduced Langevin Dynamics and Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Predicting parameters for the Quantum Approximate Optimization Algorithm
for MAX-CUT from the infinite-size limit [0.05076419064097732]
We present an explicit algorithm to evaluate the performance of QAOA on MAX-CUT applied to random Erdos-Renyi graphs of expected degree $d$.
This analysis yields an explicit mapping between QAOA parameters for MAX-CUT on Erdos-Renyi graphs, and the Sherrington-Kirkpatrick model.
arXiv Detail & Related papers (2021-10-20T17:58:53Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch size and the step size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels.
arXiv Detail & Related papers (2020-06-25T17:13:23Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
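As referenced in the SPSA-based-gradients entry above, the core of that approach is the simultaneous perturbation (SPSA) gradient estimate, which approximates the full gradient from only two objective evaluations and can then be handed to a classical optimizer. The noisy quadratic toy loss, perturbation size, and step size below are assumptions for illustration, not the experimental setup of that paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_loss(theta):
    """Hypothetical noisy objective standing in for a measured circuit expectation value."""
    return float(np.sum((theta - 1.0) ** 2) + 0.01 * rng.normal())

def spsa_gradient(loss, theta, c=0.1):
    """SPSA estimate of the full gradient from only two loss evaluations:
        g_i ~= (loss(theta + c*delta) - loss(theta - c*delta)) / (2 * c * delta_i),
    where delta is a random +/-1 perturbation applied to all parameters at once."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c * delta) - loss(theta - c * delta)
    return diff / (2.0 * c * delta)

# The related paper feeds such estimates into classical optimizers; plain
# gradient descent is used here for brevity.
theta = np.zeros(5)
lr = 0.1  # assumed step size
for step in range(500):
    theta -= lr * spsa_gradient(toy_loss, theta)

# Parameters should end up near the minimizer at 1.0 (up to SPSA/measurement noise).
print("theta after SPSA-based descent:", np.round(theta, 2))
```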
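As referenced in the differential-evolution entry above, the sketch below shows a basic gradient-free DE/rand/1/bin loop on an assumed rugged toy objective. The population size, mutation factor, crossover rate, and bounds are generic defaults, not the settings used in that paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_energy(theta):
    """Hypothetical rugged landscape with many local minima in each coordinate."""
    return float(np.sum(np.sin(3.0 * theta) + 0.1 * theta ** 2))

def differential_evolution(objective, dim, pop_size=20, F=0.8, CR=0.9,
                           generations=200, bounds=(-5.0, 5.0)):
    """Basic DE/rand/1/bin: mutation with a random difference vector, binomial
    crossover, greedy selection. Being gradient-free and population-based, it
    searches the whole box rather than descending into the nearest local minimum."""
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fitness = np.array([objective(ind) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # three distinct population members, all different from i
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 size=3, replace=False)
            mutant = np.clip(pop[a] + F * (pop[b] - pop[c]), lo, hi)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True      # keep at least one mutant coordinate
            trial = np.where(cross, mutant, pop[i])
            f_trial = objective(trial)
            if f_trial <= fitness[i]:            # greedy selection
                pop[i], fitness[i] = trial, f_trial
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]

best_theta, best_value = differential_evolution(toy_energy, dim=4)
print("best objective value found:", round(best_value, 3))
```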
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.