Large Language Models Can Help Mitigate Barren Plateaus
- URL: http://arxiv.org/abs/2502.13166v1
- Date: Mon, 17 Feb 2025 05:57:15 GMT
- Title: Large Language Models Can Help Mitigate Barren Plateaus
- Authors: Jun Zhuang, Chaowen Guan
- Abstract summary: Quantum Neural Networks (QNNs) have emerged as a promising approach for various applications, yet their training is often hindered by barren plateaus (BPs).
We propose a new Large Language Model (LLM)-driven search framework, AdaInit, that iteratively searches for optimal initial parameters of QNNs to maximize gradient variance and thereby mitigate BPs.
- Score: 2.384873896423002
- Abstract: In the era of noisy intermediate-scale quantum (NISQ) computing, Quantum Neural Networks (QNNs) have emerged as a promising approach for various applications, yet their training is often hindered by barren plateaus (BPs), where gradient variance vanishes exponentially as the model size increases. To address this challenge, we propose a new Large Language Model (LLM)-driven search framework, AdaInit, that iteratively searches for optimal initial parameters of QNNs to maximize gradient variance and thereby mitigate BPs. Unlike conventional one-time initialization methods, AdaInit dynamically refines the QNN's initialization using LLMs with adaptive prompting. Our theoretical analysis of the Expected Improvement (EI) establishes a supremum for the search, ensuring that the process eventually identifies the optimal initial parameters of the QNN. Extensive experiments across four public datasets demonstrate that AdaInit significantly enhances the trainability of QNNs compared to classic initialization methods, validating its effectiveness in mitigating BPs.
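The search loop itself is straightforward to sketch. Below is a minimal, hypothetical illustration using PennyLane, in which the LLM proposal step is replaced by a random-perturbation stub (`propose_params`) and gradient variance is estimated by sampling around each candidate; the paper's adaptive prompting and EI analysis are not reproduced here.

```python
import pennylane as qml
from pennylane import numpy as np  # autograd-aware NumPy

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qnn(params):
    # Hardware-efficient ansatz: RY rotations plus a ring of CNOTs per layer.
    for layer in range(n_layers):
        for w in range(n_qubits):
            qml.RY(params[layer, w], wires=w)
        for w in range(n_qubits):
            qml.CNOT(wires=[w, (w + 1) % n_qubits])
    return qml.expval(qml.PauliZ(0))

def gradient_variance(init, n_samples=25, spread=0.1):
    # Estimate Var[dC/dtheta_0] by sampling parameters around the candidate
    # init; a barren plateau corresponds to this variance vanishing.
    grad_fn = qml.grad(qnn)
    grads = [grad_fn(init + spread * np.random.randn(*init.shape))[0, 0]
             for _ in range(n_samples)]
    return float(np.var(np.array(grads)))

def propose_params(best):
    # Stand-in for the paper's adaptive LLM prompting: perturb the incumbent.
    return best + 0.3 * np.random.randn(*best.shape)

best = np.random.uniform(-np.pi, np.pi, size=(n_layers, n_qubits))
best_var = gradient_variance(best)
for _ in range(10):  # iterative search: keep the init with larger variance
    cand = propose_params(best)
    var = gradient_variance(cand)
    if var > best_var:
        best, best_var = cand, var
print("best gradient variance found:", best_var)
```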
Related papers
- Q-MAML: Quantum Model-Agnostic Meta-Learning for Variational Quantum Algorithms [4.525216077859531]
We introduce a new framework for optimizing parameterized quantum circuits (PQCs) that employs a classical technique inspired by Model-Agnostic Meta-Learning (MAML).
Our framework features a classical neural network, called Learner, which interacts with a PQC, using the Learner's output as the PQC's initial parameters.
In the adaptation phase, the framework requires only a few PQC updates to converge to a more accurate value, while the learner remains unchanged.
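A minimal sketch of this interaction, with a PyTorch network standing in for the classical Learner and a toy differentiable cost standing in for the PQC (both assumptions for illustration, not the paper's implementation):

```python
import torch

n_params = 8

# Classical "Learner" network: task embedding -> initial PQC parameters.
learner = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.Tanh(), torch.nn.Linear(32, n_params)
)

def pqc_cost(theta, target):
    # Toy differentiable surrogate for a PQC expectation value (assumption).
    return ((torch.sin(theta) - target) ** 2).mean()

task = torch.randn(4)          # hypothetical task embedding
target = torch.rand(n_params)

# The Learner proposes the initial parameters for this task.
theta = learner(task).detach().clone().requires_grad_(True)

# Adaptation phase: only a few PQC-side updates; the Learner stays frozen.
opt = torch.optim.SGD([theta], lr=0.1)
for _ in range(5):
    opt.zero_grad()
    loss = pqc_cost(theta, target)
    loss.backward()
    opt.step()
print("adapted loss:", float(pqc_cost(theta, target)))
```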
arXiv Detail & Related papers (2025-01-10T12:07:00Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used for a wide range of learning tasks.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex training objective can be characterized as global optima of a subsampled convex program.
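For reference, the Lasso model named in the title is the standard L1-regularized least-squares program (notation assumed here, with dictionary A, targets y, and regularization weight beta):

```latex
\min_{z}\; \tfrac{1}{2}\,\lVert y - A z \rVert_2^2 + \beta\,\lVert z \rVert_1
```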
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets).
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
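A rough sketch of the pointer-attention mechanism that makes this pairing natural: attention scores over input positions can be read directly as Q-values, so the greedy action "points" at an input element. Shapes and names below are assumptions, not the authors' full architecture.

```python
import torch

d = 16
enc = torch.randn(10, d)     # encoder states, one per input position
query = torch.randn(d)       # decoder state at the current step
W1, W2, v = torch.randn(d, d), torch.randn(d, d), torch.randn(d)

# Additive (Bahdanau-style) pointer scores, interpreted here as Q-values.
scores = torch.tanh(enc @ W1 + query @ W2) @ v   # shape: (10,)
action = int(scores.argmax())                    # greedy: point at best position
print("selected position:", action, "Q-value:", float(scores[action]))
```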
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - Challenges of variational quantum optimization with measurement shot noise [0.0]
We study the scaling of the quantum resources to reach a fixed success probability as the problem size increases.
Our results suggest that hybrid quantum-classical algorithms should possibly avoid a brute force classical outer loop.
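The core scaling issue is easy to reproduce: an expectation value estimated from N single-shot measurements carries statistical error on the order of 1/sqrt(N), so each extra digit of precision costs a 100x larger shot budget. A small self-contained illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
true_ev = 0.3                    # assumed true expectation value of Z
p_plus = (1 + true_ev) / 2       # probability of a +1 outcome

for shots in (100, 10_000, 1_000_000):
    outcomes = rng.choice([1.0, -1.0], size=shots, p=[p_plus, 1 - p_plus])
    est = outcomes.mean()
    print(f"shots={shots:>9}: estimate={est:+.4f}, "
          f"error={abs(est - true_ev):.4f}, 1/sqrt(N)={shots ** -0.5:.4f}")
```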
arXiv Detail & Related papers (2023-07-31T18:01:15Z) - AskewSGD : An Annealed interval-constrained Optimisation method to train
Quantized Neural Networks [12.229154524476405]
We develop a new algorithm, Annealed Skewed SGD - AskewSGD - for training deep neural networks (DNNs) with quantized weights.
Unlike algorithms with active sets and feasible directions, AskewSGD avoids projections or optimization under the entire feasible set.
Experimental results show that the AskewSGD algorithm performs better than or on par with state-of-the-art methods in classical benchmarks.
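For context, the conventional baseline this line of work is positioned against is straight-through-estimator (STE) training of quantized weights; the sketch below shows that standard baseline, not the AskewSGD algorithm itself.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)     # quantize weights to {-1, +1}

    @staticmethod
    def backward(ctx, g):
        return g                 # straight-through: pass the gradient unchanged

w = torch.randn(5, requires_grad=True)
x = torch.randn(5)
loss = (BinarizeSTE.apply(w) * x).sum() ** 2
loss.backward()                  # gradients reach w despite sign()'s zero derivative
print(w.grad)
```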
arXiv Detail & Related papers (2022-11-07T18:13:44Z) - LAWS: Look Around and Warm-Start Natural Gradient Descent for Quantum
Neural Networks [11.844238544360149]
Variational quantum algorithms (VQAs) have recently received significant attention due to their promising performance on Noisy Intermediate-Scale Quantum (NISQ) computers.
VQAs run on parameterized quantum circuits (PQCs); with randomly initialized parameters, they are characterized by barren plateaus (BPs), where the gradient vanishes exponentially in the number of qubits.
In this paper, we first revisit the quantum natural gradient (QNG), one of the most popular algorithms used in VQAs, from the classical first-order optimization point of view.
Then, we propose a Look Around and Warm-Start QNG (LAWS) algorithm to mitigate BPs.
arXiv Detail & Related papers (2022-05-05T14:16:40Z) - A Meta-Learning Approach to the Optimal Power Flow Problem Under
Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z) - Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural
Networks [0.0]
We propose a new pruning method called Pruning for Quantization (PfQ) which removes the filters that disturb the fine-tuning of the DNN.
Experiments using well-known models and datasets confirmed that the proposed method achieves higher performance with a similar model size.
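A generic magnitude-based filter-pruning sketch conveys the shape of the operation; note that PfQ selects filters by its own criterion (those that disturb fine-tuning), which is not reproduced here.

```python
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one L1 norm per filter
keep = norms.argsort(descending=True)[:4]              # keep the 4 largest filters

pruned = torch.nn.Conv2d(3, 4, kernel_size=3)          # smaller replacement layer
with torch.no_grad():
    pruned.weight.copy_(conv.weight[keep])
    pruned.bias.copy_(conv.bias[keep])
print("kept filters:", keep.tolist())
```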
arXiv Detail & Related papers (2020-11-13T04:12:54Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
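The diagnostic described above can be sketched with Hessian-vector products, which estimate the Hessian's spectral norm via power iteration without ever forming the matrix; the model and loss below are toy choices.

```python
import torch

w = torch.randn(20, requires_grad=True)
x, y = torch.randn(50, 20), torch.randn(50)
loss = ((x @ w - y) ** 2).mean()

(g,) = torch.autograd.grad(loss, w, create_graph=True)  # gradient, kept in graph

v = torch.randn(20)
v /= v.norm()
for _ in range(20):                                     # power iteration on H
    (hv,) = torch.autograd.grad(g @ v, w, retain_graph=True)  # Hv product
    v = hv / hv.norm()
(hv,) = torch.autograd.grad(g @ v, w, retain_graph=True)
print("estimated Hessian spectral norm:", float(v @ hv))
```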
arXiv Detail & Related papers (2020-04-20T18:12:56Z) - Optimistic Exploration even with a Pessimistic Initialisation [57.41327865257504]
Optimistic initialisation is an effective strategy for efficient exploration in reinforcement learning (RL).
In particular, in scenarios with only positive rewards, Q-values are initialised at their lowest possible values.
We propose a simple count-based augmentation to pessimistically initialised Q-values that separates the source of optimism from the neural network.
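A minimal sketch of the count-based augmentation: Q-values stay pessimistically initialised, and optimism is injected through a bonus of the form beta / sqrt(N(s, a)) added outside the function approximator (the constants here are assumptions).

```python
import numpy as np

n_states, n_actions, beta = 5, 3, 1.0
q = np.zeros((n_states, n_actions))      # pessimistic init: lowest possible
                                         # value when rewards are non-negative
counts = np.zeros((n_states, n_actions))

def act(state):
    # Optimism comes from the count bonus, not from the Q-function itself.
    bonus = beta / np.sqrt(counts[state] + 1)
    return int(np.argmax(q[state] + bonus))

a = act(0)
counts[0, a] += 1
print("chosen action:", a)
```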
arXiv Detail & Related papers (2020-02-26T17:15:53Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based learning combined with non-convexity renders training susceptible to initialization.
We propose fusing neighboring layers of deeper networks that are trained with random initialization, as illustrated below.
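The simplest instance of layer fusion is the exact algebraic collapse of two stacked linear maps; the paper's contribution concerns fusing layers of trained deep (nonlinear) networks, so this snippet only shows the underlying algebra.

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.standard_normal((6, 4)), rng.standard_normal(6)
W2, b2 = rng.standard_normal((3, 6)), rng.standard_normal(3)

# W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2): one fused layer.
Wf, bf = W2 @ W1, W2 @ b1 + b2

x = rng.standard_normal(4)
assert np.allclose(W2 @ (W1 @ x + b1) + b2, Wf @ x + bf)
print("fusion is exact for stacked linear layers")
```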
arXiv Detail & Related papers (2020-01-28T18:25:15Z)