QAMA: Scalable Quantum Annealing Multi-Head Attention Operator for Deep Learning
- URL: http://arxiv.org/abs/2504.11083v2
- Date: Sun, 12 Oct 2025 03:17:01 GMT
- Title: QAMA: Scalable Quantum Annealing Multi-Head Attention Operator for Deep Learning
- Authors: Peng Du, Jinjing Shi, Wenxuan Wang, Yin Ma, Kai Wen, Xuelong Li
- Abstract summary: Quantum Annealing Multi-Head Attention (QAMA) is proposed, a novel drop-in operator that reformulates attention as an energy-based Hamiltonian optimization problem. In this framework, token interactions are encoded into binary quadratic terms, and quantum annealing is employed to search for low-energy configurations. Empirically, evaluation on both natural language and vision benchmarks shows that, across tasks, accuracy deviates by at most 2.7 points from standard multi-head attention.
- Score: 48.12231190677108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attention mechanisms underpin modern deep learning, but their quadratic time and space complexity limits scalability for long sequences. To address this, Quantum Annealing Multi-Head Attention (QAMA) is proposed, a novel drop-in operator that reformulates attention as an energy-based Hamiltonian optimization problem. In this framework, token interactions are encoded into binary quadratic terms, and quantum annealing is employed to search for low-energy configurations that correspond to effective attention patterns. Unlike classical sparse or approximate attention methods that rely on hand-crafted heuristics, QAMA allows sparsity structures to emerge naturally from the optimization process. Theoretically, computational complexity is analysed through single-spin-flip dynamics, providing time-to-solution runtime bounds that depend on the spectral properties of the annealing Hamiltonian. Empirically, evaluation on both natural language and vision benchmarks shows that, across tasks, accuracy deviates by at most 2.7 points from standard multi-head attention, while requiring a number of qubits that scales only linearly with sequence length. Visualizations further reveal that the Hamiltonian penalty terms induce meaningful and interpretable sparsity across heads. Finally, deployment on a coherent Ising machine validates the feasibility of running QAMA on real quantum hardware, showing tangible inference-time reductions compared with classical implementations. These results highlight QAMA as a pioneering and scalable step toward integrating quantum optimization devices into deep neural architectures, providing a seamlessly integrable and hardware-compatible alternative to conventional attention mechanisms. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
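To make the "attention as binary quadratic optimization" idea concrete, the sketch below is a minimal, hypothetical illustration, not the paper's actual QAMA operator: for a single query, query-key affinity scores are encoded into a QUBO whose low-energy states select roughly k keys (a quadratic penalty softly enforces the cardinality constraint), and the ground state is found by exhaustive enumeration as a classical stand-in for the quantum annealer. The function names, the toy scores, and the penalty weight are all assumptions for illustration.

```python
def attention_qubo(scores, k, penalty=10.0):
    """Build an upper-triangular QUBO matrix Q so that minimizing
    E(x) = sum_i Q[i][i]*x_i + sum_{i<j} Q[i][j]*x_i*x_j over binary x
    selects roughly k high-affinity keys for one query.

    Derivation: minimize -sum_i s_i x_i + penalty * (sum_i x_i - k)^2,
    expanded using x_i^2 = x_i for binary variables (the constant
    penalty * k^2 term is dropped, as it does not affect the argmin).
    """
    n = len(scores)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = -scores[i] + penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            Q[i][j] = 2 * penalty
    return Q

def energy(Q, x):
    """QUBO energy for a binary configuration x."""
    n = len(x)
    e = sum(Q[i][i] * x[i] for i in range(n))
    e += sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return e

def ground_state(Q):
    """Exact minimizer by enumeration -- a classical stand-in for the
    annealer, feasible only for tiny n (2^n configurations)."""
    n = len(Q)
    best_x, best_e = None, float("inf")
    for bits in range(1 << n):
        x = [(bits >> i) & 1 for i in range(n)]
        e = energy(Q, x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

scores = [0.9, 0.1, 0.8, 0.05, 0.7, 0.2]  # toy query-key affinities
mask, _ = ground_state(attention_qubo(scores, k=3))
print(mask)  # prints [1, 0, 1, 0, 1, 0]: the 3 highest-scoring keys
```

On real annealing hardware, the enumeration step would be replaced by the device's energy minimization; the resulting binary mask would then gate which key-value pairs each query attends to, which is how sparsity can "emerge" from the optimization rather than from a hand-crafted pattern.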
Related papers
- Exponential Quantum Speedup on Structured Hard Instances of Maximum Independent Set [0.0]
We identify a family of classically hard maximum independent set (MIS) instances, and design and analyze a non-stoquastic adiabatic quantum optimization algorithm. The algorithm achieves an exponential speedup over both transverse-field quantum and state-of-the-art classical solvers. This identifies a distinctive quantum mechanism underlying the speedup and explains why no efficient classical analogue is likely to exist.
arXiv Detail & Related papers (2026-01-25T04:18:35Z) - Hybrid Quantum-Classical Selective State Space Artificial Intelligence [1.4896509623302832]
We propose a Hybrid Quantum Classical selection mechanism for the Mamba architecture for temporal sequence classification problems. Our approach leverages Variational Quantum Circuits (VQCs) as quantum gating modules that both enhance feature extraction and improve suppression of irrelevant information. We analyze how introducing quantum subroutines into large language models (LLMs) impacts their generalization capability, expressivity, and parameter efficiency.
arXiv Detail & Related papers (2025-11-11T15:26:57Z) - Quantum Graph Attention Network: A Novel Quantum Multi-Head Attention Mechanism for Graph Learning [0.0]
Quantum Graph Attention Network (QGAT) is a hybrid graph neural network that integrates variational quantum circuits into the attention mechanism. We show QGAT's effectiveness in capturing complex structural dependencies and improved generalization in inductive scenarios. Experiments confirm that quantum embedding enhances robustness against feature and structural noise, suggesting advantages in handling real-world noisy data.
arXiv Detail & Related papers (2025-08-25T03:25:48Z) - VQC-MLPNet: An Unconventional Hybrid Quantum-Classical Architecture for Scalable and Robust Quantum Machine Learning [60.996803677584424]
Variational Quantum Circuits (VQCs) offer a novel pathway for quantum machine learning. Their practical application is hindered by inherent limitations such as constrained linear expressivity, optimization challenges, and acute sensitivity to quantum hardware noise. This work introduces VQC-MLPNet, a scalable and robust hybrid quantum-classical architecture designed to overcome these obstacles.
arXiv Detail & Related papers (2025-06-12T01:38:15Z) - RhoDARTS: Differentiable Quantum Architecture Search with Density Matrix Simulations [48.670876200492415]
Variational Quantum Algorithms (VQAs) are a promising approach for leveraging powerful Noisy Intermediate-Scale Quantum (NISQ) computers. We propose $\rho$DARTS, a differentiable Quantum Architecture Search (QAS) algorithm that models the search process as the evolution of a quantum mixed state.
arXiv Detail & Related papers (2025-06-04T08:30:35Z) - Quantum Adaptive Self-Attention for Quantum Transformer Models [0.0]
We propose Quantum Adaptive Self-Attention (QASA), a novel hybrid architecture that enhances classical Transformer models with a quantum attention mechanism.
QASA replaces dot-product attention with a parameterized quantum circuit (PQC) that adaptively captures inter-token relationships in the quantum Hilbert space.
Experiments on synthetic time-series tasks demonstrate that QASA achieves faster convergence and superior generalization compared to both standard Transformers and reduced classical variants.
arXiv Detail & Related papers (2025-04-05T02:52:37Z) - Quantum-Enhanced Attention Mechanism in NLP: A Hybrid Classical-Quantum Approach [0.0]
We present a hybrid classical-quantum Transformer model that integrates a quantum-enhanced attention mechanism into the standard classical architecture.<n>We demonstrate the effectiveness of this approach across diverse NLP benchmarks, showing improvements in both efficiency and representational capacity.
arXiv Detail & Related papers (2025-01-26T18:29:06Z) - Leveraging Pre-Trained Neural Networks to Enhance Machine Learning with Variational Quantum Circuits [48.33631905972908]
We introduce an innovative approach that utilizes pre-trained neural networks to enhance Variational Quantum Circuits (VQCs).
This technique effectively separates approximation error from qubit count and removes the need for restrictive conditions.
Our results extend to applications such as human genome analysis, demonstrating the broad applicability of our approach.
arXiv Detail & Related papers (2024-11-13T12:03:39Z) - QCircuitBench: A Large-Scale Dataset for Benchmarking Quantum Algorithm Design [63.02824918725805]
Quantum computing is recognized for the significant speedup it offers over classical computing through quantum algorithms. QCircuitBench is the first benchmark dataset designed to evaluate AI's capability in designing and implementing quantum algorithms.
arXiv Detail & Related papers (2024-10-10T14:24:30Z) - Hybrid Quantum-Classical Clustering for Preparing a Prior Distribution of Eigenspectrum [10.950807972899575]
We consider preparing the prior distribution and circuits for the eigenspectrum of time-independent Hamiltonians.
The proposed algorithm unfolds in three strategic steps: Hamiltonian transformation, parameter representation, and classical clustering.
The algorithm is showcased through applications to the 1D Heisenberg system and the LiH molecular system.
arXiv Detail & Related papers (2024-06-29T14:21:55Z) - Quantum-Train: Rethinking Hybrid Quantum-Classical Machine Learning in the Model Compression Perspective [7.7063925534143705]
We introduce the Quantum-Train (QT) framework, a novel approach that integrates quantum computing with machine learning algorithms.
QT achieves remarkable results by employing a quantum neural network alongside a classical mapping model.
arXiv Detail & Related papers (2024-05-18T14:35:57Z) - Practical Few-Atom Quantum Reservoir Computing [0.0]
Quantum Reservoir Computing (QRC) harnesses quantum systems to tackle intricate computational problems with exceptional efficiency and minimized energy usage. This paper presents a QRC framework that utilizes a minimalistic quantum reservoir, consisting of only a few two-level atoms within an optical cavity.
arXiv Detail & Related papers (2024-05-08T04:14:31Z) - Graph Learning for Parameter Prediction of Quantum Approximate Optimization Algorithm [14.554010382366302]
The Quantum Approximate Optimization Algorithm (QAOA) stands out for its potential to efficiently solve the Max-Cut problem.
We use Graph Neural Networks (GNNs) as a warm-start technique to optimize QAOA.
Our findings show GNN's potential in improving QAOA performance, opening new avenues for hybrid quantum-classical approaches in quantum computing.
arXiv Detail & Related papers (2024-03-05T20:23:25Z) - Quantum algorithms: A survey of applications and end-to-end complexities [88.57261102552016]
The anticipated applications of quantum computers span across science and industry. We present a survey of several potential application areas of quantum algorithms. We outline the challenges and opportunities in each area in an "end-to-end" fashion.
arXiv Detail & Related papers (2023-10-04T17:53:55Z) - Pre-training Tensor-Train Networks Facilitates Machine Learning with Variational Quantum Circuits [70.97518416003358]
Variational quantum circuits (VQCs) hold promise for quantum machine learning on noisy intermediate-scale quantum (NISQ) devices.
While tensor-train networks (TTNs) can enhance VQC representation and generalization, the resulting hybrid model, TTN-VQC, faces optimization challenges due to the Polyak-Lojasiewicz (PL) condition.
To mitigate this challenge, we introduce Pre+TTN-VQC, a pre-trained TTN model combined with a VQC.
arXiv Detail & Related papers (2023-05-18T03:08:18Z) - Quantum Annealing for Single Image Super-Resolution [86.69338893753886]
We propose a quantum computing-based algorithm to solve the single image super-resolution (SISR) problem.
The proposed AQC-based algorithm is demonstrated to achieve improved speed-up over a classical analog while maintaining comparable SISR accuracy.
arXiv Detail & Related papers (2023-04-18T11:57:15Z) - A Framework for Demonstrating Practical Quantum Advantage: Racing Quantum against Classical Generative Models [62.997667081978825]
We build on a previously proposed framework for evaluating the generalization performance of generative models.
We establish the first comparative race towards practical quantum advantage (PQA) between classical and quantum generative models.
Our results suggest that QCBMs are more efficient in the data-limited regime than the other state-of-the-art classical generative models.
arXiv Detail & Related papers (2023-03-27T22:48:28Z) - MAQA: A Quantum Framework for Supervised Learning [2.064612766965483]
This work proposes a universal, efficient framework that can reproduce the output of a plethora of classical supervised machine learning algorithms.
The proposed framework is named Multiple Aggregator Quantum Algorithm (MAQA) due to its capability to combine multiple and diverse functions.
As a second meaningful addition, we discuss the adoption of the proposed framework as hybrid quantum-classical and fault-tolerant quantum algorithm.
arXiv Detail & Related papers (2023-03-20T11:18:22Z) - Synergy Between Quantum Circuits and Tensor Networks: Short-cutting the Race to Practical Quantum Advantage [43.3054117987806]
We introduce a scalable procedure for harnessing classical computing resources to provide pre-optimized initializations for quantum circuits.
We show this method significantly improves the trainability and performance of PQCs on a variety of problems.
By demonstrating a means of boosting limited quantum resources using classical computers, our approach illustrates the promise of this synergy between quantum and quantum-inspired models in quantum computing.
arXiv Detail & Related papers (2022-08-29T15:24:03Z) - QSAN: A Near-term Achievable Quantum Self-Attention Network [73.15524926159702]
Self-Attention Mechanism (SAM) is good at capturing the internal connections of features.
A novel Quantum Self-Attention Network (QSAN) is proposed for image classification tasks on near-term quantum devices.
arXiv Detail & Related papers (2022-07-14T12:22:51Z) - Adiabatic Quantum Computing for Multi Object Tracking [170.8716555363907]
Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time.
As these optimization problems are often NP-hard, they can only be solved exactly for small instances on current hardware.
We show that our approach is competitive compared with state-of-the-art optimization-based approaches, even when using off-the-shelf integer programming solvers.
arXiv Detail & Related papers (2022-02-17T18:59:20Z) - Demonstration of multi-qubit entanglement and algorithms on a programmable neutral atom quantum computer [0.0]
Neutral atom hyperfine qubits provide inherent scalability due to their identical characteristics, long coherence times, and ability to be trapped in dense multi-dimensional arrays.
We demonstrate several quantum algorithms on a programmable gate model neutral atom quantum computer in an architecture based on individual addressing of single atoms with tightly focused optical beams scanned across a two-dimensional array of qubits.
arXiv Detail & Related papers (2021-12-29T15:02:43Z) - Realization of arbitrary doubly-controlled quantum phase gates [62.997667081978825]
We introduce a high-fidelity gate set inspired by a proposal for near-term quantum advantage in optimization problems.
By orchestrating coherent, multi-level control over three transmon qutrits, we synthesize a family of deterministic, continuous-angle quantum phase gates acting in the natural three-qubit computational basis.
arXiv Detail & Related papers (2021-08-03T17:49:09Z) - Machine Learning Framework for Quantum Sampling of Highly-Constrained, Continuous Optimization Problems [101.18253437732933]
We develop a generic, machine learning-based framework for mapping continuous-space inverse design problems into surrogate unconstrained binary optimization problems.
We showcase the framework's performance on two inverse design problems by optimizing (i) thermal emitter topologies for thermophotovoltaic applications and (ii) diffractive meta-gratings for highly efficient beam steering.
arXiv Detail & Related papers (2021-05-06T02:22:23Z) - Information Scrambling in Computationally Complex Quantum Circuits [56.22772134614514]
We experimentally investigate the dynamics of quantum scrambling on a 53-qubit quantum processor.
We show that while operator spreading is captured by an efficient classical model, operator entanglement requires exponentially scaled computational resources to simulate.
arXiv Detail & Related papers (2021-01-21T22:18:49Z) - Low depth mechanisms for quantum optimization [0.25295633594332334]
We focus on developing a language and tools connected with kinetic energy on a graph for understanding the physical mechanisms of success and failure to guide algorithmic improvement.
This is connected to effects from wavefunction confinement, phase randomization, and shadow defects lurking in the objective far away from the ideal solution.
arXiv Detail & Related papers (2020-08-19T18:16:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.