Quantum Adaptive Self-Attention for Quantum Transformer Models
- URL: http://arxiv.org/abs/2504.05336v1
- Date: Sat, 05 Apr 2025 02:52:37 GMT
- Title: Quantum Adaptive Self-Attention for Quantum Transformer Models
- Authors: Chi-Sheng Chen, En-Jui Kuo
- Abstract summary: We propose Quantum Adaptive Self-Attention (QASA), a novel hybrid architecture that enhances classical Transformer models with a quantum attention mechanism. QASA replaces dot-product attention with a parameterized quantum circuit (PQC) that adaptively captures inter-token relationships in the quantum Hilbert space. Experiments on synthetic time-series tasks demonstrate that QASA achieves faster convergence and superior generalization compared to both standard Transformers and reduced classical variants.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer models have revolutionized sequential learning across various domains, yet their self-attention mechanism incurs quadratic computational cost, posing limitations for real-time and resource-constrained tasks. To address this, we propose Quantum Adaptive Self-Attention (QASA), a novel hybrid architecture that enhances classical Transformer models with a quantum attention mechanism. QASA replaces dot-product attention with a parameterized quantum circuit (PQC) that adaptively captures inter-token relationships in the quantum Hilbert space. Additionally, a residual quantum projection module is introduced before the feedforward network to further refine temporal features. Our design retains classical efficiency in earlier layers while injecting quantum expressiveness in the final encoder block, ensuring compatibility with current NISQ hardware. Experiments on synthetic time-series tasks demonstrate that QASA achieves faster convergence and superior generalization compared to both standard Transformers and reduced classical variants. Preliminary complexity analysis suggests potential quantum advantages in gradient computation, opening new avenues for efficient quantum deep learning models.
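The central substitution described in the abstract, scoring token pairs with a PQC expectation value instead of a raw dot product, can be illustrated with a small classical simulation. This is only a sketch under stated assumptions, not the authors' circuit: the single-qubit layout, the angle encoding of the scaled dot product, and the names `apply_ry`, `pqc_score`, and `pqc_attention` are all hypothetical. It also still evaluates every token pair, so it shows the scoring idea only and says nothing about complexity.

```python
import math

def apply_ry(state, angle):
    """Apply a single-qubit RY(angle) rotation to a real two-amplitude state."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)

def pqc_score(x, theta):
    """Toy PQC (assumed layout): RY(x) encodes data, RY(theta) is trainable.

    Returns the Pauli-Z expectation on the final state, which equals
    cos(x + theta) analytically because successive RY rotations add.
    """
    state = apply_ry(apply_ry((1.0, 0.0), x), theta)
    return state[0] ** 2 - state[1] ** 2

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def pqc_attention(queries, keys, values, theta):
    """Attention whose pairwise scores come from pqc_score (hypothetical)."""
    d = len(queries[0])
    weights = []
    for q in queries:
        # Scaled dot product used as the encoding angle (an assumption).
        angles = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax([pqc_score(x, theta) for x in angles]))
    dim_v = len(values[0])
    outputs = [
        [sum(w[j] * values[j][c] for j in range(len(values))) for c in range(dim_v)]
        for w in weights
    ]
    return outputs, weights

# Three toy tokens with two features each; self-attention uses them as Q, K, V.
tokens = [[0.1, 0.4], [0.9, -0.2], [-0.5, 0.3]]
outputs, weights = pqc_attention(tokens, tokens, tokens, theta=0.3)
```

Each row of `weights` is a probability distribution over tokens, and `theta` plays the role of the trainable circuit parameter. On real hardware the expectation value would be estimated from repeated measurements rather than computed exactly as it is here.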
Related papers
- Quantum parallel information exchange (QPIE) hybrid network with transfer learning [18.43273756128771]
Quantum machine learning (QML) has emerged as an innovative framework with the potential to uncover complex patterns. We introduce quantum parallel information exchange (QPIE) hybrid network, a new non-sequential hybrid classical quantum model architecture. We develop a dynamic gradient selection method that applies the parameter shift rule on quantum processing units.
arXiv Detail & Related papers (2025-04-05T17:25:26Z) - A Survey of Quantum Transformers: Approaches, Advantages, Challenges, and Future Directions [2.5871385953824855]
Quantum Transformer models represent a significant research direction in quantum machine learning (QML). PQC-based Transformer models are the primary focus of current research. Quantum Linear Algebra (QLA)-based Transformer models rely on future fault-tolerant quantum computing.
arXiv Detail & Related papers (2025-04-04T05:40:18Z) - Toward Large-Scale Distributed Quantum Long Short-Term Memory with Modular Quantum Computers [5.673361333697935]
We introduce a Distributed Quantum Long Short-Term Memory (QLSTM) framework to address scalability challenges on Noisy Intermediate-Scale Quantum (NISQ) devices. QLSTM captures long-range temporal dependencies, while a distributed architecture partitions the underlying Variational Quantum Circuits into smaller, manageable subcircuits. We demonstrate that the distributed QLSTM achieves stable convergence and improved training dynamics compared to classical approaches.
arXiv Detail & Related papers (2025-03-18T10:07:34Z) - Programming Variational Quantum Circuits with Quantum-Train Agent [3.360429911727189]
The Quantum-Train Quantum Fast Weight Programmer (QT-QFWP) framework is proposed, which facilitates the efficient and scalable programming of variational quantum circuits (VQCs). This approach offers a significant advantage over conventional hybrid quantum-classical models by optimizing both quantum and classical parameter management. QT-QFWP outperforms related models in both efficiency and predictive accuracy, providing a pathway toward more practical and cost-effective quantum machine learning applications.
arXiv Detail & Related papers (2024-12-02T06:26:09Z) - Efficient Learning for Linear Properties of Bounded-Gate Quantum Circuits [63.733312560668274]
Given a quantum circuit containing d tunable RZ gates and G-d Clifford gates, can a learner perform purely classical inference to efficiently predict its linear properties?
We prove that the sample complexity scaling linearly in d is necessary and sufficient to achieve a small prediction error, while the corresponding computational complexity may scale exponentially in d.
We devise a kernel-based learning model capable of trading off prediction error and computational complexity, transitioning from exponential to polynomial scaling in many practical settings.
arXiv Detail & Related papers (2024-08-22T08:21:28Z) - A Quantum-Classical Collaborative Training Architecture Based on Quantum State Fidelity [50.387179833629254]
We introduce a collaborative classical-quantum architecture called co-TenQu.
Co-TenQu enhances a classical deep neural network by up to 41.72% in a fair setting.
It outperforms other quantum-based methods by up to 1.9 times and achieves similar accuracy while utilizing 70.59% fewer qubits.
arXiv Detail & Related papers (2024-02-23T14:09:41Z) - Pre-training Tensor-Train Networks Facilitates Machine Learning with Variational Quantum Circuits [70.97518416003358]
Variational quantum circuits (VQCs) hold promise for quantum machine learning on noisy intermediate-scale quantum (NISQ) devices.
While tensor-train networks (TTNs) can enhance VQC representation and generalization, the resulting hybrid model, TTN-VQC, faces optimization challenges due to the Polyak-Lojasiewicz (PL) condition.
To mitigate this challenge, we introduce Pre+TTN-VQC, a pre-trained TTN model combined with a VQC.
arXiv Detail & Related papers (2023-05-18T03:08:18Z) - Synergy Between Quantum Circuits and Tensor Networks: Short-cutting the Race to Practical Quantum Advantage [43.3054117987806]
We introduce a scalable procedure for harnessing classical computing resources to provide pre-optimized initializations for quantum circuits.
We show this method significantly improves the trainability and performance of PQCs on a variety of problems.
By demonstrating a means of boosting limited quantum resources using classical computers, our approach illustrates the promise of this synergy between quantum and quantum-inspired models in quantum computing.
arXiv Detail & Related papers (2022-08-29T15:24:03Z) - Recent Advances for Quantum Neural Networks in Generative Learning [98.88205308106778]
Quantum generative learning models (QGLMs) may surpass their classical counterparts.
We review the current progress of QGLMs from the perspective of machine learning.
We discuss the potential applications of QGLMs in both conventional machine learning tasks and quantum physics.
arXiv Detail & Related papers (2022-06-07T07:32:57Z) - Simulating the Mott transition on a noisy digital quantum computer via Cartan-based fast-forwarding circuits [62.73367618671969]
Dynamical mean-field theory (DMFT) maps the local Green's function of the Hubbard model to that of the Anderson impurity model.
Quantum and hybrid quantum-classical algorithms have been proposed to efficiently solve impurity models.
This work presents the first computation of the Mott phase transition using noisy digital quantum hardware.
arXiv Detail & Related papers (2021-12-10T17:32:15Z) - Quantum algorithms for quantum dynamics: A performance study on the spin-boson model [68.8204255655161]
Quantum algorithms for quantum dynamics simulations are traditionally based on implementing a Trotter-approximation of the time-evolution operator.
Variational quantum algorithms have become an indispensable alternative, enabling small-scale simulations on present-day hardware.
We show that, despite providing a clear reduction of quantum gate cost, the variational method in its current implementation is unlikely to lead to a quantum advantage.
arXiv Detail & Related papers (2021-08-09T18:00:05Z) - Tensor Network Quantum Virtual Machine for Simulating Quantum Circuits at Exascale [57.84751206630535]
We present a modernized version of the Tensor Network Quantum Virtual Machine (TNQVM), which serves as a quantum circuit simulation backend in the eXtreme-scale ACCelerator (XACC) framework.
The new version is based on the general-purpose, scalable tensor network processing library ExaTN, and provides multiple quantum circuit simulators.
By combining the portable XACC quantum processors and the scalable ExaTN backend we introduce an end-to-end virtual development environment which can scale from laptops to future exascale platforms.
arXiv Detail & Related papers (2021-04-21T13:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.