Quantum Transformer: Accelerating model inference via quantum linear algebra
- URL: http://arxiv.org/abs/2402.16714v3
- Date: Wed, 29 Oct 2025 14:48:21 GMT
- Title: Quantum Transformer: Accelerating model inference via quantum linear algebra
- Authors: Naixu Guo, Zhan Yu, Matthew Choi, Yizhan Han, Aman Agrawal, Kouhei Nakaji, Alán Aspuru-Guzik, Patrick Rebentrost
- Abstract summary: We develop quantum subroutines to construct the building blocks in the transformer. We show how to efficiently implement the Hadamard product on quantum computers. We find that the matrix norm of the input sequence plays a dominant role in the quantum complexity.
- Score: 8.777641765295398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Powerful generative artificial intelligence from large language models (LLMs) harnesses extensive computational resources for inference. In this work, we investigate the transformer architecture, a key component of these models, under the lens of fault-tolerant quantum computing. We develop quantum subroutines to construct the building blocks in the transformer, including the self-attention, residual connection with layer normalization, and feed-forward network. As an important subroutine, we show how to efficiently implement the Hadamard product and element-wise functions of matrices on quantum computers. Our algorithm prepares an amplitude encoding of the transformer output, which can be measured for prediction or use in the next layer. We find that the matrix norm of the input sequence plays a dominant role in the quantum complexity. With numerical experiments on open-source LLMs, including for bio-informatics applications, we demonstrate the potential of a quantum speedup for transformer inference in practical regimes.
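For orientation, here is a minimal classical NumPy sketch of the building blocks the abstract names (self-attention, residual connection with layer normalization, feed-forward network) together with the Hadamard product subroutine; the toy dimensions and random weights are illustrative assumptions, and the quantum circuits themselves are not modeled here.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # n x n query-key dot products
    return softmax(scores) @ V

def feed_forward(X, W1, W2):
    return np.maximum(X @ W1, 0.0) @ W2      # ReLU, an element-wise function

def transformer_block(X, Wq, Wk, Wv, W1, W2):
    # Residual connection with layer normalization around each sublayer.
    H = layer_norm(X + self_attention(X, Wq, Wk, Wv))
    return layer_norm(H + feed_forward(H, W1, W2))

rng = np.random.default_rng(0)
n, d = 4, 8                                  # toy sequence length and model dimension
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
W1, W2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))

# Hadamard (element-wise) product of two matrices, the subroutine the paper
# implements on quantum computers via its element-wise-function techniques:
A, B = rng.normal(size=(n, d)), rng.normal(size=(n, d))
had = A * B

print(transformer_block(X, Wq, Wk, Wv, W1, W2).shape)  # -> (4, 8)
```

The quantum algorithm prepares an amplitude encoding of the output of `transformer_block`; this sketch only fixes the target computation, not its quantum implementation.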
Related papers
- Vectorized Attention with Learnable Encoding for Quantum Transformer [0.6766416093990318]
We propose the Vectorized Quantum Transformer (VQT), a model that supports ideal masked attention matrix computation. Our noisy intermediate-scale quantum (NISQ)-friendly VQT approach unlocks a novel architecture for end-to-end machine learning in quantum computing.
arXiv Detail & Related papers (2025-08-25T20:33:14Z) - Quantum-Efficient Convolution through Sparse Matrix Encoding and Low-Depth Inner Product Circuits [0.0]
We present a resource-efficient quantum algorithm that reformulates the convolution product as a structured matrix multiplication. We construct a quantum framework wherein sparse input patches are prepared using optimized key-value QRAM state encoding. Our architecture supports batched convolution across multiple filters using a generalized SWAP circuit.
arXiv Detail & Related papers (2025-07-25T20:08:12Z) - A Survey of Quantum Transformers: Architectures, Challenges and Outlooks [82.4736481748099]
Quantum Transformers integrate the representational power of classical Transformers with the computational advantages of quantum computing. Since 2022, research in this area has rapidly expanded, giving rise to diverse technical paradigms and early applications. This paper presents the first comprehensive, systematic, and in-depth survey of quantum Transformer models.
arXiv Detail & Related papers (2025-04-04T05:40:18Z) - HQViT: Hybrid Quantum Vision Transformer for Image Classification [48.72766405978677]
We propose a Hybrid Quantum Vision Transformer (HQViT) to accelerate model training while enhancing model performance.
HQViT introduces whole-image processing with amplitude encoding to better preserve global image information without additional positional encoding.
Experiments across various computer vision datasets demonstrate that HQViT outperforms existing models, achieving a maximum improvement of up to $10.9\%$ (on the MNIST 10-classification task) over the state of the art.
arXiv Detail & Related papers (2025-04-03T16:13:34Z) - A Hybrid Transformer Architecture with a Quantized Self-Attention Mechanism Applied to Molecular Generation [0.0]
We propose a hybrid quantum-classical self-attention mechanism as part of a transformer decoder.
We show that the time complexity of the query-key dot product is reduced from $\mathcal{O}(n^2 d)$ in a classical model to $\mathcal{O}(n^2 \log d)$ in our quantum model (a worked count of both figures is sketched after this list).
This work provides a promising avenue for quantum-enhanced natural language processing (NLP).
arXiv Detail & Related papers (2025-02-26T15:15:01Z) - Transformers are Efficient Compilers, Provably [11.459397066286822]
Transformer-based large language models (LLMs) have demonstrated surprisingly robust performance across a wide range of language-related tasks.
In this paper, we take the first steps towards a formal investigation of using transformers as compilers from an expressive power perspective.
We introduce a representative programming language, Mini-Husky, which encapsulates key features of modern C-like languages.
arXiv Detail & Related papers (2024-10-07T20:31:13Z) - Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained.
arXiv Detail & Related papers (2024-10-06T06:04:23Z) - Efficient Learning for Linear Properties of Bounded-Gate Quantum Circuits [62.46800898243033]
Recent progress in quantum learning theory prompts a question: can linear properties of a large-qubit circuit be efficiently learned from measurement data generated by varying classical inputs? We prove that a sample complexity scaling linearly in $d$ is required to achieve a small prediction error, while the corresponding computational complexity may scale exponentially in $d$. We propose a kernel-based method leveraging classical shadows and truncated trigonometric expansions, enabling a controllable trade-off between prediction accuracy and computational overhead.
arXiv Detail & Related papers (2024-08-22T08:21:28Z) - Universal Matrix Multiplication on Quantum Computer [12.14644252552695]
Matrix multiplication plays a crucial role in pattern recognition and machine learning. This paper introduces an innovative and practical approach to universal quantum matrix multiplication. We construct the basic universal quantum matrix multiplication and extend it to the Strassen algorithm.
arXiv Detail & Related papers (2024-08-06T10:25:02Z) - Transformers meet Neural Algorithmic Reasoners [16.5785372289558]
We propose a novel approach that combines the Transformer's language understanding with the robustness of graph neural network (GNN)-based neural algorithmic reasoners (NARs).
We evaluate our resulting TransNAR model on CLRS-Text, the text-based version of the CLRS-30 benchmark, and demonstrate significant gains over Transformer-only models for algorithmic reasoning.
arXiv Detail & Related papers (2024-06-13T16:42:06Z) - Quixer: A Quantum Transformer Model [3.140679149492808]
We present Quixer: a novel quantum transformer model.
Quixer operates by preparing a superposition of tokens and applying a trainable non-linear transformation to this mix.
We show that its parameterised components can be substituted with fixed structures to yield new classes of quantum transformers.
arXiv Detail & Related papers (2024-06-06T17:52:05Z) - Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention [0.464982780843177]
We show that quantum circuits can efficiently express a self-attention mechanism through the perspective of kernel-based operator learning.
In this work, we are able to represent deep layers of a vision transformer network using simple gate operations and a set of multi-dimensional quantum Fourier transforms.
We analyze our novel variational quantum circuit, which we call the Self-Attention Sequential Quantum Transformer Channel (SASQuaTCh), and demonstrate its utility on simplified classification problems.
arXiv Detail & Related papers (2024-03-21T18:00:04Z) - GPT on a Quantum Computer [0.0]
Large Language Models (LLMs) have transformed how we interact with and understand the capabilities of Artificial Intelligence (AI).
This paper outlines a framework for implementing the foundational Transformer architecture -- integral to ChatGPT -- within a quantum computing paradigm.
We aspire to open new avenues for research in Quantum Machine Learning (QML) and contribute to the ongoing evolution of AI technologies.
arXiv Detail & Related papers (2024-03-14T14:07:31Z) - AlgoFormer: An Efficient Transformer Framework with Algorithmic Structures [80.28359222380733]
We design a novel transformer framework, dubbed AlgoFormer, to empower transformers with algorithmic capabilities.
In particular, inspired by the structure of human-designed learning algorithms, our transformer framework consists of a pre-transformer that is responsible for task preprocessing.
Some theoretical and empirical results are presented to show that the designed transformer has the potential to perform algorithm representation and learning.
arXiv Detail & Related papers (2024-02-21T07:07:54Z) - Quantum circuit synthesis with diffusion models [0.6554326244334868]
We use generative machine learning models, specifically denoising diffusion models (DMs), to facilitate this transformation.
We steer the model to produce desired quantum operations within gate-based quantum circuits.
We envision DMs as pivotal in quantum circuit synthesis, enhancing both practical applications and insights into theoretical quantum computation.
arXiv Detail & Related papers (2023-11-03T17:17:08Z) - On the Convergence of Encoder-only Shallow Transformers [62.639819460956176]
We build the global convergence theory of encoder-only shallow Transformers under a realistic setting.
Our results can pave the way for a better understanding of modern Transformers, particularly on training dynamics.
arXiv Detail & Related papers (2023-11-02T20:03:05Z) - Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z) - Ground state preparation and energy estimation on early fault-tolerant
quantum computers via quantum eigenvalue transformation of unitary matrices [3.1952399274829775]
We develop a tool called quantum eigenvalue transformation of unitary matrices with real polynomials (QET-U).
This leads to a simple quantum algorithm that outperforms all previous algorithms with a comparable circuit structure for estimating the ground state energy.
We demonstrate the performance of the algorithm using IBM Qiskit for the transverse field Ising model.
arXiv Detail & Related papers (2022-04-12T17:11:40Z) - A quantum processor based on coherent transport of entangled atom arrays [44.62475518267084]
We show a quantum processor with dynamic, nonlocal connectivity, in which entangled qubits are coherently transported in a highly parallel manner.
We use this architecture to realize programmable generation of entangled graph states such as cluster states and a 7-qubit Steane code state.
arXiv Detail & Related papers (2021-12-07T19:00:00Z) - Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
arXiv Detail & Related papers (2021-08-31T19:39:55Z) - Quantum-tailored machine-learning characterization of a superconducting qubit [50.591267188664666]
We develop an approach to characterize the dynamics of a quantum device and learn device parameters.
This approach outperforms physics-agnostic recurrent neural networks trained on numerically generated and experimental data.
This demonstration shows how leveraging domain knowledge improves the accuracy and efficiency of this characterization task.
arXiv Detail & Related papers (2021-06-24T15:58:57Z) - Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
arXiv Detail & Related papers (2021-06-13T13:04:46Z) - Tensor Network Quantum Virtual Machine for Simulating Quantum Circuits at Exascale [57.84751206630535]
We present a modernized version of the Tensor Network Quantum Virtual Machine (TNQVM), which serves as a quantum circuit simulation backend in the eXtreme-scale ACCelerator (XACC) framework.
The new version is based on the general-purpose, scalable tensor network processing library ExaTN and provides multiple quantum circuit simulators.
By combining the portable XACC quantum programming frontend and the scalable ExaTN backend, we introduce an end-to-end virtual development environment which can scale from laptops to future exascale platforms.
arXiv Detail & Related papers (2021-04-21T13:26:42Z) - Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - The Hintons in your Neural Network: a Quantum Field Theory View of Deep Learning [84.33745072274942]
We show how to represent linear and non-linear layers as unitary quantum gates, and interpret the fundamental excitations of the quantum model as particles.
On top of opening a new perspective and techniques for studying neural networks, the quantum formulation is well suited for optical quantum computing.
arXiv Detail & Related papers (2021-03-08T17:24:29Z)
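As referenced in the quantized self-attention entry above, a worked count of the query-key complexities; the classical $\mathcal{O}(n^2 d)$ figure is standard, while the quantum $\mathcal{O}(n^2 \log d)$ figure is taken here as an assumption of this sketch rather than a derived result.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Classical cost: each of the n^2 (query, key) pairs requires a
% d-dimensional inner product.
\[
  \underbrace{n^{2}}_{\text{(query, key) pairs}} \cdot
  \underbrace{d}_{\text{inner-product length}}
  \;=\; \mathcal{O}(n^{2} d).
\]
% Quantum sketch (assumption: a d-dimensional vector is encoded in
% O(log d) qubits, so each estimated inner product costs work
% logarithmic in d, while the n^2 pair count is unchanged).
\[
  n^{2} \cdot \mathcal{O}(\log d) \;=\; \mathcal{O}\!\left(n^{2} \log d\right).
\]
\end{document}
```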