Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention
- URL: http://arxiv.org/abs/2403.14753v2
- Date: Wed, 05 Feb 2025 16:56:04 GMT
- Title: Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention
- Authors: Ethan N. Evans, Matthew Cook, Zachary P. Bradshaw, Margarite L. LaBorde
- Abstract summary: We present a variational quantum circuit architecture named Self-Attention Sequential Quantum Transformer Channel (SASQuaTCh).
Our approach leverages recent insights from kernel-based operator learning in the context of predicting spatiotemporal systems to represent deep layers of a vision transformer network using simple gate operations and a set of multi-dimensional quantum Fourier transforms.
To validate our approach, we consider image classification tasks in simulation and with hardware, where with only 9 qubits and a handful of parameters we are able to simultaneously embed and classify a grayscale image of handwritten digits with high accuracy.
- Abstract: The recent explosive growth in the size of state-of-the-art machine learning models highlights a well-known issue: parameter counts, which have grown to trillions as in the case of the Generative Pre-trained Transformer (GPT), lead to training time and memory requirements that limit their advancement in the near term. The predominant models use the so-called transformer network and have a large field of applicability, including predicting text and images, classification, and even predicting solutions to the dynamics of physical systems. Here we present a variational quantum circuit architecture named Self-Attention Sequential Quantum Transformer Channel (SASQuaTCh), which builds networks of qubits that perform operations analogous to those of the transformer network, namely the keystone self-attention operation, and leads to an exponential improvement in parameter complexity and run-time complexity over its classical counterpart. Our approach leverages recent insights from kernel-based operator learning in the context of predicting spatiotemporal systems to represent deep layers of a vision transformer network using simple gate operations and a set of multi-dimensional quantum Fourier transforms. To validate our approach, we consider image classification tasks in simulation and with hardware, where with only 9 qubits and a handful of parameters we are able to simultaneously embed and classify a grayscale image of handwritten digits with high accuracy.
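To make the Fourier-space reading of kernel-based self-attention concrete, the following is a minimal PennyLane sketch, assuming an angle embedding and a single learned RZ phase per qubit as the "kernel"; it illustrates only the QFT / learned-phase / inverse-QFT pattern and is not the paper's exact ansatz.

```python
# Minimal sketch of a kernel-based self-attention layer as a Fourier-space
# convolution: QFT -> learned diagonal phases -> inverse QFT. Gate choices,
# the embedding, and the readout are illustrative assumptions.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 9  # the abstract reports 9 qubits for the handwritten-digit task

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def sasquatch_like_layer(pixels, phases):
    # Embed a downsampled grayscale image as rotation angles (an assumed
    # embedding; the paper's exact scheme may differ).
    qml.AngleEmbedding(pixels, wires=range(n_qubits), rotation="Y")
    # Move to Fourier space.
    qml.QFT(wires=range(n_qubits))
    # Learned diagonal operator in Fourier space -- the convolution kernel
    # that plays the role of self-attention; here one RZ phase per qubit.
    for w in range(n_qubits):
        qml.RZ(phases[w], wires=w)
    # Return to the computational basis.
    qml.adjoint(qml.QFT)(wires=range(n_qubits))
    # Read out a single qubit as a class score.
    return qml.expval(qml.PauliZ(0))

pixels = np.random.uniform(0, np.pi, n_qubits)
phases = np.array(np.random.uniform(0, 2 * np.pi, n_qubits), requires_grad=True)
print(sasquatch_like_layer(pixels, phases))
```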
Related papers
- UDiTQC: U-Net-Style Diffusion Transformer for Quantum Circuit Synthesis [13.380226276791818]
Current diffusion model approaches based on U-Net architectures, while promising, encounter challenges related to computational efficiency and modeling global context.
We propose UDiT, a novel U-Net-style Diffusion Transformer architecture, which combines U-Net's strengths in multi-scale feature extraction with the Transformer's ability to model global context.
arXiv Detail & Related papers (2025-01-24T15:15:50Z)
- Quantum Convolutional Neural Network with Flexible Stride [7.362858964229726]
We propose a novel quantum convolutional neural network algorithm.
It can flexibly adjust the stride to accommodate different tasks.
It can achieve exponential acceleration of data scale in less memory compared with its classical counterpart.
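For reference, the classical stride arithmetic that any flexible-stride convolution, quantum or not, has to reproduce is simple to state; the snippet below is only this bookkeeping, not the quantum algorithm itself.

```python
# Output length of a 1-D convolution with input length n, kernel width k,
# stride s: floor((n - k) / s) + 1. Adjusting s trades resolution for cost.
def conv_output_len(n: int, k: int, s: int) -> int:
    return (n - k) // s + 1

for s in (1, 2, 3):
    # e.g. a 28-pixel row with a 3-wide kernel -> 26, 13, 9 outputs
    print(s, conv_output_len(28, 3, s))
```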
arXiv Detail & Related papers (2024-12-01T02:37:06Z)
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
Self-attention mechanism in Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
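A minimal PyTorch sketch of the general idea, assuming a single GRU stands in for the paper's pyramidal recurrent embedding (PRE): the recurrent pass already encodes temporal order, so no positional embedding is added before the Transformer encoder.

```python
# Hedged sketch: a recurrent embedding replaces additive positional embeddings
# ahead of a standard Transformer encoder. The PRE's multi-resolution
# (pyramidal) structure is omitted here.
import torch
import torch.nn as nn

class RecurrentEmbed(nn.Module):
    def __init__(self, n_vars: int, d_model: int):
        super().__init__()
        # The GRU's hidden states carry temporal order implicitly.
        self.gru = nn.GRU(input_size=n_vars, hidden_size=d_model, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, n_vars)
        out, _ = self.gru(x)
        return out  # (batch, time, d_model), order already encoded

d_model = 64
embed = RecurrentEmbed(n_vars=7, d_model=d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
x = torch.randn(8, 96, 7)   # 8 series, 96 time steps, 7 variables
h = encoder(embed(x))       # no positional embedding needed
print(h.shape)              # torch.Size([8, 96, 64])
```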
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- Todyformer: Towards Holistic Dynamic Graph Transformers with Structure-Aware Tokenization [6.799413002613627]
Todyformer is a novel Transformer-based neural network tailored for dynamic graphs.
It unifies the local encoding capacity of Message-Passing Neural Networks (MPNNs) with the global encoding of Transformers.
We show that Todyformer consistently outperforms the state-of-the-art methods for downstream tasks.
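A rough PyTorch sketch of the local/global split, assuming a hand-rolled adjacency-matrix message-passing step in place of a full MPNN and ignoring Todyformer's tokenization and dynamic-graph machinery:

```python
# One local message-passing update (neighbour aggregation via adjacency
# matmul) followed by global self-attention over all node tokens.
import torch
import torch.nn as nn

class LocalGlobalBlock(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        self.local = nn.Linear(d, d)                        # MPNN-style update
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Local: aggregate neighbour features, then transform.
        x = torch.relu(self.local(adj @ x))
        # Global: every node attends to every other node.
        out, _ = self.attn(x.unsqueeze(0), x.unsqueeze(0), x.unsqueeze(0))
        return out.squeeze(0)

n, d = 10, 32
adj = (torch.rand(n, n) < 0.3).float()      # random toy graph
x = torch.randn(n, d)
print(LocalGlobalBlock(d)(x, adj).shape)    # torch.Size([10, 32])
```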
arXiv Detail & Related papers (2024-02-02T23:05:30Z)
- Transformer variational wave functions for frustrated quantum spin systems [0.0]
We propose an adaptation of the ViT architecture with complex parameters to define a new class of variational neural-network states.
The success of the ViT wave function relies on mixing both local and global operations.
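A toy NumPy sketch of the underlying idea, assuming a single complex-valued layer in place of the actual ViT: complex parameters let one network represent both the amplitude and the phase of the variational wave function.

```python
# Toy complex-parameter variational state: map a spin configuration s to a
# complex log-amplitude log psi(s). The real ViT wave function uses attention
# layers; one complex linear layer stands in here for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_spins, d = 16, 8
W = rng.normal(size=(d, n_spins)) + 1j * rng.normal(size=(d, n_spins))
b = rng.normal(size=d) + 1j * rng.normal(size=d)

def log_psi(spins: np.ndarray) -> complex:
    # Complex weights encode amplitude and phase in a single network.
    h = np.tanh(W @ spins + b)
    return np.sum(h)

s = rng.choice([-1.0, 1.0], size=n_spins)
amp = np.exp(log_psi(s))
print(abs(amp), np.angle(amp))  # modulus and phase of psi(s)
```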
arXiv Detail & Related papers (2022-11-10T11:56:44Z)
- Vision Transformer with Convolutions Architecture Search [72.70461709267497]
We propose an architecture search method, Vision Transformer with Convolutions Architecture Search (VTCAS).
The high-performance backbone network searched by VTCAS introduces the desirable features of convolutional neural networks into the Transformer architecture.
It enhances the robustness of the neural network for object recognition, especially in the low illumination indoor scene.
arXiv Detail & Related papers (2022-03-20T02:59:51Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
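A minimal PyTorch sketch of the patch pipeline described here, with the saliency heads and RGB-D fusion omitted: patches become tokens, and the encoder propagates global context among them.

```python
# Patchify an image, project patches to tokens, and let a Transformer
# encoder attend globally across all patch tokens.
import torch
import torch.nn as nn

patch, d_model = 16, 64
img = torch.randn(1, 3, 224, 224)

# (1, 3, 224, 224) -> (1, n_patches, 3*patch*patch) via unfold
tokens = nn.functional.unfold(img, kernel_size=patch, stride=patch).transpose(1, 2)
proj = nn.Linear(3 * patch * patch, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
ctx = encoder(proj(tokens))  # each patch token attends to every other patch
print(ctx.shape)             # torch.Size([1, 196, 64])
```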
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- Generation of High-Resolution Handwritten Digits with an Ion-Trap Quantum Computer [55.41644538483948]
We implement a quantum-circuit based generative model to learn and sample the prior distribution of a Generative Adversarial Network.
We train this hybrid algorithm on an ion-trap device based on $^{171}$Yb$^{+}$ ion qubits to generate high-quality images.
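A hedged PennyLane/PyTorch sketch of the hybrid pattern, with circuit shape and generator sizes chosen for illustration: a sampled parameterized circuit supplies the latent prior that a classical generator consumes.

```python
# A small parameterized circuit is sampled to produce bitstrings that replace
# the usual Gaussian latent prior of a GAN generator. Shapes are illustrative.
import pennylane as qml
import torch
import torch.nn as nn

n_latent = 4
dev = qml.device("default.qubit", wires=n_latent, shots=1)

@qml.qnode(dev)
def prior(theta):
    for w in range(n_latent):
        qml.RY(theta[w], wires=w)       # trainable prior distribution
    for w in range(n_latent - 1):
        qml.CNOT(wires=[w, w + 1])      # entangle the latent register
    return qml.sample()                 # one bitstring per call

generator = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(), nn.Linear(64, 784))
theta = [0.3, 1.1, 2.0, 0.7]
z = torch.tensor(prior(theta), dtype=torch.float32)  # quantum-sampled latent
fake_image = generator(z).reshape(28, 28)
print(fake_image.shape)  # torch.Size([28, 28])
```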
arXiv Detail & Related papers (2020-12-07T18:51:28Z)
- Recurrent Quantum Neural Networks [7.6146285961466]
Recurrent neural networks are the foundation of many sequence-to-sequence models in machine learning.
We construct a quantum recurrent neural network (QRNN) with demonstrable performance on non-trivial tasks.
We evaluate the QRNN on MNIST classification, both by feeding the QRNN each image pixel by pixel and by utilising modern data augmentation as a preprocessing step.
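A minimal PennyLane sketch of the pixel-by-pixel idea, assuming a simple data re-uploading loop as the recurrent cell; the actual QRNN cell is more structured.

```python
# One work register is updated once per pixel by a parameterized block,
# playing the role of a recurrent cell's hidden-state update.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qrnn(pixels, weights):
    for x in pixels:                          # one recurrent step per pixel
        for w in range(n_qubits):
            qml.RY(x * weights[w], wires=w)   # inject the pixel value
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])        # mix the "hidden state"
    return qml.expval(qml.PauliZ(0))          # readout for classification

pixels = np.random.uniform(0, np.pi, 16)      # a toy 4x4 image, flattened
weights = np.random.uniform(0, 1, n_qubits)
print(qrnn(pixels, weights))
```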
arXiv Detail & Related papers (2020-06-25T17:59:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.