Quantum Large Language Model Fine-Tuning
- URL: http://arxiv.org/abs/2504.08732v1
- Date: Fri, 11 Apr 2025 17:57:35 GMT
- Title: Quantum Large Language Model Fine-Tuning
- Authors: Sang Hyub Kim, Jonathan Mei, Claudio Girotto, Masako Yamada, Martin Roetteler
- Abstract summary: We introduce a hybrid quantum-classical deep learning architecture for large language model fine-tuning. The classical portion of the architecture is a sentence transformer that is powerful enough to display significant accuracy for complex tasks such as sentiment prediction. We show an overall improvement in prediction accuracy over a comparable classical baseline, with a trend of increasing accuracy with the number of qubits.
- Score: 1.118478900782898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a hybrid quantum-classical deep learning architecture for large language model fine-tuning. The classical portion of the architecture is a sentence transformer that is powerful enough to display significant accuracy for complex tasks such as sentiment prediction. The quantum portion of the architecture consists of parameterized quantum circuits that utilize long-range connections between qubits. We analyze the performance of the hybrid models for various settings of hyperparameters, including the number of qubits, the depth of the quantum circuits, the learning rate, the number of re-uploading steps, etc. Based on a screening study of main effects, we show an overall improvement in prediction accuracy over a comparable classical baseline, with a trend of increasing accuracy with the number of qubits. We observe up to $3.14\%$ improvements in accuracy over classical architectures of comparable model size, within the set of hyperparameters probed in this study. We demonstrate the contribution of each module in our architecture through ablation studies. Our studies are based on finite shot counts and include simulations based on noisy quantum gates.
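The pipeline described above lends itself to a compact sketch. Below is a minimal, hypothetical PennyLane/PyTorch rendering, assuming a pre-computed sentence-transformer embedding (384-dimensional here) compressed to rotation angles, a data re-uploading circuit whose entangling gates skip a neighbor as a crude stand-in for the paper's long-range connections, and a classical readout; all sizes, gate choices, and hyperparameters are illustrative, not the authors' exact design.

```python
import math

import pennylane as qml
import torch
from torch import nn

N_QUBITS, N_UPLOADS = 4, 3
dev = qml.device("default.qubit", wires=N_QUBITS)

@qml.qnode(dev, interface="torch")
def circuit(inputs, weights):
    # Each re-uploading step re-encodes the features, then applies a
    # variational block; ranges=[2] makes the entangling gates skip a
    # neighbor, a simple stand-in for long-range qubit connections.
    for block in weights:
        qml.AngleEmbedding(inputs, wires=range(N_QUBITS))
        qml.StronglyEntanglingLayers(block, wires=range(N_QUBITS), ranges=[2])
    return [qml.expval(qml.PauliZ(w)) for w in range(N_QUBITS)]

class HybridClassifier(nn.Module):
    def __init__(self, embed_dim=384, n_classes=2):
        super().__init__()
        self.compress = nn.Linear(embed_dim, N_QUBITS)   # embedding -> angles
        self.quantum = qml.qnn.TorchLayer(
            circuit, weight_shapes={"weights": (N_UPLOADS, 1, N_QUBITS, 3)})
        self.readout = nn.Linear(N_QUBITS, n_classes)    # expvals -> logits

    def forward(self, sentence_embedding):
        angles = math.pi * torch.tanh(self.compress(sentence_embedding))
        return self.readout(self.quantum(angles).float())

model = HybridClassifier()
logits = model(torch.randn(8, 384))   # batch of pre-computed sentence embeddings
```

Swapping `default.qubit` for a finite-shot or noisy device is where the paper's shot-count and gate-noise studies would enter.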
Related papers
- An Efficient Quantum Classifier Based on Hamiltonian Representations [50.467930253994155]
Quantum machine learning (QML) is a discipline that seeks to transfer the advantages of quantum computing to data-driven tasks.
We propose an efficient approach that circumvents the costs associated with data encoding by mapping inputs to a finite set of Pauli strings.
We evaluate our approach on text and image classification tasks against well-established classical and quantum models.
arXiv Detail & Related papers (2025-04-13T11:49:53Z)
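A hedged sketch of the encoding idea, in plain NumPy: each input feature becomes the coefficient of a fixed Pauli string, so no per-sample state-preparation circuit is needed. The Pauli-string vocabulary and the trainable state below are made-up illustrations; the cited construction differs in detail.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def pauli_string(ops):
    m = np.array([[1.0]], dtype=complex)
    for op in ops:
        m = np.kron(m, op)
    return m

# Fixed "vocabulary" of Pauli strings on 2 qubits, one per input feature.
STRINGS = [pauli_string(s) for s in ([Z, I2], [I2, Z], [X, X], [Z, Z])]

def score(x, psi):
    """<psi| H(x) |psi> with H(x) = sum_j x_j P_j, i.e. linear in the input x."""
    H = sum(xj * P for xj, P in zip(x, STRINGS))
    return float(np.real(psi.conj() @ H @ psi))

psi = np.ones(4, dtype=complex) / 2          # illustrative trainable state |+>|+>
print(score(np.array([0.5, -0.2, 0.1, 0.7]), psi))   # 0.1 for this state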
- Near-Term Spin-Qubit Architecture Design via Multipartite Maximally-Entangled States [1.589509357008938]
We introduce four metrics which ascertain the quality of genuine multipartite quantum entanglement, along with circuit-level fidelity measures. We devise simulations which combine expected hardware characteristics of spin-qubit devices with appropriate compilation techniques. We find that sparsely-connected spin-qubit lattices can approach values of our metrics comparable to those of the most highly-connected device architecture.
arXiv Detail & Related papers (2024-12-17T12:55:40Z)
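For flavor, the sketch below computes one standard quality measure of this kind, fidelity with the n-qubit GHZ state (a value above 1/2 witnesses genuine multipartite entanglement); the paper's four metrics are more refined, so this is only an assumed, minimal example.

```python
import numpy as np

def ghz_fidelity(rho, n):
    """Fidelity of a density matrix with the n-qubit GHZ state."""
    ghz = np.zeros(2 ** n, dtype=complex)
    ghz[0] = ghz[-1] = 1 / np.sqrt(2)        # (|0...0> + |1...1>) / sqrt(2)
    return float(np.real(ghz.conj() @ rho @ ghz))

n = 3
ghz = np.zeros(2 ** n, dtype=complex)
ghz[0] = ghz[-1] = 1 / np.sqrt(2)
rho_ideal = np.outer(ghz, ghz.conj())
# Global depolarizing noise, as a crude model of an imperfect device.
rho_noisy = 0.9 * rho_ideal + 0.1 * np.eye(2 ** n) / 2 ** n
print(ghz_fidelity(rho_noisy, n))            # 0.9125; > 0.5 still witnesses GME
```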
- Efficient Learning for Linear Properties of Bounded-Gate Quantum Circuits [63.733312560668274]
Given a quantum circuit containing d tunable RZ gates and G - d Clifford gates, can a learner perform purely classical inference to efficiently predict its linear properties?
We prove that the sample complexity scaling linearly in d is necessary and sufficient to achieve a small prediction error, while the corresponding computational complexity may scale exponentially in d.
We devise a kernel-based learning model capable of trading off prediction error and computational complexity, transitioning from exponential to polynomial scaling in many practical settings.
arXiv Detail & Related papers (2024-08-22T08:21:28Z)
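A hedged illustration of the kernel-based idea: regress a measured linear property on the circuit's d RZ angles with a smooth product kernel, so that fitting reduces to ordinary kernel ridge regression. The trigonometric kernel and synthetic target below are assumptions for illustration, not the paper's construction.

```python
import numpy as np

def trig_kernel(a, b):
    # PSD product kernel over angles, 2*pi-periodic in every RZ parameter
    return np.prod(np.cos((a - b) / 2) ** 2, axis=-1)

rng = np.random.default_rng(0)
d, n_train = 5, 200
thetas = rng.uniform(0, 2 * np.pi, (n_train, d))      # training circuits' angles
y = np.cos(thetas).sum(axis=1) + 0.05 * rng.normal(size=n_train)  # toy property

K = trig_kernel(thetas[:, None, :], thetas[None, :, :])
alpha = np.linalg.solve(K + 1e-3 * np.eye(n_train), y)  # kernel ridge regression

theta_new = rng.uniform(0, 2 * np.pi, d)
prediction = trig_kernel(theta_new[None, :], thetas) @ alpha
```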
- Quantum Vision Transformers for Quark-Gluon Classification [3.350407101925898]
We introduce a hybrid quantum-classical vision transformer architecture, notable for its integration of variational quantum circuits.
We evaluate our method by applying the model to multi-detector jet images from CMS Open Data.
arXiv Detail & Related papers (2024-05-16T17:45:54Z)
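One plausible way to wire variational circuits into a transformer block, sketched below with PennyLane and PyTorch: the feed-forward sublayer is replaced by a per-token quantum circuit. The cited architecture's exact circuit placement and design may differ, and all dimensions are illustrative.

```python
import pennylane as qml
import torch
from torch import nn

N_Q = 4
dev = qml.device("default.qubit", wires=N_Q)

@qml.qnode(dev, interface="torch")
def vqc(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_Q))
    qml.StronglyEntanglingLayers(weights, wires=range(N_Q))
    return [qml.expval(qml.PauliZ(w)) for w in range(N_Q)]

class HybridBlock(nn.Module):
    """Transformer block whose feed-forward sublayer is a variational circuit."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.down = nn.Linear(dim, N_Q)                # token features -> angles
        self.vqc = qml.qnn.TorchLayer(vqc, weight_shapes={"weights": (2, N_Q, 3)})
        self.up = nn.Linear(N_Q, dim)                  # expvals -> token features

    def forward(self, x):                              # x: (batch, tokens, dim)
        h = self.norm1(x)
        x = x + self.attn(h, h, h)[0]
        b, t, _ = x.shape
        q = self.vqc(self.down(self.norm2(x)).reshape(b * t, N_Q))
        return x + self.up(q.float().reshape(b, t, N_Q))

out = HybridBlock()(torch.randn(2, 16, 64))            # 2 images, 16 patch tokens
```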
- Disentangling Quantum and Classical Contributions in Hybrid Quantum Machine Learning Architectures [4.646930308096446]
Hybrid transfer learning solutions have been developed, merging pre-trained classical models with quantum circuits.
It remains unclear how much each component -- classical and quantum -- contributes to the model's results.
We propose a novel hybrid architecture: instead of utilizing a pre-trained network for compression, we employ an autoencoder to derive a compressed version of the input data.
arXiv Detail & Related papers (2023-11-09T18:13:50Z)
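A minimal PyTorch sketch of that design choice: the compressor is an autoencoder trained on reconstruction alone, so the classical front end contributes no task-specific pre-trained features. The dimensions and the downstream quantum layer are assumptions.

```python
import torch
from torch import nn

LATENT = 4   # one feature per qubit in the downstream circuit

class CompressingAE(nn.Module):
    def __init__(self, in_dim=784):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, LATENT))
        self.dec = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(),
                                 nn.Linear(64, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

ae = CompressingAE()
x = torch.randn(8, 784)                       # e.g. flattened images
recon, z = ae(x)
loss = nn.functional.mse_loss(recon, x)       # reconstruction-only objective
# After training, freeze ae.enc and feed z to a PQC (see earlier sketches), so
# any task-specific learning in the quantum part is cleanly attributable to it.
```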
- Pre-training Tensor-Train Networks Facilitates Machine Learning with Variational Quantum Circuits [70.97518416003358]
Variational quantum circuits (VQCs) hold promise for quantum machine learning on noisy intermediate-scale quantum (NISQ) devices.
While tensor-train networks (TTNs) can enhance VQC representation and generalization, the resulting hybrid model, TTN-VQC, faces optimization challenges due to the Polyak-Lojasiewicz (PL) condition.
To mitigate this challenge, we introduce Pre+TTN-VQC, a pre-trained TTN model combined with a VQC.
arXiv Detail & Related papers (2023-05-18T03:08:18Z)
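A hedged sketch of the ingredient being pre-trained: a tensor-train-style factorized layer with two cores, which would be fit to the task first and then frozen in front of a VQC. The actual TTN and training protocol in the cited work are more elaborate.

```python
import torch
from torch import nn

class TTLinear(nn.Module):
    """Two-core, tensor-train-style factorization of a 64 x 16 weight matrix."""
    def __init__(self, m=(8, 8), n=(4, 4), rank=3):
        super().__init__()
        self.core1 = nn.Parameter(torch.randn(m[0], n[0], rank) * 0.1)
        self.core2 = nn.Parameter(torch.randn(rank, m[1], n[1]) * 0.1)
        self.m, self.n = m, n

    def forward(self, x):                     # x: (batch, m[0] * m[1])
        b = x.shape[0]
        x = x.reshape(b, self.m[0], self.m[1])
        # Contract both input indices with the cores through the shared rank.
        y = torch.einsum("bij,iar,rjc->bac", x, self.core1, self.core2)
        return y.reshape(b, self.n[0] * self.n[1])

tt = TTLinear()
features = tt(torch.randn(5, 64))             # (5, 16): pre-train this on labels,
for p in tt.parameters():                     # then freeze it and stack a VQC
    p.requires_grad_(False)                   # head on its 16-dim output
```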
- Majorization-based benchmark of the complexity of quantum processors [105.54048699217668]
We numerically simulate and characterize the operation of various quantum processors.
We identify and assess quantum complexity by comparing the performance of each device against benchmark lines.
We find that the majorization-based benchmark holds as long as the circuits' output states have, on average, high purity.
arXiv Detail & Related papers (2023-04-10T23:01:10Z)
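The majorization criterion itself is simple to state in code: a distribution p majorizes q when every partial sum of p's decreasingly sorted probabilities dominates q's. A minimal sketch, with made-up distributions standing in for circuit outputs:

```python
import numpy as np

def lorenz(p):
    """Partial sums of probabilities sorted in decreasing order."""
    return np.cumsum(np.sort(p)[::-1])

def majorizes(p, q, tol=1e-12):
    return bool(np.all(lorenz(p) >= lorenz(q) - tol))

uniform = np.full(8, 1 / 8)                   # flattest possible distribution
peaked = np.array([0.6, 0.2, 0.1, 0.05, 0.03, 0.01, 0.005, 0.005])
print(majorizes(peaked, uniform))             # True: everything majorizes uniform
print(majorizes(uniform, peaked))             # False
```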
- A Framework for Demonstrating Practical Quantum Advantage: Racing Quantum against Classical Generative Models [62.997667081978825]
We build on a previously proposed framework for evaluating the generalization performance of generative models.
We establish the first comparative race towards practical quantum advantage (PQA) between classical and quantum generative models.
Our results suggest that quantum circuit Born machines (QCBMs) are more efficient in the data-limited regime than the other state-of-the-art classical generative models.
arXiv Detail & Related papers (2023-03-27T22:48:28Z)
- A performance characterization of quantum generative models [35.974070202997176]
We compare quantum circuits used for quantum generative modeling.
We learn the underlying probability distribution of the data sets via two popular training methods.
We empirically find that a variant of the discrete architecture, which learns the copula of the probability distribution, outperforms all other methods.
arXiv Detail & Related papers (2023-01-23T11:00:29Z)
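The copula trick is easy to sketch: rank-transform each marginal of the training data so the model only has to learn the dependence structure on the unit hypercube. The synthetic data below are an assumption; the cited work learns this structure with quantum circuits.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
x = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=500)
# Probability integral transform per marginal: the empirical copula sample.
u = np.column_stack([rankdata(c) / (len(c) + 1) for c in x.T])
# u has (near-)uniform marginals; a generative model trained on u captures only
# the dependence structure, and its samples are mapped back to the data scale
# through each marginal's inverse CDF.
```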
- The Quantum Path Kernel: a Generalized Quantum Neural Tangent Kernel for Deep Quantum Machine Learning [52.77024349608834]
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing.
A key issue is how to address the inherent non-linearity of classical deep learning.
We introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating these aspects of deep machine learning.
arXiv Detail & Related papers (2022-12-22T16:06:24Z)
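A hedged toy of the path-kernel construction, with an ordinary differentiable function standing in for a quantum model: average the tangent kernel over parameter checkpoints saved along the training trajectory, rather than evaluating it at a single parameter vector as a plain neural tangent kernel would.

```python
import numpy as np

def f(theta, x):
    return np.sin(theta @ x)                  # toy model standing in for a QNN

def grad_f(theta, x):
    return np.cos(theta @ x) * x              # gradient of f w.r.t. theta

def path_kernel(checkpoints, x1, x2):
    """Tangent kernel averaged over parameters visited during training."""
    return float(np.mean([grad_f(t, x1) @ grad_f(t, x2) for t in checkpoints]))

rng = np.random.default_rng(0)
checkpoints = [rng.normal(size=3) for _ in range(10)]  # saved training iterates
print(path_kernel(checkpoints, np.ones(3), np.array([1.0, 0.5, -0.5])))
```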
- Copula-based Risk Aggregation with Trapped Ion Quantum Computers [1.541403735141431]
Copulas are mathematical tools for modeling joint probability distributions.
The recent finding that copulas can be expressed as maximally entangled quantum states has revealed a promising approach to practical quantum advantage.
We study the training of QCBMs with different levels of precision and circuit design on a simulator and a state-of-the-art trapped ion quantum computer.
arXiv Detail & Related papers (2022-06-23T18:39:30Z)
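For context, a classical sketch of the target task, risk aggregation through a copula: couple two loss distributions with a Gaussian copula and read off a tail risk measure. The QCBM in the cited work learns such joint structure on quantum hardware; the correlation and marginals below are made up.

```python
import numpy as np
from scipy.stats import lognorm, norm

rng = np.random.default_rng(0)
rho = 0.7                                     # dependence between the two risks
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
u = norm.cdf(z)                               # Gaussian copula: uniform marginals

loss_a = lognorm.ppf(u[:, 0], s=0.5)          # arbitrary marginal distributions
loss_b = lognorm.ppf(u[:, 1], s=0.9)
var_99 = np.quantile(loss_a + loss_b, 0.99)   # 99% value-at-risk of the total
print(var_99)
```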
- Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search [112.05977301976613]
We propose to combine Network Architecture Search methods with quantization to enjoy the merits of both sides.
We first propose the joint training of architecture and quantization with a shared step size to acquire a large number of quantized models.
Then a bit-inheritance scheme is introduced to transfer the quantized models to the lower bit, which further reduces the time cost and improves the quantization accuracy.
arXiv Detail & Related papers (2020-10-09T03:52:16Z)
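A hedged sketch of one ingredient: weight quantization with a learnable, shared step size and a straight-through estimator, where moving to a lower bit width (as in the bit-inheritance scheme) only tightens the clamp range. Details of the actual method differ; this only shows the mechanism.

```python
import torch
from torch import nn

class SharedStepQuant(nn.Module):
    """Uniform weight quantizer with a single learnable step size."""
    def __init__(self, bits=4):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))  # shared across candidates
        self.set_bits(bits)

    def set_bits(self, bits):
        # Bit inheritance: dropping to fewer bits only tightens this range.
        self.qmin, self.qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

    def forward(self, w):
        q = torch.clamp(torch.round(w / self.step), self.qmin, self.qmax).detach()
        w_q = q * self.step                  # gradient reaches the step size
        return (w - w.detach()) + w_q        # straight-through estimator for w

quant = SharedStepQuant(bits=4)
w = torch.randn(16, requires_grad=True)
loss = quant(w).pow(2).sum()
loss.backward()                              # grads flow to both w and step
quant.set_bits(2)                            # "inherit" the weights at 2 bits
```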
This list is automatically generated from the titles and abstracts of the papers on this site.