Entangling Machine Learning with Quantum Tensor Networks
- URL: http://arxiv.org/abs/2403.12969v1
- Date: Tue, 9 Jan 2024 00:07:36 GMT
- Title: Entangling Machine Learning with Quantum Tensor Networks
- Authors: Constantijn van der Poel, Dan Zhao
- Abstract summary: This paper examines the use of tensor networks, which can efficiently represent high-dimensional quantum states, in language modeling.
We will abstract the problem down to modeling Motzkin spin chains, which exhibit long-range correlations reminiscent of those found in language.
We find that the tensor models reach near-perfect classification accuracy and maintain stable performance as the number of valid training examples is decreased.
- Score: 13.609946378021016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper examines the use of tensor networks, which can efficiently represent high-dimensional quantum states, in language modeling. It is a distillation and continuation of the work done in (van der Poel, 2023). To do so, we abstract the problem down to modeling Motzkin spin chains, which exhibit long-range correlations reminiscent of those found in language. The Matrix Product State (MPS), also known as the tensor train, has a bond dimension that scales with the length of the sequence it models. To combat this, we use the factored core MPS, whose bond dimension scales sub-linearly. We find that the tensor models reach near-perfect classification accuracy and maintain stable performance as the number of valid training examples is decreased.
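The classification task described in the abstract can be made concrete: a Motzkin chain is a sequence of up, flat, and down steps that never dips below its starting height and returns to it at the end, and the models above learn to separate such valid chains from invalid ones. Below is a minimal sketch of that validity check (illustrative code, not from the paper; the alphabet {u, f, d} and the helper `is_motzkin` are choices made here).

```python
# A minimal, illustrative validity check for Motzkin chains (not code from the
# paper): steps are up (+1), flat (0), and down (-1); a valid chain never goes
# below the baseline and ends back at height 0.
from itertools import product

STEP = {"u": +1, "f": 0, "d": -1}

def is_motzkin(word: str) -> bool:
    """Return True if `word` over the alphabet {u, f, d} is a valid Motzkin chain."""
    height = 0
    for s in word:
        height += STEP[s]
        if height < 0:          # the path must never dip below the baseline
            return False
    return height == 0          # the path must return to the baseline

# Brute-force count of valid chains of length 6: gives the Motzkin number M_6 = 51.
n = 6
valid = [w for w in ("".join(p) for p in product("ufd", repeat=n)) if is_motzkin(w)]
print(len(valid), valid[:5])
```

The sequence space grows exponentially while the valid chains form a vanishing fraction of it, which is what makes this a useful stand-in for the long-range constraints found in language.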
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
Once trained against a small base model on demonstrations, the value network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
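Since the composition happens purely at the logits level, it can be pictured as adding the value network's predicted shift to a frozen model's next-token logits before the softmax. Here is a hedged sketch of that combination step (an assumption about the mechanism, not the authors' implementation; `pretrained_logits` and `value_net` are placeholders introduced here).

```python
# Hedged sketch of logits-level composition: a frozen pre-trained model's
# next-token logits are shifted by a separately trained value network before
# the softmax. Both functions below are toy placeholders, not real models.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8

def pretrained_logits(context: np.ndarray) -> np.ndarray:
    """Placeholder for a frozen pre-trained model's next-token logits."""
    return rng.normal(size=VOCAB)

def value_net(context: np.ndarray) -> np.ndarray:
    """Placeholder for the small value network predicting a logit shift."""
    return 0.5 * rng.normal(size=VOCAB)

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

context = np.array([1, 4, 2])                               # toy token ids
combined = pretrained_logits(context) + value_net(context)  # composition at the logits level
print(softmax(combined))
```

Because the interface is just a vector addition over a shared vocabulary, the same value network can in principle be paired with pre-trained models of different sizes, which is the transferability claim above.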
arXiv Detail & Related papers (2024-10-28T13:48:43Z) - Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z) - A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
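A toy version of that picture is a fixed random-feature model whose linear readout is fit by full-batch gradient descent on a small, repeatedly reused training set; the train and test losses separate over the course of training. The sketch below is an illustration in that spirit (the dimensions, learning rate, and ReLU feature map are assumptions made here, not the paper's setup).

```python
# Toy random-feature model trained with gradient descent: the training loss
# keeps falling while the test loss stalls, so the train/test gap builds up
# as the same small dataset is reused. All settings here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, p, n_train, n_test = 20, 200, 50, 1000

teacher = rng.normal(size=d) / np.sqrt(d)

def make_data(n):
    X = rng.normal(size=(n, d))
    y = X @ teacher + 0.1 * rng.normal(size=n)
    return X, y

W = rng.normal(size=(d, p)) / np.sqrt(d)       # fixed random projection
feats = lambda X: np.maximum(X @ W, 0.0)       # ReLU random features

Xtr, ytr = make_data(n_train)
Xte, yte = make_data(n_test)
Ftr, Fte = feats(Xtr), feats(Xte)

a = np.zeros(p)                                # trainable linear readout
lr = 0.01
for step in range(4001):
    grad = Ftr.T @ (Ftr @ a - ytr) / n_train   # full-batch MSE gradient (constant factor absorbed in lr)
    a -= lr * grad
    if step % 1000 == 0:
        train_mse = np.mean((Ftr @ a - ytr) ** 2)
        test_mse = np.mean((Fte @ a - yte) ** 2)
        print(f"step {step:5d}  train {train_mse:.4f}  test {test_mse:.4f}")
```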
arXiv Detail & Related papers (2024-02-02T01:41:38Z) - Quantized Fourier and Polynomial Features for more Expressive Tensor
Network Models [9.18287948559108]
We exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network.
We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension than their non-quantized counterparts.
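One way to read the quantization idea is that a length-2**K Fourier feature vector factors exactly into a Kronecker product of K two-element vectors, which is what allows the corresponding model weights to be stored as a tensor network of small cores. The snippet below checks that identity numerically (my reading of the construction, not the authors' code; the function names are illustrative).

```python
# Check that exp(2j*pi*m*x) for m = 0..2**K-1 equals the Kronecker product of
# K tiny factors [1, exp(2j*pi*(2**k)*x)], one per bit of the frequency index.
# This is an illustrative identity behind "quantized" Fourier features.
import numpy as np
from functools import reduce

def full_fourier_features(x: float, K: int) -> np.ndarray:
    m = np.arange(2 ** K)
    return np.exp(2j * np.pi * m * x)

def quantized_fourier_features(x: float, K: int) -> np.ndarray:
    # One 2-element factor per bit of the frequency index, most significant bit first.
    factors = [np.array([1.0, np.exp(2j * np.pi * (2 ** k) * x)]) for k in reversed(range(K))]
    return reduce(np.kron, factors)

x, K = 0.3141, 5
print(np.allclose(full_fourier_features(x, K), quantized_fourier_features(x, K)))  # True
```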
arXiv Detail & Related papers (2023-09-11T13:18:19Z) - Low-Rank Tensor Function Representation for Multi-Dimensional Data
Recovery [52.21846313876592]
Low-rank tensor function representation (LRTFR) can continuously represent data beyond meshgrid with infinite resolution.
We develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization.
Experiments substantiate the superiority and versatility of our method compared with state-of-the-art methods.
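The core object here is a tensor *function* rather than a discrete tensor: each mode gets a continuous factor function, and a rank-R sum of their products defines values at arbitrary coordinates instead of only on a meshgrid. Below is a minimal structural sketch of that factorization (my reading of the abstract; the tiny random MLPs stand in for learned factor functions and are not the authors' model).

```python
# Rank-R low-rank tensor function factorization: f(x, y, z) = sum_r fx_r(x)*fy_r(y)*fz_r(z).
# The factor functions are tiny random MLPs used purely to illustrate the structure;
# the same parameters can be queried at any (off-grid) coordinates.
import numpy as np

rng = np.random.default_rng(0)
R = 3                                   # illustrative tensor-function rank

def make_factor():
    """A tiny random MLP R -> R^R standing in for one learned factor function."""
    W1, b1 = rng.normal(size=(16, 1)), rng.normal(size=16)
    W2, b2 = rng.normal(size=(R, 16)), rng.normal(size=R)
    return lambda t: W2 @ np.tanh(W1 @ np.atleast_1d(t) + b1) + b2

fx, fy, fz = make_factor(), make_factor(), make_factor()

def f(x, y, z):
    """Evaluate the rank-R tensor function at a continuous coordinate (x, y, z)."""
    return float(np.sum(fx(x) * fy(y) * fz(z)))

# The same parameters define the function on a coarse grid ...
coarse = np.array([[f(x, y, 0.5) for y in np.linspace(0, 1, 4)] for x in np.linspace(0, 1, 4)])
# ... and at arbitrary off-grid coordinates ("beyond meshgrid").
print(coarse.shape, f(0.123, 0.456, 0.789))
```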
arXiv Detail & Related papers (2022-12-01T04:00:38Z) - Interaction Decompositions for Tensor Network Regression [0.0]
We show how to assess the relative importance of different regressors as a function of their degree.
We introduce a new type of tensor network model that is explicitly trained on only a small subset of interaction degrees.
This suggests that standard tensor network models utilize their regressors in an inefficient manner, with the lower-degree terms vastly underutilized.
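With the common product feature map kron_i [1, x_i], a tensor network regression output expands into one term per subset of input features, and grouping terms by subset size gives the contribution of each interaction degree. The sketch below makes that decomposition explicit on a tiny dense example (my reading of the setup, not the authors' code; the dense weight tensor stands in for a tensor network).

```python
# Interaction decomposition of f(x) = <W, kron_i [1, x_i]>: split the output into
# contributions per interaction degree (the number of features in each term).
# W is a small dense stand-in for a tensor-network weight tensor.
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n = 4                                        # number of input features (kept tiny)
W = rng.normal(size=(2,) * n)
x = rng.normal(size=n)

def output_by_degree(W, x):
    contrib = np.zeros(n + 1)
    for idx in product((0, 1), repeat=n):    # one term per subset of features
        term = W[idx] * np.prod([x[i] for i in range(n) if idx[i] == 1])
        contrib[sum(idx)] += term            # degree = subset size
    return contrib

contrib = output_by_degree(W, x)
# Full output via contraction of W with the local feature vectors [1, x_i].
full = np.einsum(W, list(range(n)), *[p for i in range(n) for p in (np.array([1.0, x[i]]), [i])])
print(contrib, np.isclose(contrib.sum(), full))   # the degree contributions sum to the full output
```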
arXiv Detail & Related papers (2022-08-11T20:17:27Z) - Error Analysis of Tensor-Train Cross Approximation [88.83467216606778]
We provide accuracy guarantees in terms of the entire tensor for both exact and noisy measurements.
Results are verified by numerical experiments, and may have important implications for the usefulness of cross approximations for high-order tensors.
arXiv Detail & Related papers (2022-07-09T19:33:59Z) - Patch-based medical image segmentation using Quantum Tensor Networks [1.5899411215927988]
We formulate image segmentation in a supervised setting with tensor networks.
The key idea is to first lift the pixels in image patches to exponentially high-dimensional feature spaces.
The performance of the proposed model is evaluated on three 2D and one 3D biomedical imaging datasets.
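The lifting step mentioned above is typically a local map from each pixel to a small vector followed by a tensor product over the patch, so an N-pixel patch lands in a 2**N-dimensional space. The sketch below uses the common cosine/sine local map as an illustration (a standard choice in tensor network machine learning, not necessarily the authors' exact map).

```python
# Lift a patch of N pixels into a 2**N-dimensional feature space: map each pixel
# to a 2-vector, then take the tensor (Kronecker) product over all pixels.
import numpy as np
from functools import reduce

def local_map(p: float) -> np.ndarray:
    """Map a normalized pixel intensity p in [0, 1] to a 2-dimensional vector."""
    return np.array([np.cos(np.pi * p / 2), np.sin(np.pi * p / 2)])

def lift_patch(patch: np.ndarray) -> np.ndarray:
    """Tensor-product feature of a flattened patch: dimension 2**patch.size."""
    return reduce(np.kron, [local_map(p) for p in patch.ravel()])

patch = np.random.default_rng(0).random((3, 3))    # toy 3x3 patch, values in [0, 1)
phi = lift_patch(patch)
print(phi.shape)                                   # (512,) = 2**9
```

In practice the exponentially large vector is never formed explicitly; a tensor network model contracts directly with the per-pixel 2-vectors, which is what keeps the approach tractable.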
arXiv Detail & Related papers (2021-09-15T07:54:05Z) - Lower and Upper Bounds on the VC-Dimension of Tensor Network Models [8.997952791113232]
Tensor network methods have been a key ingredient of advances in condensed matter physics.
They can be used to efficiently learn linear models in exponentially large feature spaces.
In this work, we derive upper and lower bounds on the VC dimension and pseudo-dimension of a large class of tensor network models.
arXiv Detail & Related papers (2021-06-22T14:39:25Z) - Anomaly Detection with Tensor Networks [2.3895981099137535]
We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with a dimension exponential in the number of original features.
We produce competitive results on image datasets, despite not exploiting the locality of images.
arXiv Detail & Related papers (2020-06-03T20:41:30Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.