Interaction Decompositions for Tensor Network Regression
- URL: http://arxiv.org/abs/2208.06029v1
- Date: Thu, 11 Aug 2022 20:17:27 GMT
- Title: Interaction Decompositions for Tensor Network Regression
- Authors: Ian Convy and K. Birgitta Whaley
- Abstract summary: We show how to assess the relative importance of different regressors as a function of their degree.
We introduce a new type of tensor network model that is explicitly trained on only a small subset of interaction degrees.
This suggests that standard tensor network models utilize their regressors in an inefficient manner, with the lower degree terms vastly underutilized.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well known that tensor network regression models operate on an
exponentially large feature space, but questions remain as to how effectively
they are able to utilize this space. Using the polynomial featurization from
Novikov et al., we propose the interaction decomposition as a tool that can
assess the relative importance of different regressors as a function of their
polynomial degree. We apply this decomposition to tensor ring and tree tensor
network models trained on the MNIST and Fashion MNIST datasets, and find that
up to 75% of interaction degrees are contributing meaningfully to these models.
We also introduce a new type of tensor network model that is explicitly trained
on only a small subset of interaction degrees, and find that these models are
able to match or even outperform the full models using only a fraction of the
exponential feature space. This suggests that standard tensor network models
utilize their polynomial regressors in an inefficient manner, with the lower
degree terms being vastly under-utilized.
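As a concrete illustration of the decomposition, here is a brute-force sketch (our own, not the authors' code) using the local feature map (1, x_i) from the Novikov et al. featurization: the output of a dense weight tensor is split by monomial degree, where the degree of each term is the number of non-constant factors. The paper carries out the same bookkeeping inside tensor ring and tree tensor network contractions rather than over a dense tensor.

```python
import itertools
import numpy as np

def interaction_decomposition(W, x):
    """Split f(x) = <W, phi(x)> into contributions by polynomial degree.

    W: dense weight tensor of shape (2,) * N (brute force; small N only).
    x: input vector of length N.
    Returns c with c[d] = total degree-d contribution, so f(x) = c.sum().
    """
    N = x.shape[0]
    contrib = np.zeros(N + 1)
    for idx in itertools.product((0, 1), repeat=N):
        degree = sum(idx)                                # number of x_i factors
        monomial = np.prod([x[i] for i, b in enumerate(idx) if b])
        contrib[degree] += W[idx] * monomial
    return contrib

rng = np.random.default_rng(0)
N = 8
W = rng.normal(size=(2,) * N)
x = rng.normal(size=N)
c = interaction_decomposition(W, x)

# Cross-check against direct contraction of W with the product feature
# map ⊗_i (1, x_i): contract one local feature vector at a time.
f = W
for xi in x:
    f = f[0] + xi * f[1]
assert np.isclose(c.sum(), f)
```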
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
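A hedged sketch of the mechanism described here, in which the module architecture and the additive combination rule are our assumptions: a small value network is trained to predict the post-training change in logits, and at inference its output is added to the logits of a frozen pre-trained model.

```python
import torch
import torch.nn as nn

class ValueNetwork(nn.Module):
    """Predicts a post-training logit correction (architecture assumed)."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden_dim), nn.GELU(),
            nn.Linear(hidden_dim, vocab_size),
        )

    def forward(self, base_logits: torch.Tensor) -> torch.Tensor:
        return self.net(base_logits)        # predicted change in logits

@torch.no_grad()
def fused_logits(pretrained, value_net, input_ids):
    # `pretrained` is assumed to be any frozen model returning logits;
    # the value network's delta transfers across model sizes.
    logits = pretrained(input_ids)
    return logits + value_net(logits)
```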
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
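A minimal sketch of the solvable setting this abstract describes, under our own assumed sizes and teacher: a random feature model (frozen random first layer, trained linear readout) fit by full-batch gradient descent, with train and test loss tracked over time as the gap builds up.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n_train, n_test = 32, 256, 200, 2000
W = rng.normal(size=(p, d)) / np.sqrt(d)            # frozen random features
teacher = rng.normal(size=d) / np.sqrt(d)           # assumed linear teacher

def features(X):
    return np.maximum(W @ X.T, 0.0).T / np.sqrt(p)  # ReLU random features

X_tr, X_te = rng.normal(size=(n_train, d)), rng.normal(size=(n_test, d))
y_tr, y_te = X_tr @ teacher, X_te @ teacher
F_tr, F_te = features(X_tr), features(X_te)

a, lr = np.zeros(p), 0.5                            # trained linear readout
for step in range(2001):
    grad = F_tr.T @ (F_tr @ a - y_tr) / n_train     # gradient of MSE loss
    a -= lr * grad
    if step % 500 == 0:
        tr = np.mean((F_tr @ a - y_tr) ** 2)
        te = np.mean((F_te @ a - y_te) ** 2)
        print(f"step {step}: train {tr:.4f}, test {te:.4f}")  # gap grows over time
```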
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models [9.18287948559108]
We exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network.
We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension as opposed to their non-quantized counterparts.
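The factorization behind the quantization can be shown in a few lines: a length-2^q Fourier feature vector phi(x)_k = exp(i k w x) splits exactly into a Kronecker product of q two-dimensional factors, one per bit of k, which is what lets the model weights be kept as a tensor network over tiny modes. The frequency w and sizes below are our assumptions.

```python
import numpy as np

def quantized_fourier_factors(x, q, w=np.pi):
    # bit j of k contributes exp(i * 2**j * w * x), so factor j is the
    # two-dimensional vector (1, exp(i * 2**j * w * x))
    return [np.array([1.0, np.exp(1j * (2 ** j) * w * x)]) for j in range(q)]

x, q = 0.37, 4
factors = quantized_fourier_factors(x, q)
phi = factors[0]
for f in factors[1:]:
    phi = np.kron(f, phi)                 # Kronecker product over the bits of k

k = np.arange(2 ** q)
assert np.allclose(phi, np.exp(1j * k * np.pi * x))   # full Fourier features
```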
arXiv Detail & Related papers (2023-09-11T13:18:19Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for combining the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
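A minimal sketch of the layer-wise averaging operation being analysed, with model classes and the layer-name prefixes as illustrative assumptions:

```python
import copy
import torch

def average_layers(model_a, model_b, layer_names=None):
    """Return a copy of model_a in which the parameters whose names start
    with one of `layer_names` are replaced by the elementwise mean of the
    two models' parameters; layer_names=None averages every float tensor."""
    fused = copy.deepcopy(model_a)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    fused_sd = fused.state_dict()
    for name, tensor in sd_a.items():
        wanted = layer_names is None or any(name.startswith(l) for l in layer_names)
        if wanted and torch.is_floating_point(tensor):
            fused_sd[name] = 0.5 * (tensor + sd_b[name])
    fused.load_state_dict(fused_sd)
    return fused
```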
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery [52.21846313876592]
Low-rank tensor function representation (LRTFR) can continuously represent data beyond the meshgrid, with infinite resolution.
We develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization.
Our experiments substantiate the superiority and versatility of our method as compared with state-of-the-art methods.
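A hedged sketch in the spirit of LRTFR, with ranks and factor architectures as our assumptions: the value at any real coordinate (x, y, z) is a Tucker-style contraction of a small core tensor with three coordinate-wise factor functions (tiny MLPs here), so the representation is defined off any fixed meshgrid.

```python
import torch
import torch.nn as nn

class LowRankTensorFunction(nn.Module):
    def __init__(self, ranks=(8, 8, 8), hidden=64):
        super().__init__()
        self.core = nn.Parameter(0.1 * torch.randn(*ranks))   # small core tensor
        self.factors = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, r))
            for r in ranks
        )

    def forward(self, coords):                    # coords: (batch, 3) real values
        fx, fy, fz = (f(coords[:, i:i + 1]) for i, f in enumerate(self.factors))
        # contract the core with the three per-coordinate factor vectors
        return torch.einsum("abc,na,nb,nc->n", self.core, fx, fy, fz)

f = LowRankTensorFunction()
values = f(torch.rand(16, 3))                     # query at arbitrary resolution
```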
arXiv Detail & Related papers (2022-12-01T04:00:38Z)
- Lower and Upper Bounds on the VC-Dimension of Tensor Network Models [8.997952791113232]
Tensor network methods have been a key ingredient of advances in condensed matter physics.
They can be used to efficiently learn linear models in exponentially large feature spaces.
In this work, we derive upper and lower bounds on the VC dimension and pseudo-dimension of a large class of tensor network models.
arXiv Detail & Related papers (2021-06-22T14:39:25Z)
- Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks, with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with reduced computational cost.
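A minimal sketch of evaluating such a model with its weights kept in TT-format: each TT core is contracted with a local feature vector (1, x_i) and the resulting small matrices are multiplied left to right. The cores and shapes here are illustrative random values; the ALS fitting step from the paper is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 10, 4                                    # number of inputs, TT-rank
ranks = [1] + [r] * (N - 1) + [1]
cores = [rng.normal(size=(ranks[i], 2, ranks[i + 1])) / np.sqrt(2 * r)
         for i in range(N)]

def tt_predict(cores, x):
    """Contract the TT weight tensor with the product features ⊗_i (1, x_i)."""
    v = np.ones((1, 1))
    for core, xi in zip(cores, x):
        phi = np.array([1.0, xi])               # local polynomial feature map
        v = v @ np.einsum("j,ajb->ab", phi, core)
    return float(v[0, 0])

print(tt_predict(cores, rng.normal(size=N)))
```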
arXiv Detail & Related papers (2021-01-22T16:14:38Z)
- Low-Rank and Sparse Enhanced Tucker Decomposition for Tensor Completion [3.498620439731324]
We introduce a unified low-rank and sparse enhanced Tucker decomposition model for tensor completion.
Our model possesses a sparse regularization term to promote a sparse core tensor, which is beneficial for tensor data compression.
Notably, our model is able to deal with different types of real-world data sets, since it exploits the potential periodicity and inherent correlation properties that appear in tensors.
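A simplified sketch of the low-rank-plus-sparse-core idea, not the cited solver: truncated HOSVD supplies the Tucker factors, and a soft threshold on the core promotes the sparsity mentioned above. The ranks and threshold are assumptions.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def sparse_tucker(T, ranks, lam=0.05):
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])                  # leading left singular vectors
    core = T
    for mode, U in enumerate(factors):            # project onto each factor
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    core = np.sign(core) * np.maximum(np.abs(core) - lam, 0.0)  # soft threshold
    return core, factors

T = np.random.default_rng(0).normal(size=(10, 10, 10))
core, factors = sparse_tucker(T, (4, 4, 4))
```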
arXiv Detail & Related papers (2020-10-01T12:45:39Z)
- Anomaly Detection with Tensor Networks [2.3895981099137535]
We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with a dimension exponential in the number of original features.
We produce competitive results on image datasets, despite not exploiting the locality of images.
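One plausible reading of this setup, sketched under our own assumptions about the scoring rule: the learned linear map over the exponential space is stored as a matrix product operator, applied core by core to the rank-1 feature vector ⊗_i (1, x_i), and a sample is scored by the squared norm of its image, all without ever forming the exponential space.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 12, 3                                    # inputs, MPO bond dimension
ranks = [1] + [r] * (N - 1) + [1]
mpo = [rng.normal(size=(ranks[i], 2, 2, ranks[i + 1])) / np.sqrt(2 * r)
       for i in range(N)]

def anomaly_score(mpo, x):
    """Squared norm of the MPO image of the product feature ⊗_i (1, x_i)."""
    E = np.ones((1, 1))
    for A, xi in zip(mpo, x):
        phi = np.array([1.0, xi])
        B = np.einsum("aijb,j->aib", A, phi)    # absorb the local feature
        T = np.einsum("aib,cid->acbd", B, B)    # transfer matrix over outputs
        E = np.einsum("ac,acbd->bd", E, T)
    return float(E[0, 0])

print(anomaly_score(mpo, rng.normal(size=N)))
```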
arXiv Detail & Related papers (2020-06-03T20:41:30Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
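A hedged sketch of one layer of such fusion, replacing the paper's optimal transport with its hard-assignment special case (Hungarian matching) for brevity: neurons of one model are matched to the other model's neurons by their incoming-weight vectors, permuted into alignment, and then averaged. Shapes are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_dense_layer(W_a, W_b):
    """W_a, W_b: (out, in) weight matrices of the same layer in two models."""
    # cost of matching each neuron of A to each neuron of B
    cost = np.linalg.norm(W_a[:, None, :] - W_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)    # align neurons across models
    W_b_aligned = W_b[cols]                     # permute B's neurons to A's order
    return 0.5 * (W_a + W_b_aligned)            # average the aligned weights

rng = np.random.default_rng(0)
W_a, W_b = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
W_fused = fuse_dense_layer(W_a, W_b)
```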
arXiv Detail & Related papers (2019-10-12T22:07:15Z)