Deep tensor networks with matrix product operators
- URL: http://arxiv.org/abs/2209.09098v1
- Date: Fri, 16 Sep 2022 09:09:52 GMT
- Title: Deep tensor networks with matrix product operators
- Authors: Bojan Žunkovič
- Abstract summary: We introduce deep tensor networks, which are exponentially wide neural networks based on the tensor network representation of the weight matrices.
We evaluate the proposed method on the image classification (MNIST, FashionMNIST) and sequence prediction (cellular automata) tasks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce deep tensor networks, which are exponentially wide neural
networks based on the tensor network representation of the weight matrices. We
evaluate the proposed method on the image classification (MNIST, FashionMNIST)
and sequence prediction (cellular automata) tasks. In the image classification
case, deep tensor networks improve our matrix product state baselines and
achieve 0.49% error rate on MNIST and 8.3% error rate on FashionMNIST. In the
sequence prediction case, we demonstrate an exponential improvement in the
number of parameters compared to the one-layer tensor network methods. In both
cases, we discuss the non-uniform and the uniform tensor network models and
show that the latter generalizes well to different input sizes.
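The core construction above, a weight matrix stored as a matrix product operator (MPO) so that the layer acts on an exponentially large space while its parameter count grows only linearly with the number of tensor cores, can be illustrated with a short NumPy sketch. This is not the paper's implementation: the core shapes, the random initialization, and the contraction order below are assumptions chosen for clarity.

```python
import numpy as np

def random_mpo(n, d=2, bond=4, seed=0):
    """n MPO cores A_k of shape (D_{k-1}, d_out, d_in, D_k), with D_0 = D_n = 1."""
    rng = np.random.default_rng(seed)
    dims = [1] + [bond] * (n - 1) + [1]
    return [rng.normal(scale=0.3, size=(dims[k], d, d, dims[k + 1])) for k in range(n)]

def mpo_apply(cores, x):
    """Apply the MPO-encoded weight matrix to x (length d**n) without ever
    materializing the full (d**n, d**n) matrix."""
    d, n = cores[0].shape[1], len(cores)
    # T tracks (left bond, output indices produced so far, input indices not yet consumed)
    T = x.reshape(1, 1, d ** n)
    for A in cores:
        Dl, p, q, Dr = A.shape
        T = T.reshape(Dl, T.shape[1], q, -1)       # expose the next input leg
        T = np.einsum('apqb,aoqr->bopr', A, T)     # consume it, emit one output leg
        T = T.reshape(Dr, -1, T.shape[-1])
    return T.reshape(d ** n)

def mpo_to_matrix(cores):
    """Dense reconstruction of the weight matrix (feasible only for tiny n)."""
    M = cores[0]
    for A in cores[1:]:
        M = np.einsum('aPQb,bpqc->aPpQqc', M, A)
        M = M.reshape(1, M.shape[1] * M.shape[2], M.shape[3] * M.shape[4], -1)
    return M.reshape(M.shape[1], M.shape[2])

cores = random_mpo(n=8)                            # a 2**8 = 256-dimensional layer from 8 small cores
x = np.random.default_rng(1).normal(size=2 ** 8)
assert np.allclose(mpo_apply(cores, x), mpo_to_matrix(cores) @ x)
```

The dense reconstruction is included only to check the contraction at a toy size; the point of the MPO form is that mpo_apply never builds the d**n by d**n matrix, which is what makes exponentially wide layers affordable.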
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called the Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z) - Optimising for Interpretability: Convolutional Dynamic Alignment Networks [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets).
Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns.
CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
arXiv Detail & Related papers (2021-09-27T12:39:46Z) - Patch-based medical image segmentation using Quantum Tensor Networks [1.5899411215927988]
We formulate image segmentation in a supervised setting with tensor networks.
The key idea is to first lift the pixels in image patches to exponentially high-dimensional feature spaces (a sketch of this lifting step appears after this list).
The performance of the proposed model is evaluated on three 2D- and one 3D- biomedical imaging datasets.
arXiv Detail & Related papers (2021-09-15T07:54:05Z) - Segmenting two-dimensional structures with strided tensor networks [1.952097552284465]
We propose a novel formulation of tensor networks for supervised image segmentation.
The proposed model is end-to-end trainable using backpropagation.
The evaluation shows that the strided tensor network yields competitive performance compared to CNN-based models.
arXiv Detail & Related papers (2021-02-13T11:06:34Z) - Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with reduced computational cost.
arXiv Detail & Related papers (2021-01-22T16:14:38Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of an arbitrary shape, which is often seen in Neural Networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates with acceptable drops in performance.
arXiv Detail & Related papers (2020-07-13T19:03:22Z) - Anomaly Detection with Tensor Networks [2.3895981099137535]
We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with a dimension exponential in the number of original features.
We produce competitive results on image datasets, despite not exploiting the locality of images.
arXiv Detail & Related papers (2020-06-03T20:41:30Z) - Tensor Networks for Medical Image Classification [0.456877715768796]
We focus on the class of Tensor Networks, which have served as a workhorse for physicists over the last two decades for analysing quantum many-body systems.
We extend the Matrix Product State tensor networks to be useful in medical image analysis tasks.
We show that tensor networks are capable of attaining performance that is comparable to state-of-the-art deep learning methods.
arXiv Detail & Related papers (2020-04-21T15:02:58Z)
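Two entries above, the patch-based segmentation paper and the anomaly-detection paper, rely on lifting raw pixel values into an exponentially high-dimensional feature space before contracting with a tensor network. The sketch below illustrates that lifting step under an assumed cosine/sine local feature map (a common convention; the papers themselves may parameterize it differently); it is not taken from either paper.

```python
import numpy as np

def local_map(p):
    """Map one pixel value p in [0, 1] to a 2-dimensional local feature vector."""
    return np.array([np.cos(np.pi * p / 2), np.sin(np.pi * p / 2)])

def lift_patch(patch):
    """Tensor-product feature map of a flattened patch: a 2**N-dimensional vector.
    Materializing it is only feasible for tiny patches; shown here for clarity."""
    phi = np.array([1.0])
    for p in patch.ravel():
        phi = np.kron(phi, local_map(p))
    return phi

patch = np.random.default_rng(0).uniform(size=(2, 2))   # toy 2x2 patch, N = 4 pixels
phi = lift_patch(patch)
print(phi.shape)             # (16,): the dimension 2**N grows exponentially with patch size
print(np.linalg.norm(phi))   # 1.0: unit-norm local vectors make the lifted vector unit norm too
```

In practice the 2**N-dimensional vector is never built explicitly; it is contracted pixel by pixel with an MPS or a similar network, which is what keeps segmentation and anomaly detection over whole patches tractable.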
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.