Distributive Pre-Training of Generative Modeling Using Matrix-Product States
- URL: http://arxiv.org/abs/2306.14787v1
- Date: Mon, 26 Jun 2023 15:46:08 GMT
- Title: Distributive Pre-Training of Generative Modeling Using Matrix-Product States
- Authors: Sheng-Hsuan Lin, Olivier Kuijpers, Sebastian Peterhansl, and Frank Pollmann
- Abstract summary: We consider an alternative training scheme utilizing basic tensor network operations, e.g., summation and compression.
The training algorithm is based on compressing the superposition state constructed from all the training data in product state representation.
We benchmark the algorithm on the MNIST dataset and show reasonable results for generating new images and classification tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tensor networks have recently found applications in machine learning for both
supervised learning and unsupervised learning. The most common approaches for
training these models are gradient descent methods. In this work, we consider
an alternative training scheme utilizing basic tensor network operations, e.g.,
summation and compression. The training algorithm is based on compressing the
superposition state constructed from all the training data in product state
representation. The algorithm could be parallelized easily and only iterates
through the dataset once. Hence, it serves as a pre-training algorithm. We
benchmark the algorithm on the MNIST dataset and show reasonable results for
generating new images and classification tasks. Furthermore, we provide an
interpretation of the algorithm as a compressed quantum kernel density
estimation for the probability amplitude of input data.
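The scheme described in the abstract is concrete enough to sketch. Below is a minimal numpy illustration, assuming a cos/sin pixel feature map, a pairwise merge tree, and a fixed bond-dimension cap chi; these are standard MPS choices, not details confirmed by the paper. Each sample becomes a bond-dimension-1 product state, pairs are summed (bond dimensions add block-diagonally) and truncated back to chi by SVD, and the whole pass touches each sample once.

```python
import numpy as np

def product_state(x):
    """Encode one sample (pixel values in [0, 1]) as a bond-dimension-1 MPS.
    The cos/sin feature map is an assumption, not taken from the paper."""
    return [np.array([np.cos(np.pi * xi / 2), np.sin(np.pi * xi / 2)]).reshape(1, 2, 1)
            for xi in x]

def mps_add(a, b):
    """Superpose two MPS: bond dimensions add via block-diagonal embedding."""
    out, last = [], len(a) - 1
    for n, (A, B) in enumerate(zip(a, b)):
        if n == 0:
            out.append(np.concatenate([A, B], axis=2))   # join right bonds
        elif n == last:
            out.append(np.concatenate([A, B], axis=0))   # join left bonds
        else:
            T = np.zeros((A.shape[0] + B.shape[0], A.shape[1],
                          A.shape[2] + B.shape[2]))
            T[:A.shape[0], :, :A.shape[2]] = A
            T[A.shape[0]:, :, A.shape[2]:] = B
            out.append(T)
    return out

def compress(mps, chi):
    """Truncate all bonds to chi: a left-to-right QR sweep to canonicalize,
    then a right-to-left SVD sweep that keeps the chi largest singular values."""
    for n in range(len(mps) - 1):
        Dl, d, Dr = mps[n].shape
        Q, R = np.linalg.qr(mps[n].reshape(Dl * d, Dr))
        mps[n] = Q.reshape(Dl, d, -1)
        mps[n + 1] = np.einsum('ij,jkl->ikl', R, mps[n + 1])
    for n in range(len(mps) - 1, 0, -1):
        Dl, d, Dr = mps[n].shape
        U, S, Vh = np.linalg.svd(mps[n].reshape(Dl, d * Dr), full_matrices=False)
        k = min(chi, len(S))
        mps[n] = Vh[:k].reshape(k, d, Dr)
        mps[n - 1] = np.einsum('ijk,kl->ijl', mps[n - 1], U[:, :k] * S[:k])
    return mps

def pretrain(samples, chi):
    """Single pass over the data: pairwise sum-and-compress reduction tree."""
    states = [product_state(x) for x in samples]
    while len(states) > 1:
        merged = [compress(mps_add(states[i], states[i + 1]), chi)
                  for i in range(0, len(states) - 1, 2)]
        if len(states) % 2:
            merged.append(states[-1])
        states = merged
    return states[0]
```

Because each pairwise merge is independent, the reduction tree can be evaluated in parallel across workers, which is what makes the single-pass scheme attractive as pre-training. The resulting state is unnormalized; sampling new images or reading off class labels would require normalizing it first, a step omitted here.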
Related papers
- Discrete Neural Algorithmic Reasoning [18.497863598167257]
We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm.
arXiv Detail & Related papers (2024-02-18T16:03:04Z)
- Randomized Polar Codes for Anytime Distributed Machine Learning [66.46612460837147]
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations.
We propose a sequential decoding algorithm designed to handle real-valued data while maintaining low computational complexity for recovery.
We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization.
arXiv Detail & Related papers (2023-09-01T18:02:04Z)
- Automated Sizing and Training of Efficient Deep Autoencoders using Second Order Algorithms [0.46040036610482665]
We propose a multi-step training method for generalized linear classifiers.
Validation error is minimized by pruning unnecessary inputs.
Desired outputs are improved via a method similar to the Ho-Kashyap rule.
arXiv Detail & Related papers (2023-08-11T16:48:31Z)
- FastHebb: Scaling Hebbian Training of Deep Neural Networks to ImageNet Level [7.410940271545853]
We present FastHebb, an efficient and scalable solution for Hebbian learning (the underlying Hebbian update is sketched after this list).
FastHebb outperforms previous solutions by up to 50 times in terms of training speed.
For the first time, we are able to bring Hebbian algorithms to ImageNet scale.
arXiv Detail & Related papers (2022-07-07T09:04:55Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three datasets on image classification in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Learning with Subset Stacking [0.40964539027092906]
We propose a new regression algorithm that learns from a set of input-output pairs.
We call this algorithm "LEarning with Subset Stacking" or LESS, due to its resemblance to the method of stacking regressors.
arXiv Detail & Related papers (2021-12-12T14:33:49Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning [49.15799302636519]
We design a low-complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers).
In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns.
We show that it is possible to achieve equivalent learning performance as if the data is available in a single place.
arXiv Detail & Related papers (2020-09-29T13:08:12Z)
- On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points (a minimal weighted-subset construction is sketched after this list).
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)
- A Multi-Scale Tensor Network Architecture for Classification and Regression [0.0]
We present an algorithm for supervised learning using tensor networks.
We employ a step of preprocessing the data by coarse-graining through a sequence of wavelet transformations (a minimal sketch of this coarse-graining appears after this list).
We show how fine-graining through the network may be used to initialize models with access to finer-scale features.
arXiv Detail & Related papers (2020-01-22T21:26:28Z)
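The three sketches below unpack ideas flagged in the entries above; all are minimal illustrations, not the papers' implementations.

First, the Hebbian update behind the FastHebb entry. FastHebb itself is about reformulating such updates as efficient GPU batch operations; the toy below shows only the underlying rule, here Oja's variant, which is my choice and not necessarily the paper's:

```python
import numpy as np

def oja_step(w, x, lr=0.01):
    """One Hebbian update with Oja's normalization: dw = lr * y * (x - y * w)."""
    y = w @ x                      # post-synaptic activation
    return w + lr * y * (x - y * w)

# Toy usage: the weight vector drifts toward the data's first principal direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8)) @ np.diag([3, 1, 1, 1, 1, 1, 1, 1])
w = rng.normal(size=8)
w /= np.linalg.norm(w)
for x in X:
    w = oja_step(w, x)
```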
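Next, the weighted-subset idea from the coresets entry. The paper builds importance-sampled coresets with approximation guarantees; uniform sampling, shown here, is only the naive baseline that makes the weighting convention concrete (weights n/m keep weighted losses unbiased for the full-data loss):

```python
import numpy as np

def uniform_coreset(X, y, m, rng=None):
    """Naive coreset: m uniformly sampled points, each weighted by n/m."""
    rng = rng if rng is not None else np.random.default_rng(0)
    idx = rng.choice(len(X), size=m, replace=False)
    return X[idx], y[idx], np.full(m, len(X) / m)
```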
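Finally, the coarse-graining step from the multi-scale tensor network entry. The entry does not say which wavelet is used; Haar, the simplest choice, is assumed here, and the input length is assumed divisible by 2**levels:

```python
import numpy as np

def haar_coarse_grain(x):
    """One Haar level: pairwise averages (coarse part) and differences (detail)."""
    x = np.asarray(x, dtype=float)
    coarse = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return coarse, detail

def multiscale(x, levels):
    """Sequence of coarse-grained signals, finest to coarsest, for
    initializing models at progressively coarser scales."""
    scales = []
    for _ in range(levels):
        x, _ = haar_coarse_grain(x)
        scales.append(x)
    return scales
```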