Improvements to Gradient Descent Methods for Quantum Tensor Network
Machine Learning
- URL: http://arxiv.org/abs/2203.03366v1
- Date: Thu, 3 Mar 2022 19:00:40 GMT
- Title: Improvements to Gradient Descent Methods for Quantum Tensor Network
Machine Learning
- Authors: Fergus Barratt, James Dborin, Lewis Wright
- Abstract summary: We introduce a `copy node' method that successfully initializes arbitrary tensor networks.
We present numerical results showing that the combination of these techniques produces quantum-inspired tensor network models with far fewer parameters and improved generalization performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tensor networks have demonstrated significant value for machine learning in a
myriad of different applications. However, optimizing tensor networks using
standard gradient descent has proven to be difficult in practice. Tensor
networks suffer from initialization problems resulting in exploding or
vanishing gradients and require extensive hyperparameter tuning. Efforts to
overcome these problems usually depend on specific network architectures, or ad
hoc prescriptions. In this paper we address the problems of initialization and
hyperparameter tuning, making it possible to train tensor networks using
established machine learning techniques. We introduce a `copy node' method that
successfully initializes arbitrary tensor networks, in addition to a gradient
based regularization technique for bond dimensions. We present numerical
results that show that the combination of techniques presented here produces
quantum inspired tensor network models with far fewer parameters, while
improving generalization performance.
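A hedged illustration (not from the paper): the abstract names two ingredients, a `copy node' initialization for arbitrary tensor networks and a gradient-based regularizer for bond dimensions, but gives no implementation detail. The NumPy sketch below builds a toy matrix product state (MPS), uses a near-identity core initialization as a rough stand-in for the copy-node idea (identity-like structure keeps the full contraction from exploding or vanishing), and adds a nuclear-norm-style penalty on each core as a stand-in for the bond-dimension regularizer. All function names, sizes, and the specific penalty are assumptions for illustration only.

    # Toy sketch only; the paper's actual copy-node initialization and bond
    # regularizer are not reproduced here.
    import numpy as np

    rng = np.random.default_rng(0)
    n_sites, phys_dim, bond_dim = 10, 2, 8   # illustrative sizes

    def init_core(D_left, d, D_right, noise=1e-2):
        """Near-identity core: the identity on the bond indices, repeated over the
        physical index, plus small noise. For feature vectors whose entries sum to
        about 1, each per-site transfer matrix is then close to the identity, so
        the full contraction neither explodes nor vanishes at initialization."""
        core = np.zeros((D_left, d, D_right))
        eye = np.eye(D_left, D_right)
        for s in range(d):
            core[:, s, :] = eye
        return core + noise * rng.standard_normal(core.shape)

    cores = [init_core(1 if i == 0 else bond_dim, phys_dim,
                       1 if i == n_sites - 1 else bond_dim)
             for i in range(n_sites)]

    def contract(cores, x):
        """Contract the MPS with one feature vector of length phys_dim per site."""
        v = np.tensordot(x[0], cores[0], axes=([0], [1]))             # shape (1, D)
        for i in range(1, len(cores)):
            transfer = np.tensordot(x[i], cores[i], axes=([0], [1]))  # (D, D')
            v = v @ transfer
        return float(v.squeeze())

    def bond_penalty(cores):
        """Nuclear-norm-style proxy for a bond-dimension regularizer: the sum of
        singular values of each core's matricization. Added to a training loss,
        it pushes singular values toward zero, shrinking the effective bond
        dimension."""
        total = 0.0
        for core in cores:
            D_l, d, D_r = core.shape
            total += np.linalg.svd(core.reshape(D_l * d, D_r),
                                   compute_uv=False).sum()
        return total

    x = [np.array([1.0, 0.0]) for _ in range(n_sites)]   # trivial one-hot inputs
    print("contraction:", round(contract(cores, x), 4),
          "| bond penalty:", round(bond_penalty(cores), 3))

With this initialization the printed contraction stays of order one even for long chains, whereas i.i.d. Gaussian cores typically give values that grow or shrink exponentially with the number of sites, which is the initialization pathology the abstract describes.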
Related papers
- Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis [5.016205338484259]
The proposed method is more robust to network size variations than the existing method.
When applied to Physics-Informed Neural Networks, the method exhibits faster convergence and robustness to variations of the network size.
arXiv Detail & Related papers (2024-10-03T06:30:27Z)
- Quick design of feasible tensor networks for constrained combinatorial optimization [1.8775413720750924]
In recent years, tensor networks have been applied to constrained optimization problems for practical applications.
One approach is to construct tensor networks with nilpotent-matrix manipulation.
The proposed method is expected to facilitate the discovery of feasible tensor networks for constrained optimization problems.
arXiv Detail & Related papers (2024-09-03T08:36:23Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Slimmable Networks for Contrastive Self-supervised Learning [69.9454691873866]
Self-supervised learning has made significant progress in pre-training large models, but struggles with small models.
We introduce another one-stage solution to obtain pre-trained small models without the need for extra teachers.
A slimmable network consists of a full network and several weight-sharing sub-networks, which can be pre-trained once to obtain various networks.
arXiv Detail & Related papers (2022-09-30T15:15:05Z)
- Training Thinner and Deeper Neural Networks: Jumpstart Regularization [2.8348950186890467]
We use regularization to prevent neurons from dying or becoming linear.
In comparison to conventional training, we obtain neural networks that are thinner, deeper, and - most importantly - more parameter-efficient.
arXiv Detail & Related papers (2022-01-30T12:11:24Z)
- Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks, with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT format at reduced computational cost (a toy sketch of a TT-format prediction appears after this list).
arXiv Detail & Related papers (2021-01-22T16:14:38Z)
- Anomaly Detection with Tensor Networks [2.3895981099137535]
We exploit the memory and computational efficiency of tensor networks to learn a linear transformation over a space with a dimension exponential in the number of original features.
We produce competitive results on image datasets, despite not exploiting the locality of images.
arXiv Detail & Related papers (2020-06-03T20:41:30Z)
- Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The proposed feature map distortion is analyzed and shown to produce deep neural networks with higher test performance.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
- Molecule Property Prediction and Classification with Graph Hypernetworks [113.38181979662288]
We show that the replacement of the underlying networks with hypernetworks leads to a boost in performance.
A major difficulty in the application of hypernetworks is their lack of stability.
A recent work has tackled the training instability of hypernetworks in the context of error correcting codes.
arXiv Detail & Related papers (2020-02-01T16:44:34Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with nonconvexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
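As a follow-up to the Tensor-Train entry above, and only as a generic illustration (the blurb gives no implementation detail, and the ALS training step it mentions is not shown), the sketch below stores a regression weight tensor in TT format and computes a prediction by contracting it with a rank-one feature tensor, without ever forming the dense weight tensor. All sizes and names are assumptions.

    # Toy sketch: TT-format weights for multilinear regression. Not the paper's code.
    import numpy as np

    rng = np.random.default_rng(1)
    n_features, d, r = 12, 4, 5     # number of sites, local feature dim, TT rank

    # TT cores G[i] of shape (r_left, d, r_right); boundary ranks are 1.
    cores = [rng.standard_normal((1 if i == 0 else r, d,
                                  1 if i == n_features - 1 else r)) * 0.1
             for i in range(n_features)]

    dense_params = d ** n_features          # a dense weight tensor needs this many
    tt_params = sum(c.size for c in cores)  # the TT representation needs only this many
    print(f"dense parameters: {dense_params:,}  TT parameters: {tt_params:,}")

    def predict(cores, feats):
        """y = <W, phi(x)> with phi(x) = x_1 (x) ... (x) x_n: contract each core
        with its local feature vector, then multiply the resulting matrices left
        to right."""
        result = np.ones((1, 1))
        for core, f in zip(cores, feats):
            transfer = np.tensordot(f, core, axes=([0], [1]))   # (r_left, r_right)
            result = result @ transfer
        return float(result.squeeze())

    feats = [rng.standard_normal(d) for _ in range(n_features)]
    print("prediction:", round(predict(cores, feats), 4))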
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.