Convolutions Through the Lens of Tensor Networks
- URL: http://arxiv.org/abs/2307.02275v1
- Date: Wed, 5 Jul 2023 13:19:41 GMT
- Title: Convolutions Through the Lens of Tensor Networks
- Authors: Felix Dangel
- Abstract summary: We provide a new perspective onto convolutions through tensor networks (TNs)
TNs allow reasoning about the underlying tensor multiplications by drawing diagrams, manipulating them to perform function transformations, sub-tensor access, and fusion.
We demonstrate this expressive power by deriving the diagrams of various autodiff operations and popular approximations of second-order information.
- Score: 2.6397379133308214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their simple intuition, convolutions are more tedious to analyze than
dense layers, which complicates the generalization of theoretical and
algorithmic ideas. We provide a new perspective onto convolutions through
tensor networks (TNs) which allow reasoning about the underlying tensor
multiplications by drawing diagrams, and manipulating them to perform function
transformations, sub-tensor access, and fusion. We demonstrate this expressive
power by deriving the diagrams of various autodiff operations and popular
approximations of second-order information with full hyper-parameter support,
batching, channel groups, and generalization to arbitrary convolution
dimensions. Further, we provide convolution-specific transformations based on
the connectivity pattern which allow to re-wire and simplify diagrams before
evaluation. Finally, we probe computational performance, relying on established
machinery for efficient TN contraction. Our TN implementation speeds up a
recently-proposed KFAC variant up to 4.5x and enables new hardware-efficient
tensor dropout for approximate backpropagation.
Related papers
- conv_einsum: A Framework for Representation and Fast Evaluation of
Multilinear Operations in Convolutional Tensorial Neural Networks [28.416123889998243]
We develop a framework for representing tensorial convolution layers as einsum-like strings and a meta-algorithm conv_einsum which is able to evaluate these strings in a FLOPs-minimizing manner.
arXiv Detail & Related papers (2024-01-07T04:30:12Z) - Scalable Neural Network Kernels [22.299704296356836]
We introduce scalable neural network kernels (SNNKs), capable of approximating regular feedforward layers (FFLs)
We also introduce the neural network bundling process that applies SNNKs to compactify deep neural network architectures.
Our mechanism provides up to 5x reduction in the number of trainable parameters, while maintaining competitive accuracy.
arXiv Detail & Related papers (2023-10-20T02:12:56Z) - B-cos Alignment for Inherently Interpretable CNNs and Vision
Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z) - OneDConv: Generalized Convolution For Transform-Invariant Representation [76.15687106423859]
We propose a novel generalized one dimension convolutional operator (OneDConv)
It dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner.
It improves the robustness and generalization of convolution without sacrificing the performance on common images.
arXiv Detail & Related papers (2022-01-15T07:44:44Z) - Revisiting Transformation Invariant Geometric Deep Learning: Are Initial
Representations All You Need? [80.86819657126041]
We show that transformation-invariant and distance-preserving initial representations are sufficient to achieve transformation invariance.
Specifically, we realize transformation-invariant and distance-preserving initial point representations by modifying multi-dimensional scaling.
We prove that TinvNN can strictly guarantee transformation invariance, being general and flexible enough to be combined with the existing neural networks.
arXiv Detail & Related papers (2021-12-23T03:52:33Z) - A tensor network representation of path integrals: Implementation and
analysis [0.0]
We introduce a novel tensor network-based decomposition of path integral simulations involving Feynman-Vernon influence functional.
The finite temporarily non-local interactions introduced by the influence functional can be captured very efficiently using matrix product state representation.
The flexibility of the AP-TNPI framework makes it a promising new addition to the family of path integral methods for non-equilibrium quantum dynamics.
arXiv Detail & Related papers (2021-06-23T16:41:54Z) - Self-Supervised Graph Representation Learning via Topology
Transformations [61.870882736758624]
We present the Topology Transformation Equivariant Representation learning, a general paradigm of self-supervised learning for node representations of graph data.
In experiments, we apply the proposed model to the downstream node and graph classification tasks, and results show that the proposed method outperforms the state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2021-05-25T06:11:03Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural
Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks with quadratic widths in the sample size and linear in their depth at a time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Adaptive Learning of Tensor Network Structures [6.407946291544721]
We leverage the TN formalism to develop a generic and efficient adaptive algorithm to learn the structure and the parameters of a TN from data.
Our algorithm can adaptively identify TN structures with small number of parameters that effectively optimize any differentiable objective function.
arXiv Detail & Related papers (2020-08-12T16:41:56Z) - Supervised Learning for Non-Sequential Data: A Canonical Polyadic
Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.