Extraction Propagation
- URL: http://arxiv.org/abs/2402.15883v3
- Date: Wed, 09 Oct 2024 23:25:27 GMT
- Title: Extraction Propagation
- Authors: Stephen Pasteris, Chris Hicks, Vasilios Mavroudis
- Abstract summary: We develop a novel neural network architecture called Extraction propagation.
Extraction propagation works by training, in parallel, many small neural networks which interact with one another.
- Score: 4.368185344922342
- License:
- Abstract: We consider the problem of learning to map large instances, such as sequences and images, to outputs. Since training one large neural network end to end with backpropagation is plagued by vanishing gradients and degradation, we develop a novel neural network architecture called Extraction propagation, which works by training, in parallel, many small neural networks which interact with one another. We note that the performance of Extraction propagation is only conjectured as we have yet to implement it. We do, however, back the algorithm with some theory. A previous version of this paper was entitled "Fusion encoder networks" and detailed a slightly different architecture.
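The abstract stops at a high-level description and explicitly notes that Extraction propagation has not yet been implemented, so no reference code exists. Purely as an illustration of the general idea of training many small networks that hand representations to one another without end-to-end backpropagation, here is a minimal PyTorch sketch; the local classification heads, the chain layout, and all dimensions are assumptions, and the paper's actual interaction pattern between the small networks may be quite different.

```python
# Speculative sketch only: Extraction propagation has no published implementation,
# so the local heads, chain layout, and dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """One of the many small networks; it carries its own local output head."""
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.head = nn.Linear(hidden_dim, num_classes)  # local objective, no end-to-end path

    def forward(self, x):
        h = self.body(x)
        return h, self.head(h)

def train_step(nets, optimizers, x, y, loss_fn=nn.CrossEntropyLoss()):
    """Update every small network from its own local loss; the representation
    handed to the next network is detached, so no gradient crosses network
    boundaries and vanishing gradients over the full depth cannot arise."""
    for net, opt in zip(nets, optimizers):
        h, logits = net(x)
        loss = loss_fn(logits, y)
        opt.zero_grad()
        loss.backward()       # gradients stay inside this small network
        opt.step()
        x = h.detach()        # next network only sees the extracted features
    return x

# Toy usage with made-up sizes.
nets = [SmallNet(32, 32, 10) for _ in range(8)]
optims = [torch.optim.SGD(n.parameters(), lr=0.1) for n in nets]
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
train_step(nets, optims, x, y)
```

In this toy version each small network is optimized against its own local loss, so gradients never propagate through the full depth; the paper's "in parallel" training of the small networks is not captured by this sequential loop.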
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - A max-affine spline approximation of neural networks using the Legendre transform of a convex-concave representation [0.3007949058551534]
This work presents a novel algorithm for transforming a neural network into a spline representation.
The only constraint is that the function be bounded and possess a well-defined second derivative.
It can also be performed over the whole network rather than on each layer independently.
arXiv Detail & Related papers (2023-07-16T17:01:20Z) - Self-Expanding Neural Networks [24.812671965904727]
We introduce a natural gradient based approach which intuitively expands both the width and depth of a neural network.
We prove an upper bound on the rate at which neurons are added, and a computationally cheap lower bound on the expansion score.
We illustrate the benefits of such Self-Expanding Neural Networks with full connectivity and convolutions in both classification and regression problems.
arXiv Detail & Related papers (2023-07-10T12:49:59Z) - Centered Self-Attention Layers [89.21791761168032]
The self-attention mechanism in transformers and the message-passing mechanism in graph neural networks are repeatedly applied.
We show that this application inevitably leads to oversmoothing, i.e., to similar representations in the deeper layers.
We present a correction term to the aggregating operator of these mechanisms.
arXiv Detail & Related papers (2023-06-02T15:19:08Z) - Convolutional Learning on Multigraphs [153.20329791008095]
We develop convolutional information processing on multigraphs and introduce convolutional multigraph neural networks (MGNNs).
To capture the complex dynamics of information diffusion within and across each of the multigraph's classes of edges, we formalize a convolutional signal processing model.
We develop a multigraph learning architecture, including a sampling procedure to reduce computational complexity.
The introduced architecture is applied to optimal wireless resource allocation and a hate speech localization task, offering improved performance over traditional graph neural networks.
arXiv Detail & Related papers (2022-09-23T00:33:04Z) - Quiver neural networks [5.076419064097734]
We develop a uniform theoretical approach towards the analysis of various neural network connectivity architectures.
Inspired by quiver representation theory in mathematics, this approach gives a compact way to capture elaborate data flows.
arXiv Detail & Related papers (2022-07-26T09:42:45Z) - Learning on Arbitrary Graph Topologies via Predictive Coding [38.761663028090204]
We show how predictive coding can be used to perform inference and learning on arbitrary graph topologies.
We experimentally show how this formulation, called PC graphs, can be used to flexibly perform different tasks with the same network.
arXiv Detail & Related papers (2022-01-31T12:43:22Z) - Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics [0.5284812806199193]
We take inspiration from a popular framework in neuroscience: 'predictive coding'.
We show that implementing this strategy into two popular networks, VGG16 and EfficientNetB0, improves their robustness against various corruptions.
arXiv Detail & Related papers (2021-06-04T22:48:13Z) - Learning compositional functions via multiplicative weight updates [97.9457834009578]
We show that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.
We show that Madam can train state of the art neural network architectures without learning rate tuning.
arXiv Detail & Related papers (2020-06-25T17:05:19Z) - Neural Sparse Representation for Image Restoration [116.72107034624344]
Inspired by the robustness and efficiency of sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
Our method structurally enforces sparsity constraints upon hidden neurons.
Experiments show that sparse representation is crucial in deep neural networks for multiple image restoration tasks.
arXiv Detail & Related papers (2020-06-08T05:15:17Z) - Geometrically Principled Connections in Graph Neural Networks [66.51286736506658]
We argue geometry should remain the primary driving force behind innovation in the emerging field of geometric deep learning.
We relate graph neural networks to widely successful computer graphics and data approximation models: radial basis functions (RBFs).
We introduce affine skip connections, a novel building block formed by combining a fully connected layer with any graph convolution operator.
arXiv Detail & Related papers (2020-04-06T13:25:46Z)
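The affine skip connection in the last entry above is described as a fully connected layer combined with an arbitrary graph convolution operator. A minimal sketch of that combination, assuming a plain GCN-style aggregation with a row-normalized adjacency matrix (the paper's exact operator, normalization, and naming may differ), could look like this:

```python
# Hedged sketch: "affine skip connection" is read literally as a fully connected
# layer added to the output of a graph convolution; the paper's exact operator
# and normalization may differ.
import torch
import torch.nn as nn

class AffineSkipGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.conv_weight = nn.Linear(in_dim, out_dim, bias=False)  # GCN-style feature transform
        self.affine = nn.Linear(in_dim, out_dim)                   # fully connected (affine) skip

    def forward(self, x, adj):
        # x: [num_nodes, in_dim]; adj: [num_nodes, num_nodes], assumed row-normalized
        aggregated = adj @ self.conv_weight(x)   # neighborhood aggregation
        return aggregated + self.affine(x)       # add the affine skip connection

# Toy usage on a 5-node graph with a placeholder adjacency matrix.
x = torch.randn(5, 16)
adj = torch.eye(5)
out = AffineSkipGraphConv(16, 32)(x, adj)   # shape [5, 32]
```

Here the affine branch is simply added to the aggregated neighborhood features; whether the paper applies further normalization or nonlinearities is not specified in the summary above.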
This list is automatically generated from the titles and abstracts of the papers in this site.