Related papers: Machines of finite depth: towards a formalization of neural networks

URL: http://arxiv.org/abs/2204.12786v1
Date: Wed, 27 Apr 2022 09:17:15 GMT
Title: Machines of finite depth: towards a formalization of neural networks
Authors: Pietro Vertechi and Mattia G. Bergomi
Abstract summary: We provide a unifying framework where artificial neural networks and their architectures can be formally described as particular cases of a general mathematical construction: machines of finite depth. We prove this statement theoretically and practically, via a unified implementation that generalizes several classical architectures (dense, convolutional, and recurrent neural networks with a rich shortcut structure) and their respective backpropagation rules.
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We provide a unifying framework where artificial neural networks and their architectures can be formally described as particular cases of a general mathematical construction: machines of finite depth. Unlike neural networks, machines have a precise definition, from which several properties follow naturally. Machines of finite depth are modular (they can be combined), efficiently computable, and differentiable. The backward pass of a machine is again a machine and can be computed without overhead using the same procedure as the forward pass. We prove this statement theoretically and practically, via a unified implementation that generalizes several classical architectures (dense, convolutional, and recurrent neural networks with a rich shortcut structure) and their respective backpropagation rules.

Related papers

Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet). ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
Gaussian Process Surrogate Models for Neural Networks [6.8304779077042515]
In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque. We construct a class of surrogate models for neural networks using Gaussian processes. We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems.
arXiv Detail & Related papers (2022-08-11T20:17:02Z)
Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design. Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a neural network's feature space may jointly serve as discriminants of the network's performance. Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
Modeling Structure with Undirected Neural Networks [20.506232306308977]
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order. We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z)
Unified Field Theory for Deep and Recurrent Neural Networks [56.735884560668985]
We present a unified and systematic derivation of the mean-field theory for both recurrent and deep networks. We find that convergence towards the mean-field theory is typically slower for recurrent networks than for deep networks. Our method exposes that Gaussian processes are but the lowest order of a systematic expansion in $1/n$.
arXiv Detail & Related papers (2021-12-10T15:06:11Z)
The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can, with high probability, make two classes of data linearly separable. We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
Reframing Neural Networks: Deep Structure in Overcomplete Representations [41.84502123663809]
We introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames. We quantify structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design.
arXiv Detail & Related papers (2021-03-10T01:15:14Z)
Reservoir Memory Machines as Neural Computers [70.5993855765376]
Differentiable neural computers extend artificial neural networks with an explicit memory without interference. We achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently.
arXiv Detail & Related papers (2020-09-14T12:01:30Z)
Parametric machines: a fresh approach to architecture search [0.0]
We show how simple machines can be combined into more complex ones. We explore finite- and infinite-depth machines, which generalize neural networks and neural ordinary differential equations.
arXiv Detail & Related papers (2020-07-06T14:27:06Z)
On the computational power and complexity of Spiking Neural Networks [0.0]
We introduce spiking neural networks as a machine model where, in contrast to the familiar Turing machine, information and the manipulation thereof are co-located in the machine. We introduce canonical problems, define hierarchies of complexity classes and provide some first completeness results.
arXiv Detail & Related papers (2020-01-23T10:40:16Z)
