Non-Euclidean Universal Approximation
- URL: http://arxiv.org/abs/2006.02341v3
- Date: Sat, 7 Nov 2020 15:40:00 GMT
- Title: Non-Euclidean Universal Approximation
- Authors: Anastasis Kratsios, Eugene Bilokopytov
- Abstract summary: Modifications to a neural network's input and output layers are often required to accommodate the specificities of most practical learning tasks.
We present general conditions describing feature and readout maps that preserve an architecture's ability to approximate any continuous function uniformly on compact sets.
- Score: 4.18804572788063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modifications to a neural network's input and output layers are often
required to accommodate the specificities of most practical learning tasks.
However, the impact of such changes on an architecture's approximation
capabilities is not well understood. We present general conditions describing
feature and readout maps that preserve an architecture's ability to approximate
any continuous function uniformly on compact sets. As an application,
we show that if an architecture is capable of universal approximation, then
modifying its final layer to produce binary values creates a new architecture
capable of deterministically approximating any classifier. In particular, we
obtain guarantees for deep CNNs and deep feed-forward networks. Our results
also have consequences within the scope of geometric deep learning.
Specifically, when the input and output spaces are Cartan-Hadamard manifolds,
we obtain geometrically meaningful feature and readout maps satisfying our
criteria. Consequently, commonly used non-Euclidean regression models between
spaces of symmetric positive definite matrices are extended to universal DNNs.
The same result allows us to show that hyperbolic feed-forward networks,
used for hierarchical learning, are universal. Our result is also used to show
that the common practice of randomizing all but the last two layers of a DNN
produces a universal family of functions with probability one. We also provide
conditions on a DNN's first (resp. last) few layers' connections and activation
function which guarantee that these layers can have a width equal to the input
(resp. output) space's dimension without negatively affecting the
architecture's approximation capabilities.
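
As a purely illustrative picture of the construction above, the sketch below composes a plain Euclidean ReLU network with a feature map and a readout map for the space of symmetric positive definite (SPD) matrices. Using the matrix logarithm and exponential as the feature/readout pair is the standard Log-Euclidean choice for this Cartan-Hadamard geometry and matches the spirit of the abstract, but the specific maps, widths, and the thresholded classifier below are assumptions made for this sketch, not the paper's exact construction; the core network is untrained.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of a symmetric positive definite (SPD) matrix;
    maps SPD(n) onto the vector space of symmetric matrices."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def spd_exp(A):
    """Matrix exponential of a symmetric matrix; inverse of spd_log."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def sym_to_vec(A):
    """Flatten the upper triangle of a symmetric matrix into a vector."""
    return A[np.triu_indices(A.shape[0])]

def vec_to_sym(v, n):
    """Rebuild a symmetric matrix from its upper-triangular entries."""
    A = np.zeros((n, n))
    A[np.triu_indices(n)] = v
    return A + A.T - np.diag(np.diag(A))

class ReluMLP:
    """Plain fully connected ReLU network on Euclidean space: the 'core'
    architecture whose universality the paper's conditions preserve."""
    def __init__(self, widths, rng):
        self.params = [(rng.standard_normal((m, k)) / np.sqrt(m), np.zeros(k))
                       for m, k in zip(widths[:-1], widths[1:])]

    def __call__(self, x):
        *hidden, last = self.params
        for W, b in hidden:
            x = np.maximum(x @ W + b, 0.0)
        W, b = last
        return x @ W + b

n = 3                                  # work with 3x3 SPD matrices
d = n * (n + 1) // 2                   # dimension of the symmetric matrices
rng = np.random.default_rng(0)
core = ReluMLP([d, 64, 64, d], rng)    # Euclidean approximator (untrained here)

def spd_regression_net(S):
    """Feature map -> Euclidean core -> readout map: a network from
    SPD(n) to SPD(n), in the spirit of the paper's composition."""
    v = sym_to_vec(spd_log(S))         # feature map: SPD(n) -> R^d
    u = core(v)                        # Euclidean approximation step
    return spd_exp(vec_to_sym(u, n))   # readout map: R^d -> SPD(n)

def binary_classifier_net(S):
    """Same core, but the readout thresholds one output coordinate to {0, 1},
    illustrating the 'binary readout preserves universality' claim."""
    return int(core(sym_to_vec(spd_log(S)))[0] > 0.0)

# Quick smoke test on a random SPD input.
A = rng.standard_normal((n, n))
S = A @ A.T + n * np.eye(n)            # guaranteed SPD
print(spd_regression_net(S))           # an SPD matrix
print(binary_classifier_net(S))        # 0 or 1
```

Note that the first and last layers of the Euclidean core have width d, the dimension of the flattened symmetric matrices, echoing the abstract's final claim that the first (resp. last) layers can match the input (resp. output) dimension without hurting approximation power.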
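
The claim that randomizing all but the last two layers still yields a universal family with probability one can be pictured with a toy experiment: freeze a stack of ReLU layers with i.i.d. Gaussian weights and uniformly distributed biases, then train only the final two layers. The sketch below is a minimal illustration under assumptions of its own (the depth, widths, initialization scales, learning rate, and plain gradient loop are choices made here, not taken from the paper).

```python
import numpy as np

rng = np.random.default_rng(1)

# A 1-D regression task the trainable tail has to solve.
X = np.linspace(0.0, 1.0, 256).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X)

# ---- frozen body: every layer except the last two is random and never updated ----
depth, width = 3, 128
frozen, fan_in = [], 1
for _ in range(depth):
    W = rng.standard_normal((fan_in, width)) / np.sqrt(fan_in)  # Gaussian weights
    b = rng.uniform(-1.0, 1.0, width)                           # uniform biases
    frozen.append((W, b))
    fan_in = width

def frozen_features(x):
    for W, b in frozen:
        x = np.maximum(x @ W + b, 0.0)
    return x

Phi = frozen_features(X)               # fixed features produced by the random body

# ---- trainable tail: the last two layers (one ReLU layer + linear output) ----
H = 64
W1 = 0.05 * rng.standard_normal((width, H)); b1 = np.zeros(H)
w2 = 0.05 * rng.standard_normal((H, 1));     b2 = np.zeros(1)

lr = 0.05
for step in range(3000):
    z = Phi @ W1 + b1                  # (N, H)
    h = np.maximum(z, 0.0)
    pred = h @ w2 + b2                 # (N, 1)
    err = pred - y
    if step == 0:
        print("initial MSE:", float(np.mean(err ** 2)))
    # Backpropagate the mean-squared error through the last two layers only.
    g_pred = 2.0 * err / len(X)
    g_w2, g_b2 = h.T @ g_pred, g_pred.sum(0)
    g_z = (g_pred @ w2.T) * (z > 0.0)
    g_W1, g_b1 = Phi.T @ g_z, g_z.sum(0)
    W1 -= lr * g_W1; b1 -= lr * g_b1
    w2 -= lr * g_w2; b2 -= lr * g_b2

final = np.maximum(Phi @ W1 + b1, 0.0) @ w2 + b2
print("final MSE:", float(np.mean((final - y) ** 2)))
```

Run as-is, the reported mean-squared error should fall well below its initial value, suggesting that the frozen random body already supplies features rich enough for the small trainable head to exploit.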
Related papers
- Information-Theoretic Generalization Bounds for Deep Neural Networks [22.87479366196215]
Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications.
This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds.
arXiv Detail & Related papers (2024-04-04T03:20:35Z) - On Characterizing the Evolution of Embedding Space of Neural Networks using Algebraic Topology [9.537910170141467]
We study how the topology of feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN) through Betti numbers.
We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, resulting in Betti numbers attaining their lowest possible value.
arXiv Detail & Related papers (2023-11-08T10:45:12Z) - Deterministic equivalent and error universality of deep random features learning [4.8461049669050915]
Learning with deep random features can be seen as a natural generalization of the widely studied random features model to deeper architectures.
First, we prove universality of the test error in a ridge regression setting where the learner and target networks share the same intermediate layers, and provide a sharp formula for it.
Second, we conjecture the universality of the test error in the more general setting of arbitrary convex losses and generic learner/target architectures.
arXiv Detail & Related papers (2023-02-01T12:37:10Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z) - SPINE: Soft Piecewise Interpretable Neural Equations [0.0]
Fully connected networks are ubiquitous but uninterpretable.
This paper takes a novel approach to piecewise fits by using set operations on individual pieces (parts).
It can find a variety of applications where fully connected layers must be replaced by interpretable layers.
arXiv Detail & Related papers (2021-11-20T16:18:00Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z) - A Functional Perspective on Learning Symmetric Functions with Neural Networks [48.80300074254758]
We study the learning and representation of neural networks defined on measures.
We establish approximation and generalization bounds under different choices of regularization.
The resulting models can be learned efficiently and enjoy generalization guarantees that extend across input sizes.
arXiv Detail & Related papers (2020-08-16T16:34:33Z) - When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)