On the expressivity of sparse maxout networks
- URL: http://arxiv.org/abs/2510.14068v1
- Date: Wed, 15 Oct 2025 20:18:18 GMT
- Title: On the expressivity of sparse maxout networks
- Authors: Moritz Grillo, Tobias Hofmann,
- Abstract summary: We study the expressivity of sparse maxout networks, where each neuron takes a fixed number of inputs from the previous layer and employs a maxout activation.<n>We establish a duality between functions computable by such networks and a class of virtual polytopes, linking their geometry to questions of network expressivity.
- Score: 0.6445605125467574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the expressivity of sparse maxout networks, where each neuron takes a fixed number of inputs from the previous layer and employs a, possibly multi-argument, maxout activation. This setting captures key characteristics of convolutional or graph neural networks. We establish a duality between functions computable by such networks and a class of virtual polytopes, linking their geometry to questions of network expressivity. In particular, we derive a tight bound on the dimension of the associated polytopes, which serves as the central tool for our analysis. Building on this, we construct a sequence of depth hierarchies. While sufficiently deep sparse maxout networks are universal, we prove that if the required depth is not reached, width alone cannot compensate for the sparsity of a fixed indegree constraint.
Related papers
- Dense Neural Networks are not Universal Approximators [53.27010448621372]
We show that dense neural networks do not possess universality of arbitrary continuous functions.<n>We consider ReLU neural networks subject to natural constraints on weights and input and output dimensions.
arXiv Detail & Related papers (2026-02-07T16:52:38Z) - Maxout Polytopes [0.9857968274865206]
Maxout polytopes are defined by feedforward neural networks with maxout activation function and non-negative weights after the first layer.<n>We characterize the parameter spaces and extremal f-vectors of maxout polytopes for shallow networks, and we study the separating hypersurfaces which arise when a layer is added to the network.
arXiv Detail & Related papers (2025-09-25T15:06:10Z) - Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer.
In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z) - Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Regressions on quantum neural networks at maximal expressivity [0.0]
We analyze the expressivity of a universal deep neural network that can be organized as a series of nested qubit rotations.
The maximal expressive power increases with the depth of the network and the number of qubits, but is fundamentally bounded by the data encoding mechanism.
arXiv Detail & Related papers (2023-11-10T14:43:24Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z) - Recursive Multi-model Complementary Deep Fusion forRobust Salient Object
Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a wider'' network architecture which consists of parallel sub networks with totally different network architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z) - On Infinite-Width Hypernetworks [101.03630454105621]
We show that hypernetworks do not guarantee to a global minima under descent.
We identify the functional priors of these architectures by deriving their corresponding GP and NTK kernels.
As part of this study, we make a mathematical contribution by deriving tight bounds on high order Taylor terms of standard fully connected ReLU networks.
arXiv Detail & Related papers (2020-03-27T00:50:29Z) - Quasi-Equivalence of Width and Depth of Neural Networks [10.365556153676538]
We investigate if the design of artificial neural networks should have a directional preference.
Inspired by the De Morgan law, we establish a quasi-equivalence between the width and depth of ReLU networks.
Based on our findings, a deep network has a wide equivalent, subject to an arbitrarily small error.
arXiv Detail & Related papers (2020-02-06T21:17:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.