Maxout Polytopes
- URL: http://arxiv.org/abs/2509.21286v1
- Date: Thu, 25 Sep 2025 15:06:10 GMT
- Title: Maxout Polytopes
- Authors: Andrei Balakin, Shelby Cox, Georg Loho, Bernd Sturmfels
- Abstract summary: Maxout polytopes are defined by feedforward neural networks with maxout activation function and non-negative weights after the first layer. We characterize the parameter spaces and extremal f-vectors of maxout polytopes for shallow networks, and we study the separating hypersurfaces which arise when a layer is added to the network.
- Score: 0.9857968274865206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Maxout polytopes are defined by feedforward neural networks with maxout activation function and non-negative weights after the first layer. We characterize the parameter spaces and extremal f-vectors of maxout polytopes for shallow networks, and we study the separating hypersurfaces which arise when a layer is added to the network. We also show that maxout polytopes are cubical for generic networks without bottlenecks.
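To make the setting concrete, here is a minimal sketch (not the paper's construction; all shapes and names are illustrative) of a feedforward maxout network whose weights after the first layer are non-negative. With non-negative later weights, the network computes a convex piecewise-linear function, which is the kind of function whose dual description yields a polytope:

```python
import numpy as np

rng = np.random.default_rng(0)

def maxout_layer(x, W, b):
    """Rank-k maxout layer: each output neuron is the max of k affine
    functions of the input. W has shape (out, k, in), b has shape (out, k)."""
    return np.max(np.einsum("okd,d->ok", W, x) + b, axis=1)

# Two-layer network: arbitrary weights in the first layer,
# non-negative weights afterwards (as in the abstract's setting).
d, h, k = 2, 3, 2
W1 = rng.normal(size=(h, k, d))
b1 = rng.normal(size=(h, k))
W2 = np.abs(rng.normal(size=(1, k, h)))  # non-negative second-layer weights
b2 = rng.normal(size=(1, k))

def network(x):
    return maxout_layer(maxout_layer(x, W1, b1), W2, b2)[0]

# Each first-layer output is convex (max of affine functions), and a
# non-negative combination of convex functions under a max stays convex,
# so the network output is convex and piecewise linear in x.
print(network(np.array([0.5, -1.0])))
```

The convexity is what the non-negativity constraint buys: without it, maxout networks compute differences of convex functions and no single polytope is attached to the output.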
Related papers
- On the expressivity of sparse maxout networks [0.6445605125467574]
We study the expressivity of sparse maxout networks, where each neuron takes a fixed number of inputs from the previous layer and employs a maxout activation. We establish a duality between functions computable by such networks and a class of virtual polytopes, linking their geometry to questions of network expressivity.
arXiv Detail & Related papers (2025-10-15T20:18:18Z)
- SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes [61.110517195874074]
We present a scheme to directly generate manifold, polygonal meshes of complex connectivity as the output of a neural network. Our key innovation is to define a continuous latent connectivity space at each mesh, which implies the discrete mesh. In applications, this approach not only yields high-quality outputs from generative models, but also enables directly learning challenging geometry processing tasks such as mesh repair.
arXiv Detail & Related papers (2024-09-30T17:59:03Z)
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z)
- Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z)
- Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes [3.0079490585515343]
We show that $\lceil \log_2(n) \rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers.
Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry.
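The upper-bound side of this depth statement has a well-known elementary construction, sketched here for illustration (this is folklore, not the paper's contribution): since $\max(a,b) = b + \mathrm{relu}(a-b)$ needs only one ReLU layer, a binary tree of pairwise maxima computes the maximum of $n$ numbers in $\lceil \log_2 n \rceil$ layers, and the paper shows this depth cannot be improved for integral networks:

```python
import math

def relu(x):
    return max(x, 0.0)

def max_pair(a, b):
    # max(a, b) is computable with a single ReLU: max(a, b) = b + relu(a - b)
    return b + relu(a - b)

def tree_max(values):
    """Compute the max of n numbers in ceil(log2 n) rounds of pairwise
    maxima -- one ReLU layer per round."""
    vals = list(values)
    rounds = 0
    while len(vals) > 1:
        vals = [max_pair(vals[i], vals[i + 1]) if i + 1 < len(vals) else vals[i]
                for i in range(0, len(vals), 2)]
        rounds += 1
    return vals[0], rounds

m, r = tree_max([3.0, -1.0, 7.5, 2.0, 0.0])
print(m, r)  # 7.5 after ceil(log2 5) = 3 rounds
```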
arXiv Detail & Related papers (2023-02-24T10:14:53Z)
- Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z)
- HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces [54.77683371400133]
We propose a novel normal estimation method called HSurf-Net, which can accurately predict normals from point clouds with noise and density variations.
Experimental results show that our HSurf-Net achieves the state-of-the-art performance on the synthetic shape dataset.
arXiv Detail & Related papers (2022-10-13T16:39:53Z)
- Enumeration of max-pooling responses with generalized permutohedra [39.58317527488534]
Max-pooling layers are functions that downsample input arrays by taking the maximum over shifted windows of input coordinates.
We characterize the faces of such polytopes and obtain generating functions and closed formulas for the number of vertices and facets in a 1D max-pooling layer depending on the size of the pooling windows and stride.
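The 1D max-pooling operation studied in that paper is simple to state in code; the following sketch (illustrative, not taken from the paper) shows the window/stride semantics that the vertex and facet counts depend on:

```python
import numpy as np

def max_pool_1d(x, window, stride):
    """1D max-pooling: slide a window of the given size over x with the
    given stride and take the maximum over each window."""
    n = (len(x) - window) // stride + 1
    return np.array([x[i * stride : i * stride + window].max() for i in range(n)])

print(max_pool_1d(np.array([1, 5, 2, 0, 3, 4]), window=2, stride=2))  # [5 2 4]
```

Each output coordinate is a max of linear (coordinate) functions, so the whole layer is a tropical polynomial map, which is how generalized permutohedra enter the picture.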
arXiv Detail & Related papers (2022-09-29T17:45:54Z)
- Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums [1.1602089225841632]
A rank-$k$ maxout unit is a function computing the maximum of $k$ linear functions.
We present results on the number of linear regions of the functions that can be represented by artificial feedforward neural networks with maxout units.
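A single rank-$k$ maxout unit already illustrates the region-counting question: on a one-dimensional input, the max of $k$ linear functions is convex and piecewise linear with at most $k$ linear regions. The sketch below (illustrative values, not from the paper) recovers the regions by tracking which linear function attains the max on a fine grid:

```python
import numpy as np

def maxout_unit(x, slopes, intercepts):
    """Rank-k maxout unit on the line: the max of k linear functions
    a_i * x + b_i, evaluated at each point of x."""
    return np.max(np.outer(slopes, x) + intercepts[:, None], axis=0)

# A rank-3 unit: max(-x, 0.5, x). The argmax pattern over a grid
# reveals the linear regions of the piecewise-linear convex function.
slopes = np.array([-1.0, 0.0, 1.0])
intercepts = np.array([0.0, 0.5, 0.0])
grid = np.linspace(-3, 3, 601)
active = np.argmax(np.outer(slopes, grid) + intercepts[:, None], axis=0)
num_regions = 1 + np.count_nonzero(np.diff(active))
print(num_regions)  # at most k = 3 regions on the line
```

For networks of many such units in higher dimensions, counting regions amounts to counting vertices of Minkowski sums of the units' Newton polytopes, which is the subject of the sharp bounds above.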
arXiv Detail & Related papers (2021-04-16T14:33:21Z)
- Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.