On the Geometry and Optimization of Polynomial Convolutional Networks
- URL: http://arxiv.org/abs/2410.00722v1
- Date: Tue, 1 Oct 2024 14:13:05 GMT
- Title: On the Geometry and Optimization of Polynomial Convolutional Networks
- Authors: Vahid Shahverdi, Giovanni Luca Marchetti, Kathlén Kohn
- Abstract summary: We study convolutional neural networks with monomial activation functions.
We compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model.
For a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.
- Score: 2.9816332334719773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. By leveraging tools from algebraic geometry, we explore the geometric properties of the image of this map in function space -- typically referred to as the neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. Moreover, for a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.
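To make the model class concrete, here is a minimal NumPy sketch (our illustration, not the authors' code) of a polynomial convolutional network: stacked 1D convolutions, each followed by the monomial activation t -> t**d, together with a check of the filter-rescaling symmetry behind "up to rescaling the filters". The filter sizes, depth, and degree d = 2 are illustrative assumptions.

```python
# A minimal sketch of a 1D convolutional network with monomial activations.
# Filter sizes, depth, and degree are illustrative assumptions, not the paper's.
import numpy as np

def conv1d(x, w):
    """Valid 1D cross-correlation of signal x with filter w (stride 1)."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

def monomial_cnn(x, filters, degree=2):
    """Apply each conv layer followed by the monomial activation t -> t**degree."""
    for w in filters:
        x = conv1d(x, w) ** degree
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
filters = [rng.standard_normal(3), rng.standard_normal(3)]

# The rescaling symmetry: scaling one filter by a and the next by a**(-degree)
# leaves the end-to-end function unchanged, so the parameterization map is an
# isomorphism only "up to rescaling the filters".
a = 2.0
scaled = [a * filters[0], a ** (-2) * filters[1]]
assert np.allclose(monomial_cnn(x, filters), monomial_cnn(x, scaled))
print(monomial_cnn(x, filters))
```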
Related papers
- Geometry of Lightning Self-Attention: Identifiability and Dimension [2.9816332334719773]
We study the identifiability of deep self-attention by describing the generic fibers of its parametrization for an arbitrary number of layers.
For a single-layer model, we characterize the singular and boundary points.
Finally, we formulate a conjectural extension of our results to normalized self-attention networks, prove it for a single layer, and numerically verify it in the deep case.
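Under our reading, "lightning" self-attention refers to softmax-free attention, which makes each layer polynomial in the inputs and the weights; the toy sketch below (all dimensions are arbitrary assumptions) shows one such layer.

```python
# A toy softmax-free ("lightning") self-attention layer -- our reading of the
# setting, with arbitrary token and head dimensions.
import numpy as np

def lightning_attention(X, W_Q, W_K, W_V):
    """X: (tokens, dim). Raw attention scores are used without softmax."""
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    return (Q @ K.T) @ V

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 8))
W_Q, W_K, W_V = (rng.standard_normal((8, 4)) for _ in range(3))
print(lightning_attention(X, W_Q, W_K, W_V).shape)  # (5, 4)
```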
arXiv Detail & Related papers (2024-08-30T12:00:36Z)
- Geometric Generative Models based on Morphological Equivariant PDEs and GANs [3.6498648388765513]
We propose a geometric generative model based on an equivariant partial differential equation (PDE) for group convolutional neural networks (G-CNNs).
The proposed geometric morphological GAN (GM-GAN) is obtained by using the proposed morphological equivariant convolutions in PDE-G-CNNs.
Preliminary results show that the GM-GAN model outperforms classical GANs.
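For intuition, the sketch below implements plain morphological (max-plus) dilation, the basic operation underlying morphological convolutions; the paper's equivariant PDE-G-CNN layers are considerably richer, and the flat structuring element is our illustrative choice.

```python
# Plain morphological (max-plus) dilation, the basic operation underlying
# morphological convolutions; the flat 3x3 structuring element is our choice,
# and the paper's equivariant PDE-G-CNN layers are considerably richer.
import numpy as np

def dilation2d(f, b):
    """Morphological dilation: (f (+) b)(x) = max_z f(x - z) + b(z)."""
    kh, kw = b.shape
    ph, pw = kh // 2, kw // 2
    fp = np.pad(f, ((ph, ph), (pw, pw)), constant_values=-np.inf)
    out = np.empty_like(f, dtype=float)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            out[i, j] = np.max(fp[i:i + kh, j:j + kw] + b[::-1, ::-1])
    return out

img = np.random.default_rng(2).random((6, 6))
print(dilation2d(img, np.zeros((3, 3))))  # flat structuring element
```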
arXiv Detail & Related papers (2024-03-22T01:02:09Z)
- Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
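As a toy illustration of the setting (our construction, not the paper's experiments), the sketch below classifies from three examples per class with a nearest-class-mean rule after a random nonlinear feature map, where the geometry of the feature space governs separability.

```python
# A toy few-shot experiment (our construction, not the paper's): a random
# nonlinear feature map followed by a nearest-class-mean rule, trained on
# three points per class.
import numpy as np

rng = np.random.default_rng(3)
W, b = rng.standard_normal((256, 2)), rng.standard_normal(256)
phi = lambda x: np.tanh(x @ W.T + b)  # random nonlinear feature map

train = {0: rng.normal(-1, 0.3, (3, 2)), 1: rng.normal(+1, 0.3, (3, 2))}
means = {c: phi(x).mean(axis=0) for c, x in train.items()}  # class means in feature space

def predict(x):
    feats = phi(np.asarray(x))
    return min(means, key=lambda c: np.linalg.norm(feats - means[c]))

print(predict([-0.8, -1.1]), predict([1.2, 0.9]))  # expected: 0 1
```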
arXiv Detail & Related papers (2022-11-07T14:52:58Z)
- Neural Eigenfunctions Are Structured Representation Learners [93.53445940137618]
This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap.
We show that, when the kernel is derived from positive relations in a data augmentation setup, applying NeuralEF results in an objective function resembling those of popular self-supervised learning methods.
We demonstrate using such representations as adaptive-length codes in image retrieval systems.
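The sketch below illustrates the adaptive-length-code idea with an exact kernel eigendecomposition standing in for the trained NeuralEF networks; the RBF kernel and code lengths are our assumptions.

```python
# The adaptive-length-code idea in miniature: eigenfunction coordinates are
# ordered by eigenvalue, so any prefix of a code is itself a valid (coarser)
# code. An exact kernel eigendecomposition stands in for the trained NeuralEF
# networks; the RBF kernel and code lengths are our assumptions.
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 16))  # stand-ins for image embeddings
K = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))  # RBF kernel
evals, evecs = np.linalg.eigh(K)
order = np.argsort(evals)[::-1]  # most informative coordinates first
codes = evecs[:, order] * np.sqrt(np.maximum(evals[order], 0.0))

def retrieve(query, code_len=8, k=5):
    """Nearest neighbors using only the first code_len code dimensions."""
    c = codes[:, :code_len]
    return np.argsort(np.linalg.norm(c - c[query], axis=1))[1:k + 1]

print(retrieve(0, code_len=4))   # cheap, coarse ranking
print(retrieve(0, code_len=32))  # longer code, finer ranking
```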
arXiv Detail & Related papers (2022-10-23T07:17:55Z)
- The Manifold Scattering Transform for High-Dimensional Point Cloud Data [16.500568323161563]
We present practical schemes for implementing the manifold scattering transform to datasets arising in naturalistic systems.
We show that our methods are effective for signal classification and manifold classification tasks.
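The sketch below runs a simplified scattering-style cascade on a graph Laplacian; the spectral wavelet filter, scales, and aggregation are simplifying assumptions rather than the paper's exact construction.

```python
# A simplified scattering cascade on a graph Laplacian: spectral wavelet
# filters, a modulus nonlinearity, and mean aggregation. Filter shape, scales,
# and the random graph are simplifying assumptions, not the paper's scheme.
import numpy as np

def scattering_coeffs(L, f, scales=(1.0, 2.0, 4.0), depth=2):
    evals, evecs = np.linalg.eigh(L)
    def wavelet(t, x):
        g = t * evals * np.exp(-t * evals)  # band-pass profile g(t * lambda)
        return evecs @ (g * (evecs.T @ x))
    layers, coeffs = [f], []
    for _ in range(depth):
        nxt = []
        for x in layers:
            for t in scales:
                u = np.abs(wavelet(t, x))  # modulus nonlinearity
                coeffs.append(u.mean())    # permutation-invariant aggregation
                nxt.append(u)
        layers = nxt
    return np.array(coeffs)

rng = np.random.default_rng(5)
A = (rng.random((20, 20)) < 0.2).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)
L = np.diag(A.sum(axis=1)) - A  # combinatorial graph Laplacian
print(scattering_coeffs(L, rng.standard_normal(20)))
```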
arXiv Detail & Related papers (2022-06-21T02:15:00Z)
- A singular Riemannian geometry approach to Deep Neural Networks I. Theoretical foundations [77.86290991564829]
Deep Neural Networks are widely used for solving complex problems in several scientific areas, such as speech recognition, machine translation, and image analysis.
We study a particular sequence of maps between manifolds, with the last manifold of the sequence equipped with a Riemannian metric.
We investigate the theoretical properties of the maps in such a sequence, eventually focusing on the case of maps implementing neural networks of practical interest.
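As we read it, the central construction is the pullback of the final metric along the network maps, which degenerates wherever the differential fails to be injective; in symbols (our paraphrase, not a formula quoted from the paper):

```latex
% Pullback of the Riemannian metric h along a smooth map f : M -> N
% (our paraphrase of the construction, not a formula quoted from the paper).
\[
  (f^{*}h)_p(u, v) \;=\; h_{f(p)}\bigl(\mathrm{d}f_p(u),\, \mathrm{d}f_p(v)\bigr),
  \qquad u, v \in T_p M .
\]
% (f^* h) is a genuine metric only where df_p is injective; elsewhere it is
% degenerate, which is the "singular" Riemannian geometry of the title.
```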
arXiv Detail & Related papers (2021-12-17T11:43:30Z)
- Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
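For concreteness, the sketch below builds the Hodge 1-Laplacian of a hollow triangle from its boundary matrices; its kernel dimension recovers the first Betti number, the kind of $k$-homological feature such models exploit. The specific complex is our toy example.

```python
# The Hodge 1-Laplacian of a hollow triangle, built from boundary matrices;
# its kernel dimension equals the first Betti number, the kind of
# k-homological feature such models exploit. The complex is our toy example.
import numpy as np

# B1: edges (0,1), (1,2), (0,2) -> vertices; B2 is empty (the triangle is unfilled).
B1 = np.array([[-1,  0, -1],
               [ 1, -1,  0],
               [ 0,  1,  1]], dtype=float)
B2 = np.zeros((3, 0))

L1 = B1.T @ B1 + B2 @ B2.T  # Hodge 1-Laplacian
betti_1 = int(np.sum(np.linalg.eigvalsh(L1) < 1e-9))
print(betti_1)  # 1: the hollow triangle has exactly one 1-dimensional hole
```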
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
- Geometry of Linear Convolutional Networks [7.990816079551592]
We study the family of functions represented by a linear convolutional neural network (LCN).
We study the optimization of an objective function over an LCN, analyzing critical points in function space and in parameter space.
Overall, our theory predicts that the optimized parameters of an LCN will often correspond to repeated filters across layers.
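The algebraic fact driving these results is that composing stride-1 convolutional layers multiplies the filters as polynomials, so the end-to-end filter factors into the layer filters; a one-line NumPy check (our illustration, with example filters of our choice):

```python
# Composing stride-1 convolutional layers multiplies the filters as
# polynomials, so the end-to-end filter of an LCN factors into the layer
# filters (the example filters are our choice).
import numpy as np

w1 = np.array([1.0, 2.0])         # polynomial 1 + 2z
w2 = np.array([3.0, -1.0, 4.0])   # polynomial 3 - z + 4z^2
end_to_end = np.convolve(w1, w2)  # coefficients of the product polynomial
print(end_to_end)                 # [3. 5. 2. 8.]
```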
arXiv Detail & Related papers (2021-08-03T14:42:18Z)
- Statistical Mechanics of Neural Processing of Object Manifolds [3.4809730725241605]
This thesis lays the groundwork for a computational theory of neuronal processing of objects.
We identify that the capacity of a manifold is determined by its effective radius, R_M, and effective dimension, D_M.
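As a loose numerical analogue (not the thesis's definitions, which go through anchor points on the convex hull), the participation ratio of the covariance spectrum is a common proxy for an effective dimension:

```python
# A loose numerical analogue only: the participation ratio of the covariance
# spectrum is a common proxy for an effective dimension. The thesis defines
# R_M and D_M differently (via anchor points), so treat this as illustration.
import numpy as np

def participation_ratio(points):
    """(sum lam_i)^2 / sum(lam_i^2) over covariance eigenvalues lam_i."""
    lam = np.linalg.eigvalsh(np.cov(points.T))
    return lam.sum() ** 2 / np.sum(lam ** 2)

rng = np.random.default_rng(6)
cloud = rng.standard_normal((500, 10)) * np.array([1.0] * 2 + [0.05] * 8)
print(participation_ratio(cloud))  # close to 2: variance lives in 2 directions
```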
arXiv Detail & Related papers (2021-06-01T20:49:14Z)
- Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs [81.12344211998635]
A common approach to defining convolutions on meshes is to interpret them as graphs and apply graph convolutional networks (GCNs).
We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels.
Our experiments validate the significantly improved expressivity of the proposed model over conventional GCNs and other methods.
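The toy sketch below contrasts an isotropic GCN update with a direction-aware update in a local frame; it is our much-simplified illustration, omitting the parallel transport of features that genuine Gauge Equivariant Mesh CNNs perform (trivial here because the features are scalars).

```python
# A much-simplified contrast (our illustration) between an isotropic GCN
# update, which weighs all neighbors identically, and an anisotropic update
# that weighs neighbors by their direction in a local frame. Genuine Gauge
# Equivariant Mesh CNNs also parallel-transport features between frames;
# for the scalar features used here, that transport is trivial.
import numpy as np

def isotropic_update(feats, center, nbrs, w_self, w_nbr):
    return w_self * feats[center] + w_nbr * np.mean([feats[n] for n in nbrs])

def anisotropic_update(pos, feats, center, nbrs, w_self, w_dir):
    out = w_self * feats[center]
    for n in nbrs:
        dx, dy = pos[n] - pos[center]
        angle = np.arctan2(dy, dx) % (2 * np.pi)   # direction to neighbor
        b = int(angle / (2 * np.pi) * len(w_dir))  # angular kernel bin
        out += w_dir[b] * feats[n]
    return out

pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
feats = np.array([1.0, 2.0, 3.0, 4.0])
print(isotropic_update(feats, 0, [1, 2, 3], 0.5, 0.5))
print(anisotropic_update(pos, feats, 0, [1, 2, 3], 0.5, np.array([0.4, 0.3, 0.2, 0.1])))
```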
arXiv Detail & Related papers (2020-03-11T17:21:15Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
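A hedged cvxpy sketch of a convex program of this kind is given below (in the style of this line of work; the paper's exact formulation may differ). The data sizes, regularization weight, and sampled ReLU activation patterns are our assumptions.

```python
# A hedged cvxpy sketch of a convex program of this kind (in the style of this
# line of work; the paper's exact formulation may differ). Data sizes, the
# regularization weight, and the sampled ReLU activation patterns are ours.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(7)
n, d = 8, 2
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
beta = 0.1

# Each diagonal 0/1 matrix D encodes one ReLU activation pattern of the data,
# here sampled from random hyperplanes rather than fully enumerated.
gates = [np.diag((X @ g >= 0).astype(float)) for g in rng.standard_normal((10, d))]

V = [cp.Variable(d) for _ in gates]  # "positive" neurons, one per pattern
W = [cp.Variable(d) for _ in gates]  # "negative" neurons, one per pattern
fit = sum(D @ X @ (v - w) for D, v, w in zip(gates, V, W)) - y
reg = sum(cp.norm(v, 2) + cp.norm(w, 2) for v, w in zip(V, W))
cons = [c for D, v, w in zip(gates, V, W)
        for c in ((2 * D - np.eye(n)) @ X @ v >= 0,
                  (2 * D - np.eye(n)) @ X @ w >= 0)]
prob = cp.Problem(cp.Minimize(cp.sum_squares(fit) + beta * reg), cons)
prob.solve()
print(prob.value)
```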
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.