Identifiability of Deep Polynomial Neural Networks
- URL: http://arxiv.org/abs/2506.17093v1
- Date: Fri, 20 Jun 2025 15:58:46 GMT
- Title: Identifiability of Deep Polynomial Neural Networks
- Authors: Konstantin Usevich, Clara Dérand, Ricardo Borsoi, Marianne Clausel,
- Abstract summary: Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure.<n>Their identifiability -- a key property for ensuring interpretability -- remains poorly understood.<n>We present a comprehensive analysis of the identifiability of deep PNNs.
- Score: 2.747263800047993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, and Kruskal-type uniqueness theorems. This yields both generic conditions determined by the architecture, and effective conditions that depend on the network's parameters. We also settle an open conjecture on the expected dimension of PNN's neurovarieties, and provide new bounds on the activation degrees required for it to reach its maximum.
Related papers
- Information-Theoretic Generalization Bounds for Deep Neural Networks [20.015357820733406]
Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications.<n>This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds.
arXiv Detail & Related papers (2024-04-04T03:20:35Z) - Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Model-Agnostic Reachability Analysis on Deep Neural Networks [25.54542656637704]
We develop a model-agnostic verification framework, called DeepAgn.
It can be applied to FNNs, Recurrent Neural Networks (RNNs), or a mixture of both.
It does not require access to the network's internal structures, such as layers and parameters.
arXiv Detail & Related papers (2023-04-03T09:01:59Z) - On the Intrinsic Structures of Spiking Neural Networks [66.57589494713515]
Recent years have emerged a surge of interest in SNNs owing to their remarkable potential to handle time-dependent and event-driven data.
There has been a dearth of comprehensive studies examining the impact of intrinsic structures within spiking computations.
This work delves deep into the intrinsic structures of SNNs, by elucidating their influence on the expressivity of SNNs.
arXiv Detail & Related papers (2022-06-21T09:42:30Z) - Deep Architecture Connectivity Matters for Its Convergence: A
Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z) - Deep Networks Provably Classify Data on Curves [12.309532551321334]
We study a model problem that uses a deep fully-connected neural network to classify data drawn from two disjoint smooth curves on the unit sphere.
We prove that when (i) the network depth is large to certain properties that set the difficulty of the problem and (ii) the network width and number of samples is intrinsic in the relative depth, randomly-d gradient descent quickly learns to correctly classify all points on the two curves with high probability.
arXiv Detail & Related papers (2021-07-29T20:40:04Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.