NN2Poly: A polynomial representation for deep feed-forward artificial
neural networks
- URL: http://arxiv.org/abs/2112.11397v4
- Date: Mon, 25 Sep 2023 18:33:01 GMT
- Title: NN2Poly: A polynomial representation for deep feed-forward artificial
neural networks
- Authors: Pablo Morala (1 and 2), Jenny Alexandra Cifuentes (3), Rosa E. Lillo
(1 and 2), Iñaki Ucar (1 and 2) ((1) uc3m-Santander Big Data Institute,
Universidad Carlos III de Madrid. Spain., (2) Department of Statistics,
Universidad Carlos III de Madrid. Spain., (3) ICADE, Department of
Quantitative Methods, Faculty of Economics and Business Administration,
Universidad Pontificia Comillas. Spain.)
- Abstract summary: NN2Poly is a theoretical approach to obtain an explicit model of an already trained fully-connected feed-forward artificial neural network.
This approach extends a previous idea proposed in the literature, which was limited to single hidden layer networks.
- Score: 0.6502001911298337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpretability of neural networks and their underlying theoretical behavior
remain an open field of study even after the great success of their practical
applications, particularly with the emergence of deep learning. In this work,
NN2Poly is proposed: a theoretical approach to obtain an explicit polynomial
model that provides an accurate representation of an already trained
fully-connected feed-forward artificial neural network (a multilayer perceptron
or MLP). This approach extends a previous idea proposed in the literature,
which was limited to single hidden layer networks, to work with arbitrarily
deep MLPs in both regression and classification tasks. NN2Poly uses a Taylor
expansion on the activation function, at each layer, and then applies several
combinatorial properties to calculate the coefficients of the desired
polynomials. Discussion is presented on the main computational challenges of
this method, and the way to overcome them by imposing certain constraints
during the training phase. Finally, simulation experiments as well as
applications to real tabular data sets are presented to demonstrate the
effectiveness of the proposed method.
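To make the construction concrete, the following is a minimal sketch of the single-hidden-layer case that NN2Poly generalizes. It is not the authors' implementation: the function name, the choice of tanh as activation, the truncation order, and the use of symbolic expansion (instead of the paper's combinatorial coefficient formulas) are illustrative assumptions. The activation is replaced by its truncated Taylor expansion around zero, and the result is expanded into an explicit polynomial in the inputs.

    # Hedged sketch: polynomial surrogate of a single-hidden-layer MLP.
    # Not the NN2Poly implementation; tanh, order=3, and all names are assumptions.
    import numpy as np
    import sympy as sp

    def mlp_to_polynomial(W, b, v, c, order=3):
        """Polynomial surrogate of y(x) = c + sum_j v_j * tanh(W[j] @ x + b[j]).

        W: (hidden, inputs) weights, b: (hidden,) biases,
        v: (hidden,) output weights, c: output bias,
        order: truncation degree of the Taylor expansion of tanh at 0.
        """
        n_inputs = W.shape[1]
        x = sp.symbols(f"x0:{n_inputs}")
        z = sp.symbols("z")
        # Truncated Taylor series of the activation around 0.
        taylor = sp.series(sp.tanh(z), z, 0, order + 1).removeO()
        poly = sp.Float(c)
        for w_row, b_j, v_j in zip(W, b, v):
            pre_activation = sum(float(wi) * xi for wi, xi in zip(w_row, x)) + float(b_j)
            poly += float(v_j) * taylor.subs(z, pre_activation)
        return sp.Poly(sp.expand(poly), *x)

    # Tiny usage example with random weights standing in for a trained MLP.
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(4, 2)), rng.normal(size=4)
    v, c = rng.normal(size=4), 0.1
    print(mlp_to_polynomial(W, b, v, c).as_expr())  # explicit polynomial in x0, x1

For deeper MLPs, NN2Poly applies the same Taylor-expansion idea at each layer and relies on combinatorial properties, rather than symbolic expansion, to compute the polynomial coefficients.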
Related papers
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - A practical existence theorem for reduced order models based on convolutional autoencoders [0.4604003661048266]
Deep learning has gained increasing popularity in the fields of Partial Differential Equations (PDEs) and Reduced Order Modeling (ROM).
CNN-based autoencoders have proven extremely effective, outperforming established techniques, such as the reduced basis method, when dealing with complex nonlinear problems.
We provide a new practical existence theorem for CNN-based autoencoders when the parameter-to-solution map is holomorphic.
arXiv Detail & Related papers (2024-02-01T09:01:58Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used in predictive modeling.
In this paper, we examine the use of convex recovery models for neural network training.
We show that all stationary points of the nonconvex training objective can be characterized as the global optima of subsampled convex programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of Deep Matching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight Bipartite Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - [Experiments & Analysis] Evaluating the Feasibility of Sampling-Based Techniques for Training Multilayer Perceptrons [10.145355763143218]
Several sampling-based techniques have been proposed for speeding up the training time of deep neural networks.
These techniques fall under two categories: (i) sampling a subset of nodes in every hidden layer as active at every iteration and (ii) sampling a subset of nodes from the previous layer to approximate the current layer's activations.
In this paper, we evaluate the feasibility of these approaches on CPU machines with limited computational resources.
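As a hedged illustration of category (i) above (not the paper's code; the layer sizes, the rescaling choice, and all names are assumptions), a forward pass can activate only a sampled subset of hidden nodes at each iteration:

    # Minimal numpy sketch of node sampling: only `keep` of the hidden nodes
    # are active in this forward pass; the output is rescaled so that it
    # matches the full layer in expectation.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden, keep = 8, 32, 8

    W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
    w2 = rng.normal(scale=0.1, size=n_hidden)

    def forward_sampled(x):
        active = rng.choice(n_hidden, size=keep, replace=False)  # sampled active nodes
        h = np.maximum(W1[active] @ x, 0.0)                      # ReLU on sampled nodes only
        return (n_hidden / keep) * (w2[active] @ h), active

    y_hat, active = forward_sampled(rng.normal(size=n_in))
    print(y_hat, active)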
arXiv Detail & Related papers (2023-06-15T17:19:48Z) - When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z) - Limitations of neural network training due to numerical instability of
backpropagation [2.255961793913651]
We study the training of deep neural networks by gradient descent where floating-point arithmetic is used to compute gradients.
It is highly unlikely to find ReLU neural networks that maintain, in the course of training with gradient descent, superlinearly many affine pieces with respect to their number of layers.
We conclude that approximating sequences of ReLU neural networks resulting from gradient descent in practice differ substantially from theoretically constructed sequences.
arXiv Detail & Related papers (2022-10-03T10:34:38Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Towards a mathematical framework to inform Neural Network modelling via
Polynomial Regression [0.0]
It is shown that almost identical predictions can be obtained when certain conditions are met locally.
When learning from generated data, the proposed method produces polynomials that approximate the data correctly in local regions.
arXiv Detail & Related papers (2021-02-07T17:56:16Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.