Understanding Convolutional Neural Networks from Theoretical Perspective
via Volterra Convolution
- URL: http://arxiv.org/abs/2110.09902v1
- Date: Tue, 19 Oct 2021 12:07:46 GMT
- Title: Understanding Convolutional Neural Networks from Theoretical Perspective
via Volterra Convolution
- Authors: Tenghui Li and Guoxu Zhou and Yuning Qiu and Qibin Zhao
- Abstract summary: This study explores the relationship between convolutional neural networks and finite Volterra convolutions.
It provides a novel approach to explain and study the overall characteristics of neural networks without being disturbed by the complex network architectures.
- Score: 22.058311878382142
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study proposes a general and unified perspective of convolutional neural
networks by exploring the relationship between (deep) convolutional neural
networks and finite Volterra convolutions. It provides a novel approach to
explain and study the overall characteristics of neural networks without being
disturbed by the complex network architectures. Concretely, we examine the
basic structures of finite-term Volterra convolutions and convolutional neural
networks. Our results show that a convolutional neural network is an
approximation of a finite-term Volterra convolution whose order increases
exponentially with the number of layers and whose kernel size increases
exponentially with the strides. With this perspective, specialized perturbations
are obtained directly from the approximated kernels rather than from iteratively
generated adversarial examples. Extensive experiments on synthetic and real-world data
sets show the correctness and effectiveness of our results.
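The link between a convolutional layer and a Volterra convolution can be illustrated with a small numerical sketch. It is not the paper's construction: it assumes a 1D signal, a single convolutional layer, and a hypothetical polynomial activation `a0 + a1*z + a2*z**2`, in which case the layer coincides exactly with an order-2 Volterra convolution. The function names `volterra2` and `conv_poly_layer` are illustrative only.

```python
import numpy as np

def volterra2(x, h0, h1, h2):
    """Order-2 Volterra convolution of a 1D signal, 'valid' region only."""
    k = len(h1)
    n = len(x) - k + 1
    y = np.empty(n)
    for t in range(n):
        # Reverse the window so window[i] = x[t + k - 1 - i],
        # matching the index convention of np.convolve.
        window = x[t:t + k][::-1]
        y[t] = h0 + h1 @ window + window @ h2 @ window
    return y

def conv_poly_layer(x, w, a0, a1, a2):
    """One conv layer followed by an assumed quadratic activation."""
    z = np.convolve(x, w, mode="valid")
    return a0 + a1 * z + a2 * z**2

rng = np.random.default_rng(0)
x = rng.standard_normal(32)   # synthetic input signal
w = rng.standard_normal(5)    # convolution kernel
a0, a1, a2 = 0.1, 0.5, 0.25   # assumed activation coefficients

layer_out = conv_poly_layer(x, w, a0, a1, a2)
# Equivalent Volterra kernels: h0 = a0, h1 = a1 * w, h2 = a2 * (w outer w)
volterra_out = volterra2(x, a0, a1 * w, a2 * np.outer(w, w))
max_abs_diff = float(np.max(np.abs(layer_out - volterra_out)))
print(max_abs_diff)
```

Under the same assumption, stacking L such layers composes L quadratic polynomials, so the polynomial degree (and hence the Volterra order) can reach 2^L, which is one way to read the exponential growth of order with depth stated in the abstract.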
Related papers
- Collective variables of neural networks: empirical time evolution and scaling laws [0.535514140374842]
We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network.
Results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies.
(2024-10-09T21:37:14Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
(2023-07-20T13:34:11Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
(2023-02-01T03:18:07Z)
- Graph Convolutional Networks from the Perspective of Sheaves and the Neural Tangent Kernel [0.0]
Graph convolutional networks are a popular class of deep neural network algorithms.
Despite their success, graph convolutional networks exhibit a number of peculiar features, including a bias towards learning oversmoothed and homophilic functions.
We propose to bridge this gap by studying the neural tangent kernel of sheaf convolutional networks.
(2022-08-19T12:46:49Z)
- Optimal Learning Rates of Deep Convolutional Neural Networks: Additive Ridge Functions [19.762318115851617]
We consider the mean squared error analysis for deep convolutional neural networks.
We show that, for additive ridge functions, convolutional neural networks followed by one fully connected layer with ReLU activation functions can reach optimal mini-max rates.
(2022-02-24T14:22:32Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
(2022-02-01T17:11:13Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and is adaptable to larger search spaces and different tasks.
(2020-08-19T04:53:31Z)
- Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the infinite-width limit of the network to show global convergence.
(2020-07-11T18:19:50Z)
- Expressivity of Deep Neural Networks [2.7909470193274593]
In this review paper, we give a comprehensive overview of the large variety of approximation results for neural networks.
While the main body of existing results is for general feedforward architectures, we also depict approximation results for convolutional, residual and recurrent neural networks.
(2020-07-09T13:08:01Z)
- Topological Insights into Sparse Neural Networks [16.515620374178535]
We introduce an approach to understand and compare sparse neural network topologies from the perspective of graph theory.
We first propose Neural Network Sparse Topology Distance (NNSTD) to measure the distance between different sparse neural networks.
We show that adaptive sparse connectivity can always unveil a plenitude of sparse sub-networks with very different topologies which outperform the dense model.
(2020-06-24T22:27:21Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
(2020-01-14T22:26:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.