On the VC dimension of deep group convolutional neural networks
- URL: http://arxiv.org/abs/2410.15800v1
- Date: Mon, 21 Oct 2024 09:16:06 GMT
- Title: On the VC dimension of deep group convolutional neural networks
- Authors: Anna Sepliarskaia, Sophie Langer, Johannes Schmidt-Hieber
- Abstract summary: We study the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with ReLU activation function.
We analyze how factors such as the number of layers, weights, and input dimension affect the Vapnik-Chervonenkis (VC) dimension.
- Score: 7.237068561453083
- License:
- Abstract: We study the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with ReLU activation function by deriving upper and lower bounds for their Vapnik-Chervonenkis (VC) dimension. Specifically, we analyze how factors such as the number of layers, weights, and input dimension affect the VC dimension. We further compare the derived bounds to those known for other types of neural networks. Our findings extend previous results on the VC dimension of continuous GCNNs with two layers, thereby providing new insights into the generalization properties of GCNNs, particularly regarding the dependence on the input resolution of the data.
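For readers who want to see the object under study, below is a minimal sketch of one group convolutional layer, here a C4 (90-degree rotation) lifting convolution with ReLU, written in PyTorch. It is an illustrative assumption about the layer type, not code from the paper; the group C4 and all names and shapes are placeholders.

```python
# Minimal sketch of a rotation-equivariant (C4) lifting convolution with ReLU.
# Illustrative only: the group C4 and all names below are assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def c4_lifting_conv(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """x: (N, C_in, H, W); weight: (C_out, C_in, k, k).
    Returns (N, C_out, 4, H', W'): one response map per 90-degree rotation,
    so the output carries an explicit group axis on top of the spatial axes."""
    responses = []
    for r in range(4):  # correlate with the filter rotated by r * 90 degrees
        w_r = torch.rot90(weight, r, dims=(-2, -1))
        responses.append(F.relu(F.conv2d(x, w_r)))
    return torch.stack(responses, dim=2)

# Toy usage: rotating the input by 90 degrees rotates the spatial maps and
# cyclically shifts the group axis, which is the equivariance GCNNs encode.
x = torch.randn(1, 1, 8, 8)
w = torch.randn(3, 1, 3, 3)
y = c4_lifting_conv(x, w)  # shape: (1, 3, 4, 6, 6)
```

Roughly speaking, the VC-dimension analysis asks how the capacity of networks stacked from layers of this kind scales with the depth, the number of weights, and the input resolution.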
Related papers
- A note on the VC dimension of 1-dimensional GNNs [6.0757501646966965]
Graph Neural Networks (GNNs) have become an essential tool for analyzing graph-structured data.
This paper focuses on the generalization of GNNs by investigating their Vapnik-Chervonenkis (VC) dimension.
arXiv Detail & Related papers (2024-10-10T11:33:15Z)
- Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels [56.69755544814834]
Recent advances in depthwise-separable convolutional neural networks (DS-CNNs) have led to novel architectures.
This paper reveals another striking property of DS-CNN architectures: discernible and explainable patterns emerge in their trained depthwise convolutional kernels in all layers.
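As a reminder of the layer type discussed here, below is a standard depthwise-separable block in PyTorch (a generic construction, not code from the cited paper); the trained per-channel depthwise kernels are where the reported patterns appear.

```python
# Standard depthwise-separable convolution: one k x k kernel per input channel
# (depthwise), then a 1x1 pointwise convolution that mixes channels.
# Generic sketch for illustration; not taken from the cited paper.
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```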
arXiv Detail & Related papers (2024-01-25T19:05:53Z)
- Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks [22.083873334272027]
We observe that sparser networks outperform the non-sparse networks at shallow depths on a variety of datasets.
We extend the existing theory on the generalization error of kernel-ridge regression.
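Since the analysis builds on kernel ridge regression, here is the textbook estimator in a few lines of numpy; the RBF kernel and ridge parameter are illustrative choices, not the paper's setup.

```python
# Textbook kernel ridge regression: alpha = (K + lam*I)^{-1} y, predict with k(X_test, X_train) @ alpha.
# Generic sketch; the RBF kernel and lam are illustrative, not the paper's choices.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(X_train, y_train, X_test, lam=1e-2):
    K = rbf_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train) @ alpha
```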
arXiv Detail & Related papers (2023-05-17T20:09:35Z)
- Transferability of coVariance Neural Networks and Application to Interpretable Brain Age Prediction using Anatomical Features [119.45320143101381]
Graph convolutional networks (GCN) leverage topology-driven graph convolutional operations to combine information across the graph for inference tasks.
We have studied GCNs with covariance matrices as graphs, in the form of coVariance neural networks (VNNs).
VNNs inherit the scale-free data processing architecture from GCNs and here, we show that VNNs exhibit transferability of performance over datasets whose covariance matrices converge to a limit object.
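A minimal numpy sketch of the construction described above: use the sample covariance matrix in place of a graph operator and apply a polynomial filter followed by a pointwise nonlinearity. The filter taps and the ReLU are illustrative assumptions, not the authors' code.

```python
# coVariance filter: use the sample covariance C as the graph shift operator
# and apply a polynomial filter H(C) x = sum_k h_k C^k x, then a pointwise ReLU.
# Illustrative sketch, not the authors' implementation.
import numpy as np

def vnn_layer(X, x, taps):
    """X: (n_samples, n_features) data used to estimate C; x: (n_features,) input signal;
    taps: filter coefficients [h_0, ..., h_K]."""
    C = np.cov(X, rowvar=False)
    out = np.zeros_like(x, dtype=float)
    Ckx = x.astype(float)
    for h in taps:
        out += h * Ckx
        Ckx = C @ Ckx
    return np.maximum(out, 0.0)
```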
arXiv Detail & Related papers (2023-05-02T22:15:54Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that these neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
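For reference, the finite-width (empirical) NTK of a one-hidden-layer ReLU network with bias can be computed as an inner product of parameter gradients; the sketch below is generic and does not implement the paper's bias-generalized limiting kernel.

```python
# Empirical (finite-width) NTK of f(x) = v^T relu(W x + b):
# K(x, x') = <grad_theta f(x), grad_theta f(x')>. Generic sketch only; the entry's
# "bias-generalized NTK" is a different limiting kernel arising in its large-bias setting.
import torch

def f(params, x):
    W, b, v = params
    return torch.relu(W @ x + b) @ v

def empirical_ntk(params, x1, x2):
    g1 = torch.autograd.grad(f(params, x1), params)
    g2 = torch.autograd.grad(f(params, x2), params)
    return sum((a * b).sum() for a, b in zip(g1, g2))

d, m = 5, 256
params = tuple(torch.randn(s, requires_grad=True) for s in [(m, d), (m,), (m,)])
k = empirical_ntk(params, torch.randn(d), torch.randn(d))
```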
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Predicting Brain Age using Transferable coVariance Neural Networks [119.45320143101381]
We have recently studied covariance neural networks (VNNs) that operate on sample covariance matrices.
In this paper, we demonstrate the utility of VNNs in inferring brain age using cortical thickness data.
Our results show that VNNs exhibit multi-scale and multi-site transferability for inferring brain age
In the context of brain age in Alzheimer's disease (AD), our experiments show that VNN outputs are interpretable: the brain age predicted by VNNs is significantly elevated for AD subjects relative to healthy controls.
arXiv Detail & Related papers (2022-10-28T18:58:34Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the NTK has so far been devoted to typical neural network architectures and remains incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
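To make NNs-Hp concrete, here is a hedged sketch of a polynomial network built from Hadamard products, assuming a Π-net-style recursion; the paper's exact class may be defined differently.

```python
# Toy polynomial network built from Hadamard (elementwise) products:
# each block multiplies a linear map of the input with the running features,
# raising the polynomial degree of the model in x by one.
# Assumed Pi-net-style recursion; the paper's NNs-Hp class may differ.
import torch
import torch.nn as nn

class HadamardPolyNet(nn.Module):
    def __init__(self, dim: int, degree: int = 3):
        super().__init__()
        self.maps = nn.ModuleList(nn.Linear(dim, dim) for _ in range(degree))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.maps[0](x)
        for lin in self.maps[1:]:
            z = lin(x) * z + z  # Hadamard product plus a skip connection
        return z
```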
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
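Gradient flow is the continuous-time limit of gradient descent; the sketch below Euler-discretizes it with a small step on a toy overparameterized problem (the architecture, data, and step size are illustrative, not the paper's setting).

```python
# Euler discretization of gradient flow d(theta)/dt = -grad L(theta):
# plain gradient descent with a very small step, on a toy problem where the
# input dimension is at least the number of samples. Illustrative only.
import torch
import torch.nn as nn

n, d, width = 8, 16, 512                       # n <= d, mirroring the entry's assumption
X, y = torch.randn(n, d), torch.randn(n, 1)
net = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))

step = 1e-3                                    # small step standing in for continuous time
for _ in range(5000):
    loss = ((net(X) - y) ** 2).mean()
    grads = torch.autograd.grad(loss, list(net.parameters()))
    with torch.no_grad():
        for p, g in zip(net.parameters(), grads):
            p -= step * g
```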
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- VC dimension of partially quantized neural networks in the overparametrized regime [8.854725253233333]
We focus on a class of partially quantized networks that we refer to as hyperplane arrangement neural networks (HANNs).
We show that HANNs can have VC dimension significantly smaller than the number of weights, while being highly expressive.
On a panel of 121 UCI datasets, overparametrized HANNs match the performance of state-of-the-art full-precision models.
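To make "partially quantized" concrete, here is one hedged guess at such a network: the first layer's activations are binarized (the sign pattern identifies the hyperplane-arrangement cell containing the input) while the head stays full precision. Whether this matches the paper's HANN construction is an assumption.

```python
# A partially quantized network, sketched as: binarize the first layer's activations
# (the sign pattern records which cell of a hyperplane arrangement contains x),
# then apply a full-precision head. This is an assumption about what "partially
# quantized" means here; the paper's HANN definition may differ.
import torch
import torch.nn as nn

class PartiallyQuantizedNet(nn.Module):
    def __init__(self, dim: int, n_hyperplanes: int = 64, n_classes: int = 2):
        super().__init__()
        self.hyperplanes = nn.Linear(dim, n_hyperplanes)        # quantized part (signs only)
        self.head = nn.Sequential(nn.Linear(n_hyperplanes, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))     # full-precision part

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        cell = torch.sign(self.hyperplanes(x))                  # +/-1 code of the cell
        return self.head(cell)
```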
arXiv Detail & Related papers (2021-10-06T02:02:35Z)
- Non-asymptotic Excess Risk Bounds for Classification with Deep Convolutional Neural Networks [6.051520664893158]
We consider the problem of binary classification with a class of general deep convolutional neural networks.
We define the prefactors of the risk bounds in terms of the input data dimension and other model parameters.
We show that the classification methods with CNNs can circumvent the curse of dimensionality.
arXiv Detail & Related papers (2021-05-01T15:55:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.