Deep Maxout Network Gaussian Process
- URL: http://arxiv.org/abs/2208.04468v1
- Date: Mon, 8 Aug 2022 23:52:26 GMT
- Title: Deep Maxout Network Gaussian Process
- Authors: Libin Liang, Ye Tian and Ge Cheng
- Abstract summary: We derive the equivalence of the deep, infinite-width maxout network and the Gaussian process (GP).
We build up the connection between our deep maxout network kernel and deep neural network kernels.
- Score: 1.9292807030801753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of neural networks with infinite width is important for a better
understanding of neural networks in practical applications. In this work, we
derive the equivalence of the deep, infinite-width maxout network and the
Gaussian process (GP) and characterize the maxout kernel with a compositional
structure. Moreover, we build up the connection between our deep maxout network
kernel and deep neural network kernels. We also give an efficient numerical
implementation of our kernel, which can be adapted to any maxout rank. Numerical
results show that Bayesian inference with the deep maxout network kernel can yield
results competitive with finite-width maxout networks and with deep neural network
kernels. This suggests that the maxout activation may also be incorporated into other
infinite-width neural network structures such as the convolutional neural network (CNN).
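The paper's own implementation is not reproduced here, but the NumPy sketch below illustrates how a compositional maxout NNGP kernel that adapts to any maxout rank, followed by GP regression (the "Bayesian inference" step), could be set up. The layer recursion, the sigma_w/sigma_b scaling convention, the Monte Carlo estimator, and all function names are assumptions made for illustration, not the authors' formulas or code.

```python
import numpy as np

def _maxout_cross_moment(kxx, kxy, kyy, rank, n_samples, rng):
    """Monte Carlo estimate of E[max_k u_k * max_k v_k], where the `rank` pairs
    (u_k, v_k) are i.i.d. bivariate Gaussians with Var(u_k) = kxx,
    Var(v_k) = kyy, Cov(u_k, v_k) = kxy, and different pieces independent."""
    cov = np.array([[kxx, kxy], [kxy, kyy]]) + 1e-10 * np.eye(2)  # jitter covers the diagonal case
    chol = np.linalg.cholesky(cov)
    z = rng.standard_normal((n_samples, rank, 2))
    uv = z @ chol.T                                   # correlate u_k and v_k within each piece
    return float(np.mean(uv[..., 0].max(axis=1) * uv[..., 1].max(axis=1)))

def maxout_nngp_gram(X, depth, rank, sigma_w=1.0, sigma_b=0.1, n_samples=10_000, seed=0):
    """Illustrative compositional maxout NNGP kernel on the rows of X.

    Assumed recursion for the pre-activation covariance Lambda:
        Lambda^1(x, x')     = sigma_b^2 + sigma_w^2 * <x, x'> / d
        Lambda^{l+1}(x, x') = sigma_b^2 + sigma_w^2 * E[max_k u_k * max_k v_k]
    with the expectation taken under Lambda^l and estimated by Monte Carlo,
    so that any maxout rank can be plugged in."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    K = sigma_b**2 + sigma_w**2 * (X @ X.T) / d
    for _ in range(depth):
        K_next = np.empty_like(K)
        for i in range(n):
            for j in range(i, n):
                m = _maxout_cross_moment(K[i, i], K[i, j], K[j, j], rank, n_samples, rng)
                K_next[i, j] = K_next[j, i] = sigma_b**2 + sigma_w**2 * m
        K = K_next
    return K

def gp_posterior_mean(K_full, y_train, n_train, noise_var=1e-2):
    """Standard GP regression posterior mean at the test rows,
    given the joint (train + test) Gram matrix."""
    K_tr = K_full[:n_train, :n_train]
    K_te_tr = K_full[n_train:, :n_train]
    alpha = np.linalg.solve(K_tr + noise_var * np.eye(n_train), y_train)
    return K_te_tr @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((30, 5))
    y = np.sin(X[:, 0])                               # toy targets
    K = maxout_nngp_gram(X, depth=3, rank=4)          # depth-3 kernel, maxout rank 4
    pred = gp_posterior_mean(K, y[:20], n_train=20)
    print("toy test MSE:", np.mean((pred - y[20:]) ** 2))
```

The per-pair Monte Carlo loop is simply the easiest way to keep the rank arbitrary in a sketch; a closed form or quadrature rule for the max-max expectation would be far cheaper, which is presumably where an efficient implementation would focus.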
Related papers
- Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks [22.083873334272027]
We observe that sparser networks outperform the non-sparse networks at shallow depths on a variety of datasets.
We extend the existing theory on the generalization error of kernel-ridge regression.
arXiv Detail & Related papers (2023-05-17T20:09:35Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the NTK has been devoted to typical neural network architectures, but is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks [33.08636537654596]
We take the binary weights in a neural network as random variables under rounding, and study the distribution propagation over different layers in the neural network.
We propose a quasi neural network to approximate the distribution propagation, which is a neural network with continuous parameters and smooth activation function.
arXiv Detail & Related papers (2022-06-13T06:11:21Z)
- Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel [1.6383321867266318]
Implicit Composite Kernel (ICK) is a kernel that combines a kernel implicitly defined by a neural network with a second kernel function chosen to model known properties (a schematic sketch of this combination appears after this list).
We demonstrate ICK's superior performance and flexibility on both synthetic and real-world data sets.
arXiv Detail & Related papers (2022-05-15T21:32:44Z)
- On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z)
- Neural Network Gaussian Processes by Increasing Depth [0.6091702876917281]
We show that increasing the depth of a neural network can give rise to a Gaussian process.
We also theoretically characterize its uniform tightness property and the smallest eigenvalue of its associated kernel.
These characterizations can not only enhance our understanding of the proposed depth-induced Gaussian processes, but also pave the way for future applications.
arXiv Detail & Related papers (2021-08-29T15:37:26Z)
- Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
- Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z)
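As a side note on the Implicit Composite Kernel (ICK) entry above, the idea of combining an NN-implicit kernel with a hand-chosen kernel can be pictured with the schematic sketch below. It is not the ICK construction itself: the feature map is a stand-in untrained network, the second kernel is an RBF, and the additive combination rule is an assumption of this sketch.

```python
import numpy as np

def nn_implicit_kernel(X1, X2, W1, b1, W2):
    """Kernel implicitly defined by a neural feature map: k(x, x') = <f(x), f(x')>.
    A tiny one-hidden-layer tanh network stands in for a learned one."""
    f = lambda X: np.tanh(X @ W1 + b1) @ W2
    return f(X1) @ f(X2).T

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Second kernel, chosen by hand to encode a known property (here: smoothness)."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def composite_kernel(X1, X2, nn_params, weight=0.5):
    """Additive combination of the two kernels (the combination rule is an
    assumption of this sketch, not necessarily ICK's)."""
    return weight * nn_implicit_kernel(X1, X2, *nn_params) + (1.0 - weight) * rbf_kernel(X1, X2)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
nn_params = (rng.standard_normal((3, 16)), rng.standard_normal(16), rng.standard_normal((16, 4)))
K = composite_kernel(X, X, nn_params)
print(K.shape)  # (8, 8) Gram matrix, usable in GP regression as in the earlier sketch
```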
This list is automatically generated from the titles and abstracts of the papers on this site.