On Approximation in Deep Convolutional Networks: a Kernel Perspective
- URL: http://arxiv.org/abs/2102.10032v1
- Date: Fri, 19 Feb 2021 17:03:42 GMT
- Title: On Approximation in Deep Convolutional Networks: a Kernel Perspective
- Authors: Alberto Bietti
- Abstract summary: We study the success of deep convolutional networks on tasks involving high-dimensional data such as images or audio.
We study this theoretically and empirically through the lens of kernel methods, by considering multi-layer convolutional kernels.
We find that while expressive kernels operating on input patches are important at the first layer, simpler kernels can suffice in higher layers for good performance.
- Score: 12.284934135116515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep convolutional networks on tasks involving
high-dimensional data such as images or audio suggests that they are able to
efficiently approximate certain classes of functions that are not cursed by
dimensionality. In this paper, we study this theoretically and empirically
through the lens of kernel methods, by considering multi-layer convolutional
kernels, which have achieved good empirical performance on standard vision
datasets, and provide theoretical descriptions of over-parameterized
convolutional networks in certain regimes. We find that while expressive
kernels operating on input patches are important at the first layer, simpler
polynomial kernels can suffice in higher layers for good performance. For such
simplified models, we provide a precise functional description of the RKHS and
its regularization properties, highlighting the role of depth for capturing
interactions between different parts of the input signal, and the role of
pooling for encouraging smooth dependence on the global or relative positions
of such parts.
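The kind of model the abstract describes can be illustrated with a minimal sketch of a two-layer convolutional kernel on 1D signals. This is not the paper's implementation. It assumes circular signals, a Gaussian kernel on input patches as the "expressive" first layer, a Gaussian pooling filter, a homogeneous degree-2 polynomial kernel at the second layer, and global average pooling before a linear prediction layer. All function names and parameters (`patches`, `pooling_matrix`, `conv_kernel`, `patch1`, `patch2`, `sigma`, `pool_width`) are hypothetical.

```python
import numpy as np

def patches(x, size):
    """Circular patches of a 1D signal: row u holds x[u], ..., x[u+size-1]."""
    n = len(x)
    idx = (np.arange(n)[:, None] + np.arange(size)[None, :]) % n
    return x[idx]

def pooling_matrix(n, width):
    """Circulant matrix that applies a normalized Gaussian filter (assumed pooling)."""
    d = np.arange(n)
    d = np.minimum(d, n - d)                 # circular distance to position 0
    h = np.exp(-0.5 * (d / width) ** 2)
    h /= h.sum()
    idx = (np.arange(n)[None, :] - np.arange(n)[:, None]) % n
    return h[idx]                            # entry [u, w] = h[(w - u) mod n]

def conv_kernel(x, y, patch1=3, patch2=2, sigma=1.0, pool_width=1.0):
    """Two-layer convolutional kernel between equal-length 1D signals x and y."""
    n = len(x)
    px, py = patches(x, patch1), patches(y, patch1)
    # Layer 1: Gaussian kernel between every pair of input patches (positions u, v).
    sq_dist = ((px[:, None, :] - py[None, :, :]) ** 2).sum(-1)
    k = np.exp(-sq_dist / (2 * sigma ** 2))
    # Pooling acts linearly on feature maps, so it acts on both sides of k.
    H = pooling_matrix(n, pool_width)
    k = H @ k @ H.T
    # Layer 2 patches: the inner product of two patches of feature maps is the
    # sum of the cross-position kernel over aligned offsets c.
    k = sum(np.roll(k, (-c, -c), axis=(0, 1)) for c in range(patch2))
    # Layer 2 kernel: homogeneous polynomial of degree 2 (a simplifying assumption).
    k = k ** 2
    # Global average pooling followed by a linear layer: average over all (u, v).
    return k.mean()
```

Because the second-layer patch spans `patch2` neighboring positions, squaring its inner product produces products of first-layer features at different positions, which is one way to see how depth captures interactions between parts of the signal; the pooling width controls how smoothly the kernel depends on where those parts sit.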
Related papers
- Feature Mapping in Physics-Informed Neural Networks (PINNs) [1.9819034119774483]
We study the training dynamics of PINNs with a feature mapping layer via the limiting Conjugate Kernel and Neural Tangent Kernel.
We propose a conditionally positive definite Radial Basis Function as a better alternative.
arXiv Detail & Related papers (2024-02-10T13:51:09Z)
- Nonlinear functional regression by functional deep neural network with kernel embedding [20.306390874610635]
We propose a functional deep neural network with an efficient and fully data-dependent dimension reduction method.
The architecture of our functional net consists of a kernel embedding step, a projection step, and a deep ReLU neural network for the prediction.
The utilization of smooth kernel embedding enables our functional net to be discretization invariant, efficient, and robust to noisy observations.
arXiv Detail & Related papers (2024-01-05T16:43:39Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Mechanism of feature learning in convolutional neural networks [14.612673151889615]
We identify the mechanism of how convolutional neural networks learn from image data.
We present empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs.
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
arXiv Detail & Related papers (2023-09-01T16:30:02Z)
- Kernel function impact on convolutional neural networks [10.98068123467568]
We study the usage of kernel functions at the different layers in a convolutional neural network.
We show how one can effectively leverage kernel functions by introducing more distortion-aware pooling layers.
We propose Kernelized Dense Layers (KDL), which replace fully-connected layers.
arXiv Detail & Related papers (2023-02-20T19:57:01Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Multiple Kernel Representation Learning on Networks [12.106994960669924]
We propose a weighted matrix factorization model that encodes random walk-based information about nodes of the network.
We extend the approach with a multiple kernel learning formulation that provides the flexibility of learning the kernel as the linear combination of a dictionary of kernels.
arXiv Detail & Related papers (2021-06-09T13:22:26Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.