Learning with convolution and pooling operations in kernel methods
- URL: http://arxiv.org/abs/2111.08308v1
- Date: Tue, 16 Nov 2021 09:00:44 GMT
- Title: Learning with convolution and pooling operations in kernel methods
- Authors: Theodor Misiakiewicz, Song Mei
- Abstract summary: Recent empirical work has shown that hierarchical convolutional kernels improve the performance of kernel methods in image classification tasks.
We study the precise interplay between approximation and generalization in convolutional architectures.
Our results quantify how choosing an architecture adapted to the target function leads to a large improvement in the sample complexity.
- Score: 8.528384027684192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent empirical work has shown that hierarchical convolutional kernels
inspired by convolutional neural networks (CNNs) significantly improve the
performance of kernel methods in image classification tasks. A widely accepted
explanation for the success of these architectures is that they encode
hypothesis classes that are suitable for natural images. However, understanding
the precise interplay between approximation and generalization in convolutional
architectures remains a challenge. In this paper, we consider the stylized
setting of covariates (image pixels) uniformly distributed on the hypercube,
and fully characterize the RKHS of kernels composed of single layers of
convolution, pooling, and downsampling operations. We then study the gain in
sample efficiency of kernel methods using these kernels over standard
inner-product kernels. In particular, we show that 1) the convolution layer
breaks the curse of dimensionality by restricting the RKHS to 'local'
functions; 2) local pooling biases learning towards low-frequency functions,
which are stable under small translations; 3) downsampling may modify the
high-frequency eigenspaces but leaves the low-frequency part approximately
unchanged. Notably, our results quantify how choosing an architecture adapted
to the target function leads to a large improvement in the sample complexity.
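To make the kernel construction concrete, here is a minimal NumPy sketch of a one-dimensional, one-layer convolutional kernel with cyclic patches, local average pooling, and strided downsampling, evaluated on hypercube covariates. The patch size q, pooling window w, stride, and base kernel kappa are illustrative assumptions, not the paper's exact parameterization or normalization.

```python
import numpy as np

rng = np.random.default_rng(0)

def patches(x, q):
    """All d cyclic patches of length q of a 1-D signal x of length d."""
    d = x.shape[0]
    idx = (np.arange(d)[:, None] + np.arange(q)[None, :]) % d
    return x[idx]                        # shape (d, q)

def conv_kernel(x, y, q, kappa=np.exp):
    """'Convolution layer': average a base inner-product kernel over aligned patches."""
    Px, Py = patches(x, q), patches(y, q)
    t = np.sum(Px * Py, axis=1) / q      # normalized patch inner products, shape (d,)
    return float(kappa(t).mean())

def conv_pool_kernel(x, y, q, w, stride=1, kappa=np.exp):
    """Convolution followed by local average pooling (window w) and downsampling (stride)."""
    d = x.shape[0]
    Px, Py = patches(x, q), patches(y, q)
    vals = []
    for k in range(0, d, stride):        # downsampling keeps every stride-th location
        win = (k + np.arange(w)) % d     # pooling window of size w
        t = Px[win] @ Py[win].T / q      # cross inner products of patches inside the window
        vals.append(kappa(t).mean())     # average pooling over the window
    return float(np.mean(vals))

# Toy usage with covariates drawn uniformly from the hypercube {-1, +1}^d.
d, q, w = 16, 4, 2
x = rng.choice([-1.0, 1.0], size=d)
y = rng.choice([-1.0, 1.0], size=d)
print(conv_kernel(x, y, q), conv_pool_kernel(x, y, q, w, stride=2))
```

Averaging the base kernel over all patch locations mirrors the weight sharing of a convolution layer and restricts the RKHS to local functions, while averaging over pairs of positions inside a window plays the role of pooling in kernel space.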
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- Neural Operators with Localized Integral and Differential Kernels [77.76991758980003]
We present a principled approach to operator learning that can capture local features under two frameworks.
We prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs.
To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions.
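As a hedged, toy illustration of the claim that appropriately scaled convolution kernel values act as differential operators (a standard finite-difference example, not the construction proved in that paper): a fixed 3-tap stencil scaled by 1/h^2 converges to the second derivative as the grid is refined.

```python
import numpy as np

# Illustration: a fixed 1-D convolution stencil, with its values scaled by 1/h**2,
# behaves as a discrete second-derivative (Laplacian) operator on a fine grid.
h = 1e-3
x = np.arange(0.0, 1.0, h)
f = np.sin(2 * np.pi * x)

stencil = np.array([1.0, -2.0, 1.0]) / h**2             # scaled kernel values
d2f = np.convolve(f, stencil, mode="same")               # the "CNN layer" is just a convolution

exact = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * x)
interior = slice(1, -1)                                   # boundary outputs are not valid
print(np.max(np.abs(d2f[interior] - exact[interior])))    # small discretization error, O(h^2)
```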
arXiv Detail & Related papers (2024-02-26T18:59:31Z)
- Padding-free Convolution based on Preservation of Differential Characteristics of Kernels [1.3597551064547502]
We present a non-padding-based method for size-keeping convolution based on the preservation of differential characteristics of kernels.
The main idea is to make convolution over an incomplete sliding window "collapse" to a linear differential operator evaluated locally at its central pixel.
arXiv Detail & Related papers (2023-09-12T16:36:12Z)
- Mechanism of feature learning in convolutional neural networks [14.612673151889615]
We identify the mechanism of how convolutional neural networks learn from image data.
We present empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs.
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
arXiv Detail & Related papers (2023-09-01T16:30:02Z)
- Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose rank-enhanced low-dimensional convolution set (Re-ConvSet)
We then incorporate Re-ConvSet into the widely-used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z)
- Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
A first class of algorithms is kernel flow, which was introduced in a context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
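One plausible reading of that minimal-norm criterion (an illustrative sketch with made-up data and parameters, not the paper's spectral kernel ridge regression algorithm): fit kernel ridge regression with several candidate kernels and keep the kernel whose fitted function has the smallest RKHS norm, computed as alpha^T K alpha, among those that fit the training data acceptably.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(X, Y, length_scale):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * length_scale**2))

# Toy data: noisy samples of a smooth 1-D function.
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(40)

lam, best = 1e-3, None
for ls in [0.05, 0.1, 0.3, 1.0, 3.0]:                       # candidate kernels
    K = rbf(X, X, ls)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)    # kernel ridge regression fit
    resid = float(np.mean((K @ alpha - y) ** 2))             # training error of the fit
    rkhs_norm2 = float(alpha @ K @ alpha)                    # ||f_hat||_H^2 = alpha^T K alpha
    if resid < 1e-2 and (best is None or rkhs_norm2 < best[1]):
        best = (ls, rkhs_norm2)
print("selected length scale:", best[0], "squared RKHS norm:", best[1])
```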
arXiv Detail & Related papers (2022-06-03T07:50:54Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Learning strides in convolutional neural networks [34.20666933112202]
This work introduces DiffStride, the first downsampling layer with learnable strides.
Experiments on audio and image classification show the generality and effectiveness of our solution.
arXiv Detail & Related papers (2022-02-03T16:03:36Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of fully-connected ReLU network.
We show that dimension of the resulting features is much smaller than other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Flow-based Kernel Prior with Application to Blind Super-Resolution [143.21527713002354]
Kernel estimation is generally one of the key problems for blind image super-resolution (SR)
This paper proposes a normalizing flow-based kernel prior (FKP) for kernel modeling.
Experiments on synthetic and real-world images demonstrate that the proposed FKP can significantly improve the kernel estimation accuracy.
arXiv Detail & Related papers (2021-03-29T22:37:06Z)
- On Approximation in Deep Convolutional Networks: a Kernel Perspective [12.284934135116515]
We study the success of deep convolutional networks on tasks involving high-dimensional data such as images or audio.
We study this theoretically and empirically through the lens of kernel methods, by considering multi-layer convolutional kernels.
We find that while expressive kernels operating on input patches are important at the first layer, simpler kernels can suffice in higher layers for good performance.
arXiv Detail & Related papers (2021-02-19T17:03:42Z)