Kernel function impact on convolutional neural networks
- URL: http://arxiv.org/abs/2302.10266v1
- Date: Mon, 20 Feb 2023 19:57:01 GMT
- Title: Kernel function impact on convolutional neural networks
- Authors: M.Amine Mahmoudi, Aladine Chetouani, Fatma Boufera, Hedi Tabia
- Abstract summary: We study the usage of kernel functions at the different layers in a convolutional neural network.
We show how one can effectively leverage kernel functions by introducing more distortion-aware pooling layers.
We propose Kernelized Dense Layers (KDL), which replace fully-connected layers.
- Score: 10.98068123467568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the usage of kernel functions at the different layers
in a convolutional neural network. We carry out extensive studies of their
impact on convolutional, pooling and fully-connected layers. We notice that the linear kernel may not be sufficiently effective to fit the input data distributions, whereas high-order kernels are prone to over-fitting. This leads us to conclude that a trade-off between complexity and performance should be reached. We show how one can effectively leverage kernel functions by introducing more distortion-aware pooling layers, which reduce over-fitting while preserving most of the information fed into subsequent layers. We further propose Kernelized Dense Layers (KDL), which replace fully-connected layers and capture higher-order feature interactions. Experiments on conventional classification datasets, i.e. MNIST, FASHION-MNIST and CIFAR-10, show that the proposed techniques improve the performance of the network compared to classical convolutional, pooling and fully-connected layers. Moreover, experiments on fine-grained classification, i.e. the facial expression databases RAF-DB, FER2013 and ExpW, demonstrate that the discriminative power of the network is boosted, since the proposed techniques improve sensitivity to slight visual details and allow the network to reach state-of-the-art results.
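The abstract does not spell out the exact form of these layers, but the core idea of a Kernelized Dense Layer can be sketched briefly: replace the dot product of a fully-connected unit with a kernel evaluation between the input and that unit's weight vector. The snippet below is a minimal PyTorch sketch assuming a polynomial kernel (x . w + c)^d; the kernel choice, the constant c and the initialization are illustrative assumptions rather than the authors' exact formulation. The degree d exposes the complexity/performance trade-off mentioned above: d = 1 reduces to a (biased) linear layer, while large d invites over-fitting.

```python
import torch
import torch.nn as nn

class KernelizedDenseLayer(nn.Module):
    """Dense layer whose units apply a polynomial kernel k(x, w_i) = (x . w_i + c)^d
    instead of a plain dot product, capturing higher-order feature interactions."""

    def __init__(self, in_features: int, out_features: int,
                 degree: int = 2, c: float = 1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.degree = degree  # d = 1 recovers a (biased) linear layer;
        self.c = c            # large d risks over-fitting, per the abstract.

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> (batch, out_features)
        return (x @ self.weight.t() + self.c) ** self.degree

# Example: a classification head where KDLs replace fully-connected layers.
if __name__ == "__main__":
    head = nn.Sequential(nn.Flatten(),
                         KernelizedDenseLayer(64 * 7 * 7, 128, degree=2),
                         nn.ReLU(),
                         KernelizedDenseLayer(128, 10, degree=2))
    logits = head(torch.randn(4, 64, 7, 7))
    print(logits.shape)  # torch.Size([4, 10])
```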
Related papers
- Mechanism of feature learning in convolutional neural networks [14.612673151889615]
We identify the mechanism by which convolutional neural networks learn from image data.
We present empirical evidence for our ansatz, including a high correlation between the covariances of filters and patch-based AGOPs.
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
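The entry does not define the patch-based AGOP (average gradient outer product); assuming the usual reading, i.e. the average over images and patch positions of the outer product of the gradient of the network output with respect to each input patch, a rough sketch of comparing it against the first convolutional layer's filter covariance might look as follows. The function names, the summed-logit gradient and the entrywise correlation check are our own simplifications, not the paper's estimator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def patch_based_agop(model: nn.Module, images: torch.Tensor,
                     kernel_size: int = 3, stride: int = 1) -> torch.Tensor:
    """Average outer product of per-patch input gradients, taken over
    images and patch positions (one simple reading of a patch-based AGOP)."""
    images = images.clone().requires_grad_(True)
    out = model(images).sum()                        # simplification: summed logits
    grads = torch.autograd.grad(out, images)[0]      # (B, C, H, W)
    patches = F.unfold(grads, kernel_size, stride=stride)       # (B, C*k*k, P)
    g = patches.permute(0, 2, 1).reshape(-1, patches.shape[1])  # (B*P, C*k*k)
    return (g.t() @ g) / g.shape[0]                  # (C*k*k, C*k*k)

def filter_covariance(conv: nn.Conv2d) -> torch.Tensor:
    """Uncentered covariance W^T W of the layer's filters."""
    w = conv.weight.reshape(conv.out_channels, -1)   # (out_channels, C*k*k)
    return w.t() @ w

if __name__ == "__main__":
    net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
    x = torch.randn(8, 3, 32, 32)
    agop = patch_based_agop(net, x)                  # (27, 27) for 3x3 RGB patches
    cov = filter_covariance(net[0])                  # (27, 27)
    corr = torch.corrcoef(torch.stack([agop.flatten(), cov.flatten()]))[0, 1]
    print(float(corr))                               # entrywise correlation
```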
arXiv Detail & Related papers (2023-09-01T16:30:02Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Compare Where It Matters: Using Layer-Wise Regularization To Improve Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that outperforms previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z)
- Understanding the Distributions of Aggregation Layers in Deep Neural Networks [8.784438985280092]
Aggregation functions serve as an important mechanism for consolidating deep features into a more compact representation.
In particular, the proximity of global aggregation layers to the output layers of DNNs means that aggregated features have a direct influence on the performance of a deep net.
We propose a novel mathematical formulation for analytically modelling the probability distributions of output values of layers involved with deep feature aggregation.
arXiv Detail & Related papers (2021-07-09T14:23:57Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- On Approximation in Deep Convolutional Networks: a Kernel Perspective [12.284934135116515]
We study the success of deep convolutional networks on tasks involving high-dimensional data such as images or audio.
We study this theoretically and empirically through the lens of kernel methods, by considering multi-layer convolutional kernels.
We find that while expressive kernels operating on input patches are important at the first layer, simpler kernels can suffice in higher layers for good performance.
arXiv Detail & Related papers (2021-02-19T17:03:42Z)
- Kernelized dense layers for facial expression recognition [10.98068123467568]
We propose a Kernelized Dense Layer (KDL) which captures higher order feature interactions instead of conventional linear relations.
We show that our model achieves competitive results with respect to the state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-22T21:02:00Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched-prior-based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z)
- Convolutional Networks with Dense Connectivity [59.30634544498946]
We introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion.
For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers.
We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks.
arXiv Detail & Related papers (2020-01-08T06:54:53Z)
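Since the DenseNet entry above describes its connectivity pattern only in words, here is a minimal PyTorch sketch of a single dense block using the usual BN-ReLU-Conv composite function; bottleneck (1x1) convolutions, transition layers and the chosen growth rate are illustrative simplifications.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenated feature-maps of all preceding
    layers and passes its own feature-maps on to every subsequent layer."""

    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3,
                          padding=1, bias=False),
            ))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            # Concatenate everything produced so far along the channel axis.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

# Example: a 4-layer block with growth rate 12 on a 16-channel input.
if __name__ == "__main__":
    block = DenseBlock(16, growth_rate=12, num_layers=4)
    y = block(torch.randn(2, 16, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32]); 16 + 4 * 12 channels
```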
This list is automatically generated from the titles and abstracts of the papers on this site.