Improving Sample Efficiency with Normalized RBF Kernels
- URL: http://arxiv.org/abs/2007.15397v2
- Date: Fri, 31 Jul 2020 09:25:55 GMT
- Title: Improving Sample Efficiency with Normalized RBF Kernels
- Authors: Sebastian Pineda-Arango, David Obando-Paniagua, Alperen Dedeoglu,
Philip Kurzendörfer, Friedemann Schestag and Randolf Scholz
- Abstract summary: This paper explores how neural networks with normalized Radial Basis Function (RBF) kernels can be trained to achieve better sample efficiency.
We show how this kind of output layer can find embedding spaces where the classes are compact and well-separated.
Experiments on CIFAR-10 and CIFAR-100 show that networks with normalized kernels as the output layer achieve higher sample efficiency, greater compactness, and better class separability.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In deep learning models, learning more with less data is becoming more
important. This paper explores how neural networks with normalized Radial Basis
Function (RBF) kernels can be trained to achieve better sample efficiency.
Moreover, we show how this kind of output layer can find embedding spaces where
the classes are compact and well-separated. In order to achieve this, we
propose a two-phase method to train this type of neural network on
classification tasks. Experiments on CIFAR-10 and CIFAR-100 show that networks
with a normalized-kernel output layer, trained with the presented method, achieve
higher sample efficiency, greater compactness, and better class separability than
networks with a SoftMax output layer.
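
As a rough illustration of the idea, the sketch below implements a normalized RBF output layer in PyTorch: class scores are RBF kernel responses to learnable per-class centers, normalized to sum to one. The class name, center initialization, and the single fixed bandwidth are assumptions for illustration, not the authors' reference implementation, and the paper's two-phase training procedure is not shown.

```python
import torch
import torch.nn as nn


class NormalizedRBFOutput(nn.Module):
    """Illustrative normalized RBF output layer: maps embeddings to class
    probabilities via RBF kernel responses to learnable per-class centers."""

    def __init__(self, embed_dim: int, num_classes: int, gamma: float = 1.0):
        super().__init__()
        # One learnable center per class, living in the embedding space.
        self.centers = nn.Parameter(torch.randn(num_classes, embed_dim))
        self.gamma = gamma  # kernel bandwidth (assumed fixed here)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Squared Euclidean distances between embeddings and class centers.
        d2 = torch.cdist(z, self.centers).pow(2)         # (batch, num_classes)
        k = torch.exp(-self.gamma * d2)                  # RBF kernel responses
        # Normalize responses so each row sums to one (the "normalized" part).
        return k / k.sum(dim=1, keepdim=True).clamp_min(1e-12)


# Usage with a toy backbone producing 64-dimensional embeddings.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
head = NormalizedRBFOutput(embed_dim=64, num_classes=10)
probs = head(backbone(torch.randn(8, 3, 32, 32)))        # shape (8, 10)
```
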
Related papers
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node [49.08777822540483]
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
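
For intuition, a generic load-balancing auxiliary loss of the kind used in conditional-computation models is sketched below; it simply pushes the average routing mass per leaf toward uniform. This is an illustrative stand-in, not the paper's exact formulation, and the Master Leaf mechanism is not shown.

```python
import torch

def load_balance_penalty(leaf_probs: torch.Tensor) -> torch.Tensor:
    """Generic auxiliary loss: encourage uniform average routing mass per leaf.

    leaf_probs: (batch, num_leaves) soft routing probabilities per example.
    """
    usage = leaf_probs.mean(dim=0)                          # mean mass per leaf
    uniform = torch.full_like(usage, 1.0 / usage.numel())   # ideal uniform usage
    return ((usage - uniform) ** 2).sum()

# Example: 32 inputs softly routed over 8 leaves.
penalty = load_balance_penalty(torch.softmax(torch.randn(32, 8), dim=1))
```
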
arXiv Detail & Related papers (2024-05-27T05:06:24Z) - Learning Sparse Neural Networks with Identity Layers [33.11654855515443]
We investigate the intrinsic link between network sparsity and interlayer feature similarity.
We propose a plug-and-play CKA-based Sparsity Regularization for sparse network training, dubbed CKA-SR.
We find that CKA-SR consistently improves the performance of several State-Of-The-Art sparse training methods.
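
As background for the CKA-based regularizer, the sketch below computes linear CKA between two layers' feature matrices using the standard formulation; how CKA-SR selects and weights the compared feature pairs is not shown and would be an assumption.

```python
import torch

def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear CKA between feature matrices x (n, d1) and y (n, d2)."""
    x = x - x.mean(dim=0, keepdim=True)       # center features
    y = y - y.mean(dim=0, keepdim=True)
    hsic = (x.T @ y).pow(2).sum()             # ||X^T Y||_F^2
    norm_x = (x.T @ x).pow(2).sum().sqrt()    # ||X^T X||_F
    norm_y = (y.T @ y).pow(2).sum().sqrt()    # ||Y^T Y||_F
    return hsic / (norm_x * norm_y + 1e-12)

# Similarity between two layers' activations on the same batch of 64 inputs.
cka = linear_cka(torch.randn(64, 128), torch.randn(64, 256))  # scalar in [0, 1]
```
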
arXiv Detail & Related papers (2023-07-14T14:58:44Z) - Kernel function impact on convolutional neural networks [10.98068123467568]
We study the use of kernel functions at different layers of a convolutional neural network.
We show how kernel functions can be leveraged effectively by introducing more distortion-aware pooling layers.
We propose Kernelized Dense Layers (KDL), which replace fully-connected layers.
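
A minimal sketch of the general idea of a kernelized dense layer appears below, using a polynomial kernel between the input and learnable weight vectors in place of a plain dot product; the kernel choice, degree, and layer interface are assumptions, not the KDL paper's exact design.

```python
import torch
import torch.nn as nn


class PolyKernelDense(nn.Module):
    """Dense layer whose units respond via a polynomial kernel (x.w + c)^d
    instead of a plain linear map; a generic stand-in for a kernelized layer."""

    def __init__(self, in_features: int, out_features: int,
                 degree: int = 2, c: float = 1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.degree, self.c = degree, c

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x @ self.weight.T + self.c) ** self.degree

# Drop-in replacement for a fully-connected classification head.
logits = PolyKernelDense(in_features=512, out_features=10)(torch.randn(4, 512))
```
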
arXiv Detail & Related papers (2023-02-20T19:57:01Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of layers commonly used in efficient architectures (depthwise, groupwise, and pointwise convolution).
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
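
For context, the commonly used structured special cases that SSC is said to generalize look like the depthwise-separable block below; this shows the standard layers, not SSC itself.

```python
import torch
import torch.nn as nn

# Standard structured convolutions that SSC generalizes (not SSC itself):
# a depthwise 3x3 convolution (groups == channels) followed by a pointwise 1x1.
depthwise_separable = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32),  # depthwise
    nn.Conv2d(32, 64, kernel_size=1),                        # pointwise
)
y = depthwise_separable(torch.randn(1, 32, 56, 56))          # (1, 64, 56, 56)
```
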
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Boosting the Efficiency of Parametric Detection with Hierarchical Neural
Networks [4.1410005218338695]
We propose Hierarchical Detection Network (HDN), a novel approach to efficient detection.
The network is trained using a novel loss function, which encodes simultaneously the goals of statistical accuracy and efficiency.
We show how training a three-layer HDN using a two-layer model can further boost both accuracy and efficiency.
arXiv Detail & Related papers (2022-07-23T19:23:00Z) - TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent
Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by nonconvexity.
We propose a Train-Convexify-Train (TCT) procedure to sidestep this issue.
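
To make the "convexify" step concrete, the sketch below builds empirical-NTK-style features (per-example gradients of a trained scalar-output network with respect to its parameters), on which a linear, convex model can then be fit; the looped gradient computation and scalar-output simplification are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

# A small trained network standing in for the model after the initial
# federated "train" phase (scalar output for simplicity).
net = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))

def empirical_ntk_features(x_batch: torch.Tensor) -> torch.Tensor:
    """Per-example gradients of the network output w.r.t. all parameters;
    a linear model on these features approximates the NTK linearization."""
    feats = []
    for x in x_batch:
        out = net(x.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(out, list(net.parameters()))
        feats.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(feats)                # (batch, num_parameters)

phi = empirical_ntk_features(torch.randn(16, 20))
# A convex problem (e.g. least squares or logistic regression) is then solved on phi.
```
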
arXiv Detail & Related papers (2022-07-13T16:58:22Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Compare Where It Matters: Using Layer-Wise Regularization To Improve
Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that outperforms previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features needed to achieve comparable error bounds is much smaller than that of other baseline feature map constructions, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Adaptive Convolution Kernel for Artificial Neural Networks [0.0]
This paper describes a method for training the size of convolutional kernels to provide varying size kernels in a single layer.
Experiments compared the proposed adaptive layers to ordinary convolution layers in a simple two-layer network.
A segmentation experiment on the Oxford-Pets dataset demonstrated that replacing a single ordinary convolution layer in a U-shaped network with a single 7×7 adaptive layer can improve its learning performance and ability to generalize.
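
As a simple stand-in for making the receptive-field size learnable, the sketch below learns a soft mixture over convolutions of a few fixed kernel sizes; this is not the paper's adaptive kernel (which trains the kernel size itself), just an illustration of the goal.

```python
import torch
import torch.nn as nn


class MixedSizeConv(nn.Module):
    """Learns a softmax-weighted mixture over convolutions of fixed kernel sizes;
    a simple stand-in for learning the effective kernel size."""

    def __init__(self, in_ch: int, out_ch: int, sizes=(3, 5, 7)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in sizes
        )
        self.logits = nn.Parameter(torch.zeros(len(sizes)))  # mixture weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.logits, dim=0)
        return sum(wi * conv(x) for wi, conv in zip(w, self.convs))

# Matched padding keeps the spatial size unchanged.
y = MixedSizeConv(3, 16)(torch.randn(2, 3, 32, 32))          # (2, 16, 32, 32)
```
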
arXiv Detail & Related papers (2020-09-14T12:36:50Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with nonconvexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)