Fast Estimation of Information Theoretic Learning Descriptors using
Explicit Inner Product Spaces
- URL: http://arxiv.org/abs/2001.00265v1
- Date: Wed, 1 Jan 2020 20:21:12 GMT
- Title: Fast Estimation of Information Theoretic Learning Descriptors using
Explicit Inner Product Spaces
- Authors: Kan Li and Jose C. Principe
- Abstract summary: Kernel methods form a theoretically-grounded, powerful and versatile framework to solve nonlinear problems in signal processing and machine learning.
Recently, we proposed \emph{no-trick} (NT) kernel adaptive filtering (KAF).
We focus on a family of fast, scalable, and accurate estimators for ITL using explicit inner product space kernels.
- Score: 4.5497405861975935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernel methods form a theoretically-grounded, powerful and versatile
framework to solve nonlinear problems in signal processing and machine
learning. The standard approach relies on the \emph{kernel trick} to perform
pairwise evaluations of a kernel function, leading to scalability issues for
large datasets due to its linear and superlinear growth with respect to the
training data. Recently, we proposed \emph{no-trick} (NT) kernel adaptive
filtering (KAF) that leverages explicit feature space mappings using
a data-independent basis with constant complexity. The inner product defined by
the feature mapping corresponds to a positive-definite finite-rank kernel that
induces a finite-dimensional reproducing kernel Hilbert space (RKHS).
Information theoretic learning (ITL) is a framework where information theory
descriptors based on non-parametric estimators of Renyi entropy replace
conventional second-order statistics for the design of adaptive systems. An
RKHS for ITL defined on a space of probability density functions simplifies
statistical inference for supervised or unsupervised learning. ITL criteria
take into account the higher-order statistical behavior of the systems and
signals as desired. However, this comes at a cost of increased computational
complexity. In this paper, we extend the NT kernel concept to ITL for improved
information extraction from the signal without compromising scalability.
Specifically, we focus on a family of fast, scalable, and accurate estimators
for ITL using explicit inner product space (EIPS) kernels. We demonstrate the
superior performance of EIPS-ITL estimators and combined NT-KAF using EIPS-ITL
cost functions through experiments.
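To make the EIPS idea concrete, the sketch below uses random Fourier features, one possible data-independent basis (the paper also considers other explicit constructions), to approximate a Gaussian kernel and then estimates the quadratic information potential, the building block of Renyi's quadratic entropy, in O(N) rather than O(N^2) time. Function names and the choice of basis are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def rff_map(X, W, b):
    """Explicit random Fourier feature map: phi(x) = sqrt(2/D) * cos(W^T x + b).
    With W[:, k] ~ N(0, I / sigma^2) and b[k] ~ Uniform(0, 2*pi),
    phi(x) @ phi(y) approximates the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def information_potential(X, W, b):
    """O(N) EIPS-style estimate of the quadratic information potential
    V = (1/N^2) * sum_{i,j} k(x_i - x_j): the double sum collapses into
    the squared norm of the empirical feature mean."""
    m = rff_map(X, W, b).mean(axis=0)
    return float(m @ m)

rng = np.random.default_rng(0)
N, d, D, sigma = 500, 2, 512, 1.0
X = rng.normal(size=(N, d))
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

fast = information_potential(X, W, b)                               # O(N * D)
diff = X[:, None, :] - X[None, :, :]
exact = np.exp(-(diff ** 2).sum(-1) / (2 * sigma ** 2)).mean()      # O(N^2 * d)
print(fast, exact, -np.log(fast))  # -log V estimates Renyi's quadratic entropy
```

Because the pairwise kernel sum collapses into the squared norm of the empirical feature mean, the same device applies to related descriptors such as the cross-information potential, which becomes an inner product of two feature means.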
Related papers
- Analytic Convolutional Layer: A Step to Analytic Neural Network [15.596391258983463]
Analytic Convolutional Layer (ACL) is a mosaic of analytical convolution kernels (ACKs) and traditional convolution kernels.
ACLs offer a means for neural network interpretation, thereby paving the way for the intrinsic interpretability of neural networks.
arXiv Detail & Related papers (2024-07-03T07:10:54Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Enhancing Kernel Flexibility via Learning Asymmetric Locally-Adaptive
Kernels [35.76925787221499]
This paper introduces the concept of Locally-Adaptive-Bandwidths (LAB) as trainable parameters to enhance the Radial Basis Function (RBF) kernel.
The parameters in LAB RBF kernels are data-dependent, and their number can grow with the size of the dataset.
This paper establishes, for the first time, an asymmetric kernel ridge regression framework and introduces an iterative kernel learning algorithm; a minimal sketch of one possible LAB RBF form appears after this list.
arXiv Detail & Related papers (2023-10-08T17:08:15Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z) - Experimental Design for Linear Functionals in Reproducing Kernel Hilbert
Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Scaling Neural Tangent Kernels via Sketching and Random Features [53.57615759435126]
Recent works report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets.
We design a near input-sparsity time approximation algorithm for NTK, by sketching the expansions of arc-cosine kernels.
We show that a linear regressor trained on our CNTK features matches the accuracy of exact CNTK on the CIFAR-10 dataset while achieving a 150x speedup.
arXiv Detail & Related papers (2021-06-15T04:44:52Z) - The Fast Kernel Transform [21.001203328543006]
We propose the Fast Kernel Transform (FKT), a general algorithm to compute matrix-vector multiplications for datasets in moderate dimensions with quasilinear complexity.
The FKT is easily applied to a broad class of kernels, including Gaussian, Matern, and Rational Quadratic covariance functions and physically motivated Green's functions.
We illustrate the efficacy and versatility of the FKT by providing timing and accuracy benchmarks and by applying it to scale stochastic neighbor embedding (t-SNE) and Gaussian processes to large real-world data sets.
arXiv Detail & Related papers (2021-06-08T16:15:47Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that dimension of the resulting features is much smaller than other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - A Mean-Field Theory for Learning the Schönberg Measure of Radial
Basis Functions [13.503048325896174]
We learn the distribution in the Schönberg integral representation of the radial basis functions from training samples.
We prove that in the scaling limits, the empirical measure of the Langevin particles converges to the law of a reflected Ito diffusion-drift process.
arXiv Detail & Related papers (2020-06-23T21:04:48Z)
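The Locally-Adaptive-Bandwidth RBF entry above admits a simple illustration. The exact parameterization used in that paper is not reproduced here; the sketch below assumes one trainable bandwidth vector attached to each support point, which already yields the asymmetry the entry highlights.

```python
import numpy as np

def lab_rbf(X, C, Theta):
    """Illustrative locally-adaptive-bandwidth RBF kernel (assumed form):
    K[i, j] = exp(-|| Theta[j] * (X[i] - C[j]) ||^2),
    with a separate bandwidth vector Theta[j] for each support point C[j].
    Because the bandwidth is tied to the second argument, K(x, c) != K(c, x)
    in general, i.e. the kernel is asymmetric."""
    diff = X[:, None, :] - C[None, :, :]                      # (n, m, d)
    return np.exp(-np.sum((Theta[None, :, :] * diff) ** 2, axis=-1))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                 # query points
C = rng.normal(size=(4, 3))                 # support points
Theta = rng.uniform(0.5, 2.0, size=(4, 3))  # trainable, data-dependent bandwidths
K = lab_rbf(X, C, Theta)                    # (5, 4) asymmetric kernel matrix
print(K.shape)
```

In an asymmetric kernel ridge regression setting, the bandwidths Theta would be updated iteratively together with the regression coefficients rather than fixed in advance.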
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.