Towards a Unified Quadrature Framework for Large-Scale Kernel Machines
- URL: http://arxiv.org/abs/2011.01668v2
- Date: Thu, 10 Jun 2021 19:29:22 GMT
- Title: Towards a Unified Quadrature Framework for Large-Scale Kernel Machines
- Authors: Fanghui Liu, Xiaolin Huang, Yudong Chen, and Johan A.K. Suykens
- Abstract summary: We develop a quadrature framework for large-scale kernel machines via a numerical integration representation.
We leverage fully symmetric interpolatory rules to efficiently compute quadrature nodes and associated weights for kernel approximation.
- Score: 35.32894170512829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a quadrature framework for large-scale kernel
machines via a numerical integration representation. Considering that the
integration domain and measure of typical kernels, e.g., Gaussian and
arc-cosine kernels, are fully symmetric, we leverage deterministic fully
symmetric interpolatory rules to efficiently compute quadrature nodes and
associated weights for kernel approximation. The developed interpolatory rules
reduce the number of required nodes while retaining high approximation
accuracy. Further, we randomize these deterministic rules via classical
Monte Carlo sampling and control variates, which brings two merits: 1) the
proposed stochastic rules allow the dimension of the feature mapping to vary
flexibly, so that the discrepancy between the original and approximate kernels
can be controlled by tuning this dimension; 2) the stochastic rules are
unbiased and achieve variance reduction with a fast convergence rate. In
addition, we elucidate the relationship between our deterministic/stochastic
interpolatory rules and existing quadrature rules for kernel approximation,
including sparse-grid quadrature and stochastic spherical-radial rules,
thereby unifying these methods under our framework. Experimental results on
several benchmark datasets show that our methods compare favorably with other
representative kernel-approximation-based methods.
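To make the quadrature view concrete, here is a minimal sketch, assuming the
Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2) and a plain tensor-product
Gauss-Hermite rule rather than the paper's fully symmetric interpolatory rule;
the names gauss_hermite_feature_map and n_points are illustrative. It exploits
the integral representation k(x, y) = E_{w ~ N(0, I_d)}[cos(w^T (x - y))] and
replaces the expectation with explicit quadrature nodes and weights, which
yields a finite-dimensional feature map.

```python
# Sketch only: a tensor-product Gauss-Hermite quadrature feature map for the
# Gaussian kernel, not the paper's fully symmetric interpolatory rule.
import numpy as np
from itertools import product

def gauss_hermite_feature_map(X, n_points=5):
    """Rows of X -> features phi(x) with phi(x) @ phi(y) ~ k(x, y)."""
    d = X.shape[1]
    t, w = np.polynomial.hermite.hermgauss(n_points)  # rule for weight e^{-t^2}
    t = np.sqrt(2.0) * t          # rescale nodes to the N(0, 1) measure
    w = w / np.sqrt(np.pi)        # rescaled weights now sum to 1
    # Tensor-product nodes u_j in R^d with weights w_j = product over dims.
    nodes = np.array(list(product(t, repeat=d)))       # (n_points^d, d)
    weights = np.prod(np.array(list(product(w, repeat=d))), axis=1)
    proj = X @ nodes.T
    sw = np.sqrt(weights)
    # cos/sin pairs give phi(x) @ phi(y) = sum_j w_j cos(u_j^T (x - y)).
    return np.hstack([sw * np.cos(proj), sw * np.sin(proj)])

# Usage: compare the quadrature approximation against the exact kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))
Phi = gauss_hermite_feature_map(X)
K_exact = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
print(np.abs(Phi @ Phi.T - K_exact).max())  # small quadrature error
```

Note that the tensor-product rule needs n_points^d nodes, exponential in the
input dimension d; the fully symmetric interpolatory and sparse-grid rules
unified by the paper are designed precisely to cut this node count while
keeping the approximation accuracy high.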
Related papers
- Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning.
We find couplings to improve RFs defined on both Euclidean and discrete input spaces.
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
arXiv Detail & Related papers (2024-05-26T12:25:09Z)
- Simplex Random Features [53.97976744884616]
We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels.
We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels.
We show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.
arXiv Detail & Related papers (2023-01-31T18:53:39Z)
- Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P.
In this article, we derive new sufficient and necessary conditions to ensure (i) and (ii).
For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z)
- Random Fourier Features for Asymmetric Kernels [24.20121243104385]
We introduce a complex measure with the real and imaginary parts corresponding to four finite positive measures, which expands the application scope of the Bochner theorem.
This framework allows for handling classical symmetric, PD kernels via one positive measure; symmetric, non-positive definite kernels via signed measures; and asymmetric kernels via complex measures.
Our AsK-RFFs method is empirically validated on several typical large-scale datasets and achieves promising kernel approximation performance (a sketch of the classical random Fourier feature construction that AsK-RFFs extends appears after this list).
arXiv Detail & Related papers (2022-09-18T03:39:18Z)
- Self-supervised learning with rotation-invariant kernels [4.059849656394191]
We propose a general kernel framework to design a generic regularization loss that encourages the embedding distribution to be close to the uniform distribution on the hypersphere.
Our framework uses rotation-invariant kernels defined on the hypersphere, also known as dot-product kernels.
Our experiments demonstrate that using a truncated rotation-invariant kernel provides competitive results compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-07-28T08:06:24Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high-fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Towards Unbiased Random Features with Lower Variance For Stationary Indefinite Kernels [26.57122949130266]
Our algorithm achieves lower variance and approximation error than existing kernel approximation methods.
With a better approximation of the originally selected kernels, improved classification accuracy and regression performance are obtained.
arXiv Detail & Related papers (2021-04-13T13:56:50Z)
- A Dynamical Mean-Field Theory for Learning in Restricted Boltzmann Machines [2.8021833233819486]
We define a message-passing algorithm for computing magnetizations in Boltzmann machines.
We prove the global convergence of the algorithm under a stability criterion and compute convergence rates showing excellent agreement with numerical simulations.
arXiv Detail & Related papers (2020-05-04T15:19:31Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
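As referenced in the AsK-RFF entry above, here is a minimal sketch of the
classical random Fourier feature construction (Rahimi & Recht) for the
symmetric, positive definite Gaussian kernel; the complex-measure extension to
asymmetric kernels is the contribution of that paper and is not reproduced
here. Names such as rff_feature_map and n_features are illustrative.

```python
# Sketch only: classical random Fourier features for the Gaussian kernel
# k(x, y) = exp(-||x - y||^2 / 2), the symmetric PD case that AsK-RFFs extends.
import numpy as np

def rff_feature_map(X, n_features=512, seed=None):
    """Monte Carlo feature map: E[phi(x) @ phi(y)] = k(x, y) (unbiased)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features))         # frequencies w ~ N(0, I_d)
    b = rng.uniform(0.0, 2.0 * np.pi, n_features)    # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: the error shrinks at the Monte Carlo rate O(1 / sqrt(n_features)).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Phi = rff_feature_map(X, n_features=4096, seed=1)
K_exact = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
print(np.abs(Phi @ Phi.T - K_exact).max())
```

The main paper's stochastic rules combine Monte Carlo sampling with control
variates, which is one standard route to reducing the variance of estimators
of this kind.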