Geometric Interpretation of Running Nystr\"{o}m-Based Kernel Machines and Error Analysis
- URL: http://arxiv.org/abs/2002.08937v2
- Date: Sat, 18 Sep 2021 16:34:52 GMT
- Title: Geometric Interpretation of Running Nystr\"{o}m-Based Kernel Machines and Error Analysis
- Authors: Weida Li, Mingxia Liu, Daoqiang Zhang
- Abstract summary: We develop a new approach with a clear geometric interpretation for running Nystr\"{o}m-based kernel machines.
We show that the other two well-studied approaches can be equivalently transformed into our proposed one.
- Score: 35.01395939823442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the Nystr\"{o}m method has proven its prominence, both
empirically and theoretically, in speeding up the training of kernel machines
while retaining satisfactory performance and accuracy. Several different
approaches have been proposed to exploit the Nystr\"{o}m method for scaling up
kernel machines. However, there is no comparative study of these approaches,
and each has been analyzed individually for specific types of kernel machines.
It therefore remains an open question which approach's philosophy is the more
promising one to extend to other kernel machines. In this work, motivated by
the column-inclusion property of Gram matrices, we develop a new approach with
a clear geometric interpretation for running Nystr\"{o}m-based kernel machines.
We show that the two other well-studied approaches can be equivalently
transformed into our proposed one, so analysis established for the proposed
approach also holds for them. In particular, our proposed approach makes it
possible to derive approximation errors in a general setting. Our analysis
also reveals the relations among the two aforementioned approaches and a
third, naive one. First, the analytical forms of the corresponding approximate
solutions differ in only one term. Second, the naive approach can be
implemented efficiently by sharing the same training procedure with the
others. These analytical results lead to the conjecture that the naive
approach can provide more accurate approximate solutions than the two more
sophisticated ones. Since our analysis also offers ways to compute the
accuracy of these approximate solutions, we run experiments on classification
tasks to confirm our conjecture.
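For orientation, the following is a minimal numpy sketch of the basic Nystr\"{o}m approximation that all of the approaches above build on: the full $n \times n$ Gram matrix is reconstructed from a small set of landmark columns via $K \approx C W^{+} C^{\top}$. This is generic background rather than the paper's proposed approach, and the RBF kernel, uniform landmark sampling, and problem sizes are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of X and Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))

# Nystrom approximation: pick m landmark points and approximate the
# full n x n Gram matrix as K ~= C @ pinv(W) @ C.T, where
# C = K(X, landmarks) and W = K(landmarks, landmarks).
m = 50
landmarks = X[rng.choice(len(X), size=m, replace=False)]
C = rbf_kernel(X, landmarks)           # n x m
W = rbf_kernel(landmarks, landmarks)   # m x m
K_approx = C @ np.linalg.pinv(W) @ C.T

K_exact = rbf_kernel(X, X)
rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {rel_err:.4f}")
```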
Related papers
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems, including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We further consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small-data regime (one standard construction of such priors is sketched below).
arXiv Detail & Related papers (2022-11-03T10:18:17Z)
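For intuition, here is a hedged toy sketch of one standard way to define a Gaussian process prior over the nodes of a single graph, via a heat kernel of the graph Laplacian; the paper's constructions over sets of graphs are more general, and the path graph and diffusion lengthscale below are illustrative assumptions.

```python
import numpy as np

# Toy graph: a 6-node path, encoded by its adjacency matrix A.
n = 6
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# Heat-kernel covariance K = expm(-t * L) from the graph Laplacian L,
# computed through the Laplacian's eigendecomposition.
L = np.diag(A.sum(axis=1)) - A
t = 0.7  # assumed diffusion lengthscale
evals, evecs = np.linalg.eigh(L)
K = evecs @ np.diag(np.exp(-t * evals)) @ evecs.T

# A draw from the resulting GP prior assigns one value per node.
rng = np.random.default_rng(1)
print(np.round(rng.multivariate_normal(np.zeros(n), K), 3))
```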
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limitations is the lack of a clustering-friendly property in the embedding space (the basic pseudo-labeling recipe is sketched below).
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
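A minimal sketch of the clustering-with-pseudo-labels recipe being analyzed (CACTUs itself is considerably more involved; the random embeddings, cluster count, and downstream classifier here are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Z = rng.normal(size=(300, 16))  # stand-in for learned embeddings

# Step 1: cluster the unlabeled embeddings.
pseudo_labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(Z)

# Step 2: treat the cluster assignments as pseudo-labels and train a
# supervised learner on them.
clf = LogisticRegression(max_iter=1000).fit(Z, pseudo_labels)
print("fit accuracy on pseudo-labels:", clf.score(Z, pseudo_labels))
```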
- Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces [96.53463532832939]
We develop techniques for broadening the applicability of Gaussian processes.
We introduce a wide class of efficient approximations built from this viewpoint.
We develop a collection of Gaussian process models over non-Euclidean spaces.
arXiv Detail & Related papers (2022-02-22T01:42:57Z)
- Linear Time Kernel Matrix Approximation via Hyperspherical Harmonics [3.24890820102255]
We propose a new technique for constructing low-rank approximations of matrices that arise in kernel methods for machine learning.
Our approach pairs a novel automatically constructed analytic expansion of the underlying kernel function with a data-dependent compression step to further optimize the approximation.
Experimental results show our approach compares favorably to the commonly used Nystr\"{o}m method with respect to both accuracy for a given rank and computational time for a given accuracy across a variety of kernels, dimensions, and datasets.
arXiv Detail & Related papers (2022-02-08T05:19:39Z)
- From Canonical Correlation Analysis to Self-supervised Graph Neural Networks [99.44881722969046]
We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data.
We optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis.
Our method performs competitively on seven public graph datasets (a toy version of such an objective is sketched below).
arXiv Detail & Related papers (2021-06-23T15:55:47Z)
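As a toy illustration of a CCA-inspired feature-level objective of the kind summarized above: an invariance term pulls two views' embeddings together, while a decorrelation term pushes each view's feature correlation matrix toward the identity. The exact objective and weighting in the paper may differ; `lam` and the random views are illustrative assumptions.

```python
import numpy as np

def cca_style_loss(Z1, Z2, lam=1e-3):
    """Invariance + feature-decorrelation loss on two views' embeddings."""
    n, d = Z1.shape
    # Standardize each feature dimension, as in classical CCA preprocessing.
    Z1 = (Z1 - Z1.mean(0)) / (Z1.std(0) + 1e-8) / np.sqrt(n)
    Z2 = (Z2 - Z2.mean(0)) / (Z2.std(0) + 1e-8) / np.sqrt(n)
    invariance = ((Z1 - Z2) ** 2).sum()
    eye = np.eye(d)
    decorrelation = (((Z1.T @ Z1 - eye) ** 2).sum()
                     + ((Z2.T @ Z2 - eye) ** 2).sum())
    return invariance + lam * decorrelation

rng = np.random.default_rng(0)
Z = rng.normal(size=(256, 32))              # one view's embeddings
Z_aug = Z + 0.1 * rng.normal(size=Z.shape)  # a perturbed second view
print(f"loss: {cca_style_loss(Z, Z_aug):.4f}")
```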
- A Joint introduction to Gaussian Processes and Relevance Vector Machines with Connections to Kalman filtering and other Kernel Smoothers [5.035807711584951]
This article introduces and discusses two methods which straddle the areas of probabilistic Bayesian schemes and kernel methods for regression.
We provide understanding of the mathematical concepts behind these models, and highlight the relationship to other methods.
This is the most in-depth study of its kind to date focused on these two methods, and it will be relevant to theoretical understanding and to practitioners across data science, signal processing, machine learning, and artificial intelligence in general.
arXiv Detail & Related papers (2020-09-19T12:22:41Z) - A Perturbation-Based Kernel Approximation Framework [0.0]
We derive a kernel approximation framework grounded in classical perturbation theory.
We show that our framework gives rise to new kernel approximation schemes that can be tuned to take advantage of the structure of the approximated kernel matrix.
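As background, a small numpy check of the classical first-order result such a framework builds on: for a symmetric matrix $A$ with eigenpairs $(\lambda_i, v_i)$, the eigenvalues of $A + \epsilon B$ satisfy $\lambda_i(A + \epsilon B) \approx \lambda_i + \epsilon\, v_i^\top B v_i$. This is textbook perturbation theory, not the paper's specific scheme; the toy matrices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sym = lambda M: (M + M.T) / 2
A, B = sym(rng.normal(size=(8, 8))), sym(rng.normal(size=(8, 8)))
eps = 1e-3

# First-order prediction: lambda_i(A + eps*B) ~ lambda_i + eps * v_i' B v_i.
evals, evecs = np.linalg.eigh(A)
first_order = evals + eps * np.diag(evecs.T @ B @ evecs)

exact = np.linalg.eigvalsh(A + eps * B)
print("max abs deviation:", np.abs(exact - first_order).max())  # O(eps^2)
```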
arXiv Detail & Related papers (2020-09-07T09:14:18Z)
- Generalized vec trick for fast learning of pairwise kernel models [3.867363075280544]
We present a comprehensive review of pairwise kernels that have been proposed for incorporating prior knowledge about the relationship between the objects.
We show how all the reviewed kernels can be expressed as sums of Kronecker products, allowing the use of the generalized vec trick to speed up their computation (the underlying identity is checked below).
arXiv Detail & Related papers (2020-09-02T13:27:51Z)
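A quick numerical check of the identity behind the vec trick, $(B \otimes A)\,\mathrm{vec}(X) = \mathrm{vec}(A X B^{\top})$ with a column-stacking $\mathrm{vec}$, which lets one multiply by a Kronecker-structured pairwise kernel without ever materializing it (the toy shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))
B = rng.normal(size=(5, 6))
X = rng.normal(size=(3, 6))

def vec(M):
    # Column-stacking vec operator (Fortran order), matching the identity.
    return M.flatten(order="F")

# The left side forms the large Kronecker product explicitly; the right
# side is the cheap equivalent that the vec trick exploits.
lhs = np.kron(B, A) @ vec(X)
rhs = vec(A @ X @ B.T)
print(np.allclose(lhs, rhs))  # True
```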
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable Gaussian-softmax integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods (the binary special case of such an approximation is sketched below).
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
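For the binary special case, a hedged sketch of a classical probit-style mean-field approximation to the Gaussian-sigmoid integral; the paper's formula targets the full softmax and may differ, but this shows the flavor of replacing an intractable expectation with a closed form.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

mu, var = 1.2, 2.0  # illustrative Gaussian moments

# Closed-form mean-field approximation to E[sigmoid(x)], x ~ N(mu, var).
approx = sigmoid(mu / np.sqrt(1.0 + np.pi * var / 8.0))

# Monte Carlo reference for the same expectation.
rng = np.random.default_rng(0)
mc = sigmoid(rng.normal(mu, np.sqrt(var), size=200_000)).mean()
print(f"approx={approx:.4f}  monte_carlo={mc:.4f}")
```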
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.