Tractable Computation of Expected Kernels by Circuits
- URL: http://arxiv.org/abs/2102.10562v1
- Date: Sun, 21 Feb 2021 08:59:06 GMT
- Title: Tractable Computation of Expected Kernels by Circuits
- Authors: Wenzhe Li, Zhe Zeng, Antonio Vergari, Guy Van den Broeck
- Abstract summary: We describe the conditions under which we can compute expected kernels exactly and efficiently.
We then demonstrate possible advancements for kernel embedding frameworks by exploiting tractable expected kernels.
We empirically evaluate both algorithms and show that they outperform standard baselines on a variety of datasets.
- Score: 35.059091080947205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computing the expectation of some kernel function is ubiquitous in machine
learning, from the classical theory of support vector machines, to exploiting
kernel embeddings of distributions in applications spanning probabilistic
modeling, statistical inference, causal discovery, and deep learning. In all
these scenarios, we tend to resort to Monte Carlo estimates as expectations of
kernels are intractable in general. In this work, we characterize the
conditions under which we can compute expected kernels exactly and efficiently,
by leveraging recent advances in probabilistic circuit representations. We
first construct a circuit representation for kernels and propose an approach to
such tractable computation. We then demonstrate possible advancements for
kernel embedding frameworks by exploiting tractable expected kernels to derive
new algorithms for two challenging scenarios: 1) reasoning under missing data
with kernel support vector regressors; 2) devising a collapsed black-box
importance sampling scheme. Finally, we empirically evaluate both algorithms
and show that they outperform standard baselines on a variety of datasets.
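As a concrete illustration of the quantity at stake, here is a minimal, self-contained sketch: it computes the expected kernel E_{x~p, x'~q}[k(x, x')] exactly by brute-force enumeration for toy fully factorized Bernoulli distributions and an RBF kernel (all hypothetical choices; this is not the paper's circuit-based algorithm, which achieves exactness without enumeration), and compares it against the Monte Carlo estimate the abstract mentions.

```python
import numpy as np

# Toy illustration (not the paper's method): exact expected kernel by
# enumeration vs. a Monte Carlo estimate, for two factorized Bernoulli
# distributions p, q over d binary variables and an RBF kernel.

rng = np.random.default_rng(0)
d = 4  # enumeration visits 2^d * 2^d pairs, so keep d small

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

p_probs = rng.uniform(0.2, 0.8, size=d)  # P(x_i = 1) under p
q_probs = rng.uniform(0.2, 0.8, size=d)  # P(x_i = 1) under q

def prob(x, probs):
    # Probability of a binary state under a fully factorized distribution.
    return np.prod(np.where(x == 1, probs, 1 - probs))

# Exact expected kernel: sum over all pairs of binary states.
states = [np.array([(i >> j) & 1 for j in range(d)]) for i in range(2 ** d)]
exact = sum(prob(x, p_probs) * prob(y, q_probs) * rbf(x, y)
            for x in states for y in states)

# Monte Carlo estimate from independent samples of p and q.
n = 10_000
xs = (rng.uniform(size=(n, d)) < p_probs).astype(float)
ys = (rng.uniform(size=(n, d)) < q_probs).astype(float)
mc = np.mean([rbf(x, y) for x, y in zip(xs, ys)])

print(f"exact: {exact:.4f}  monte carlo: {mc:.4f}")
```

On such a small domain enumeration is trivially exact; the paper's contribution is to retain exactness efficiently, without enumeration, when the distributions and the kernel admit compatible probabilistic circuit representations.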
Related papers
- Heating Up Quasi-Monte Carlo Graph Random Features: A Diffusion Kernel Perspective [0.0]
We build upon a recently introduced class of quasi-graph random features (q-GRFs).
We find that the Diffusion kernel performs most similarly to the 2-regularized Laplacian.
We explore graph types that benefit from the previously established antithetic termination procedure.
arXiv Detail & Related papers (2024-10-10T21:51:31Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Gaussian Processes on Distributions based on Regularized Optimal Transport [2.905751301655124]
We present a novel kernel over the space of probability measures based on the dual formulation of regularized optimal transport.
We prove that this construction makes it possible to obtain a valid kernel by using Hilbert norms.
We provide theoretical guarantees on the behaviour of a Gaussian process based on this kernel.
arXiv Detail & Related papers (2022-10-12T20:30:23Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefit of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Meta-Learning Hypothesis Spaces for Sequential Decision-making [79.73213540203389]
We propose to meta-learn a kernel from offline data (Meta-KeL).
Under mild conditions, we guarantee that our estimated RKHS yields valid confidence sets.
We also empirically evaluate the effectiveness of our approach on a Bayesian optimization task.
arXiv Detail & Related papers (2022-02-01T17:46:51Z)
- Kernel Mean Estimation by Marginalized Corrupted Distributions [96.9272743070371]
Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms.
We present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates the kernel mean under a corrupted distribution.
arXiv Detail & Related papers (2021-07-10T15:11:28Z)
- Entangled Kernels -- Beyond Separability [10.381276986079865]
We consider the problem of operator-valued kernel learning and investigate the possibility of going beyond the well-known separable kernels.
We propose a new view on operator-valued kernels and define a general family of kernels that encompasses previously known operator-valued kernels.
Within this framework, we introduce another novel class of operator-valued kernels called entangled kernels that are not separable.
arXiv Detail & Related papers (2021-01-14T09:18:02Z)
- Learning Compositional Sparse Gaussian Processes with a Shrinkage Prior [26.52863547394537]
We present a novel probabilistic algorithm that learns a kernel composition by handling sparsity in the kernel selection with a horseshoe prior.
Our model captures characteristics of time series with significant reductions in computational time and achieves competitive regression performance on real-world data sets.
arXiv Detail & Related papers (2020-12-21T13:41:15Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Generalized vec trick for fast learning of pairwise kernel models [3.867363075280544]
We present a comprehensive review of pairwise kernels that have been proposed for incorporating prior knowledge about the relationship between the objects.
We show how all the reviewed kernels can be expressed as sums of Kronecker products, allowing the use of the generalized vec trick for speeding up their computation.
arXiv Detail & Related papers (2020-09-02T13:27:51Z)
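Since the last entry turns on a concrete algebraic identity, here is a minimal NumPy sketch of the classical vec trick, (B ⊗ A) vec(X) = vec(A X B^T) with column-major vec, which that paper generalizes to the sums of Kronecker products arising from pairwise kernels (matrix names and sizes below are illustrative only).

```python
import numpy as np

# Classical vec trick: (B ⊗ A) vec(X) == vec(A X B^T), which avoids ever
# materializing the Kronecker product. Sizes are arbitrary examples.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 40))   # m x n
B = rng.standard_normal((60, 30))   # p x q
X = rng.standard_normal((40, 30))   # n x q

vec = lambda M: M.reshape(-1, order="F")  # column-major vectorization

naive = np.kron(B, A) @ vec(X)  # forms a (p*m) x (q*n) matrix explicitly
fast = vec(A @ X @ B.T)         # same result in O(mnq + mpq) time

assert np.allclose(naive, fast)
```

For kernel matrices that decompose this way, the right-hand side replaces one huge matrix-vector product with two ordinary matrix products, which is the source of the speed-up the entry describes.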