Kernelized Cumulants: Beyond Kernel Mean Embeddings
- URL: http://arxiv.org/abs/2301.12466v2
- Date: Sun, 29 Oct 2023 09:05:52 GMT
- Title: Kernelized Cumulants: Beyond Kernel Mean Embeddings
- Authors: Patric Bonnier, Harald Oberhauser, Zoltán Szabó
- Abstract summary: We extend cumulants to reproducing kernel Hilbert spaces (RKHS) using tools from tensor algebras.
We argue that going beyond degree one has several advantages and can be achieved with the same computational complexity and minimal overhead.
- Score: 11.448622437140022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In $\mathbb R^d$, it is well-known that cumulants provide an alternative to
moments that can achieve the same goals with numerous benefits such as lower
variance estimators. In this paper we extend cumulants to reproducing kernel
Hilbert spaces (RKHS) using tools from tensor algebras and show that they are
computationally tractable by a kernel trick. These kernelized cumulants provide
a new set of all-purpose statistics; the classical maximum mean discrepancy and
Hilbert-Schmidt independence criterion arise as the degree one objects in our
general construction. We argue both theoretically and empirically (on
synthetic, environmental, and traffic data analysis) that going beyond degree
one has several advantages and can be achieved with the same computational
complexity and minimal overhead in our experiments.
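To make the degree-one objects concrete, here is a minimal sketch (not the authors' code) of the biased maximum mean discrepancy estimator that the kernelized cumulants generalize; the Gaussian kernel and all helper names are illustrative choices.

```python
# A minimal sketch of the biased MMD^2 estimator, the degree-one statistic
# that kernelized cumulants generalize. Kernel choice is illustrative.
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gram matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    """Biased (V-statistic) estimate of MMD^2 between samples X and Y."""
    Kxx = gaussian_kernel(X, X, sigma)
    Kyy = gaussian_kernel(Y, Y, sigma)
    Kxy = gaussian_kernel(X, Y, sigma)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 3))
Y = rng.normal(0.5, 1.0, size=(200, 3))
print(mmd2_biased(X, Y))  # larger when the two distributions differ
```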
Related papers
- On the Consistency of Kernel Methods with Dependent Observations [5.467140383171385]
We propose a new notion of empirical weak convergence (EWC) that explains the consistency of kernel methods under dependent observations.
EWC assumes the existence of a random data distribution and is a strict weakening of previous assumptions in the field.
Our results open new classes of processes to statistical learning and can serve as a foundation for a theory of learning beyond i.i.d. and mixing.
arXiv Detail & Related papers (2024-06-10T08:35:01Z)
- Tensor cumulants for statistical inference on invariant distributions [49.80012009682584]
We show that PCA becomes computationally hard at a critical value of the signal's magnitude.
We define a new set of objects, tensor cumulants, which provide an explicit, near-orthogonal basis for invariants of a given degree.
It also lets us analyze a new problem of distinguishing between different ensembles.
arXiv Detail & Related papers (2024-04-29T14:33:24Z)
- The Minimax Rate of HSIC Estimation for Translation-Invariant Kernels [0.0]
We prove that the minimax optimal rate of HSIC estimation on $\mathbb{R}^d$ for Borel measures containing the Gaussians with continuous bounded translation-invariant characteristic kernels is $\mathcal{O}\!\left(n^{-1/2}\right)$.
arXiv Detail & Related papers (2024-03-12T15:13:21Z)
- Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting [64.0722630873758]
We consider a rather general and broad class of Markov chains, Ito chains, that resemble the Euler-Maruyama discretization of some Stochastic Differential Equation.
We prove a bound in the $W_2$-distance between the laws of our Ito chain and the corresponding differential equation.
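As a reference point for the discretization the summary mentions, below is a minimal sketch of the standard Euler-Maruyama scheme; the drift and diffusion here (an Ornstein-Uhlenbeck process) are illustrative choices, not taken from the paper.

```python
# A minimal sketch of the Euler-Maruyama discretization that Ito chains
# resemble; b and sigma below are illustrative stand-ins.
import numpy as np

def euler_maruyama(b, sigma, x0, dt, n_steps, rng):
    """Simulate dX_t = b(X_t) dt + sigma(X_t) dW_t on a uniform grid."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))        # Brownian increment
        x[k + 1] = x[k] + b(x[k]) * dt + sigma(x[k]) * dw
    return x

rng = np.random.default_rng(0)
path = euler_maruyama(b=lambda x: -x, sigma=lambda x: 0.5,
                      x0=1.0, dt=1e-2, n_steps=1000, rng=rng)
```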
arXiv Detail & Related papers (2023-10-09T18:38:56Z)
- Higher-order topological kernels via quantum computation [68.8204255655161]
Topological data analysis (TDA) has emerged as a powerful tool for extracting meaningful insights from complex data.
We propose a quantum approach to defining Betti kernels, which is based on constructing Betti curves with increasing order.
arXiv Detail & Related papers (2023-07-14T14:48:52Z)
- Nyström $M$-Hilbert-Schmidt Independence Criterion [0.0]
Key features that make kernels ubiquitous are (i) the number of domains they have been designed for, (ii) the Hilbert structure of the function class associated to kernels, and (iii) their ability to represent probability distributions without loss of information.
We propose an alternative Nyström-based HSIC estimator which handles the $M \ge 2$ case, prove its consistency, and demonstrate its applicability.
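For context, here is a minimal sketch of the classical biased HSIC estimator for the $M = 2$ case that the proposed Nyström estimator accelerates; the kernel and helper names are illustrative, and the Nyström machinery itself is omitted.

```python
# A minimal sketch of the classical biased HSIC estimator (M = 2 case).
import numpy as np

def gaussian_gram(X, sigma=1.0):
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-sq / (2 * sigma**2))

def hsic_biased(X, Y, sigma=1.0):
    """Biased HSIC estimate: trace(K H L H) / n^2 with centering matrix H."""
    n = X.shape[0]
    K, L = gaussian_gram(X, sigma), gaussian_gram(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
Y = X[:, :1] + 0.1 * rng.normal(size=(300, 1))  # Y depends on X
print(hsic_biased(X, Y))  # near zero under independence
```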
arXiv Detail & Related papers (2023-02-20T11:51:58Z)
- Interpolation with the polynomial kernels [5.8720142291102135]
Polynomial kernels are widely used in machine learning, and they are one of the default choices for developing kernel-based regression models.
However, they are rarely considered in numerical analysis due to their lack of strict positive definiteness.
This paper is devoted to establishing some initial results for the study of these kernels and their related algorithms.
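A minimal sketch of what kernel interpolation with a polynomial kernel looks like: since polynomial kernels are not strictly positive definite, the Gram matrix can be singular, so the sketch falls back to a least-squares solve. All parameter values are illustrative.

```python
# A minimal sketch of interpolation with a polynomial kernel
# k(x, y) = (x . y + c)^d; lstsq handles the possibly singular Gram matrix.
import numpy as np

def poly_kernel(X, Y, c=1.0, d=3):
    return (X @ Y.T + c) ** d

def fit_interpolant(X, y, c=1.0, d=3):
    K = poly_kernel(X, X, c, d)
    alpha, *_ = np.linalg.lstsq(K, y, rcond=None)  # robust to singular K
    return lambda Xnew: poly_kernel(Xnew, X, c, d) @ alpha

X = np.linspace(-1, 1, 8)[:, None]
y = np.sin(np.pi * X[:, 0])
f = fit_interpolant(X, y)
print(np.max(np.abs(f(X) - y)))  # least-squares residual on the nodes
```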
arXiv Detail & Related papers (2022-12-15T08:30:23Z)
- Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
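A minimal sketch of the underlying idea, assuming the standard Nyström construction: project the empirical mean embedding $(1/n)\sum_i k(x_i, \cdot)$ onto the span of $m \ll n$ landmark features. Kernel, sizes, and names are illustrative.

```python
# A minimal sketch of a Nystrom approximation of the empirical kernel
# mean embedding via projection onto m subsampled landmarks.
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
Z = X[rng.choice(len(X), size=100, replace=False)]   # m << n landmarks

K_mm = gaussian_kernel(Z, Z)                 # m x m landmark Gram matrix
K_mn_mean = gaussian_kernel(Z, X).mean(axis=1)
beta = np.linalg.pinv(K_mm) @ K_mn_mean      # weights of the Nystrom KME

# Compare the approximate embedding with the exact one at test points.
T = rng.normal(size=(5, 2))
approx = gaussian_kernel(T, Z) @ beta
exact = gaussian_kernel(T, X).mean(axis=1)
print(np.max(np.abs(approx - exact)))
```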
arXiv Detail & Related papers (2022-01-31T08:26:06Z)
- Revisiting Memory Efficient Kernel Approximation: An Indefinite Learning Perspective [0.8594140167290097]
Matrix approximations are a key element in large-scale machine learning approaches.
We extend MEKA (Memory Efficient Kernel Approximation) to be applicable not only to shift-invariant kernels but also to non-stationary kernels.
We present a Lanczos-based estimation of a spectrum shift to develop a stable positive semi-definite MEKA approximation.
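A minimal sketch in the spirit of that step, assuming a standard Lanczos eigensolver: estimate the smallest eigenvalue of an indefinite matrix and shift the diagonal to restore positive semi-definiteness. The matrix below is a synthetic stand-in, not a MEKA factorization.

```python
# A minimal sketch of a Lanczos-estimated spectrum shift: make an
# indefinite symmetric matrix PSD by shifting out its smallest eigenvalue.
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(0)
A = rng.normal(size=(500, 500))
K = (A + A.T) / 2                              # symmetric but indefinite

# Lanczos iteration for the smallest algebraic eigenvalue.
lam_min = eigsh(K, k=1, which='SA', return_eigenvectors=False)[0]
if lam_min < 0:
    K_psd = K + (-lam_min) * np.eye(K.shape[0])  # shifted, now PSD
```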
arXiv Detail & Related papers (2021-12-18T10:01:34Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
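A minimal sketch of evaluating an (unnormalized) kernel sum-of-squares density $p(x) = \sum_{ij} A_{ij}\, k(x, x_i)\, k(x, x_j)$ with $A \succeq 0$, the model class the summary refers to; the anchor points, $A$, and the kernel are illustrative, and normalization is omitted.

```python
# A minimal sketch of an unnormalized kernel sum-of-squares density;
# A = B B^T is PSD by construction, so the density is nonnegative.
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(20, 1))     # points x_i defining the model
B = rng.normal(size=(20, 20))
A = B @ B.T                            # positive semi-definite coefficients

def sos_density(x):
    kx = gaussian_kernel(x, anchors)   # k(x, x_i) for each query point
    return np.einsum('ni,ij,nj->n', kx, A, kx)  # nonnegative since A is PSD

print(sos_density(np.linspace(-2, 2, 5)[:, None]))
```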
arXiv Detail & Related papers (2021-06-18T08:33:45Z)