Nystr\"om $M$-Hilbert-Schmidt Independence Criterion
- URL: http://arxiv.org/abs/2302.09930v2
- Date: Mon, 31 Jul 2023 21:20:16 GMT
- Title: Nystr\"om $M$-Hilbert-Schmidt Independence Criterion
- Authors: Florian Kalinke and Zolt\'an Szab\'o
- Abstract summary: Key features that make kernels ubiquitous are (i) the number of domains they have been designed for, (ii) the Hilbert structure of the function class associated with kernels, and (iii) their ability to represent probability distributions without loss of information.
We propose an alternative Nystr\"om-based HSIC estimator which handles the $M\ge 2$ case, prove its consistency, and demonstrate its applicability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernel techniques are among the most popular and powerful approaches in data
science. Among the key features that make kernels ubiquitous are (i) the number
of domains they have been designed for, (ii) the Hilbert structure of the
function class associated with kernels, which facilitates their statistical
analysis, and (iii) their ability to represent probability distributions
without loss of information. These properties give rise to the immense success
of the Hilbert-Schmidt independence criterion (HSIC), which is able to capture
joint independence of random variables under mild conditions and permits
closed-form estimators with quadratic computational complexity (w.r.t. the
sample size). In order to alleviate the quadratic computational bottleneck in
large-scale applications, multiple HSIC approximations have been proposed;
however, these estimators are restricted to $M=2$ random variables, do not
extend naturally to the $M\ge 2$ case, and lack theoretical guarantees. In this
work, we propose an
alternative Nystr\"om-based HSIC estimator which handles the $M\ge 2$ case,
prove its consistency, and demonstrate its applicability in multiple contexts,
including synthetic examples, dependency testing of media annotations, and
causal discovery.
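To make the quadratic bottleneck and the Nystr\"om remedy concrete, here is a minimal NumPy sketch (our illustration, not the authors' code) of the classical biased HSIC estimator for $M=2$, $\widehat{\mathrm{HSIC}} = \frac{1}{n^2}\operatorname{tr}(KHLH)$ with centering matrix $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$, next to a low-rank variant that swaps each Gram matrix for rank-$m$ Nystr\"om features, cutting the cost from $O(n^2)$ to $O(nm^2 + m^3)$. The Gaussian kernel, uniform landmark sampling, and all function names are illustrative assumptions; the paper's $M\ge 2$ estimator is constructed differently.

```python
import numpy as np

def gaussian_gram(X, Z=None, sigma=1.0):
    """Gaussian (RBF) Gram matrix: k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    Z = X if Z is None else Z
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2 * X @ Z.T
    return np.exp(-sq / (2 * sigma**2))

def hsic_biased(X, Y, sigma=1.0):
    """Classical biased HSIC estimator, (1/n^2) tr(KHLH); O(n^2) time/memory."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K = gaussian_gram(X, sigma=sigma)
    L = gaussian_gram(Y, sigma=sigma)
    return np.trace(K @ H @ L @ H) / n**2

def nystrom_features(X, m, sigma=1.0, seed=0):
    """Nystrom feature map Phi = K_nm K_mm^{-1/2}, so Phi @ Phi.T ~ K."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)   # uniform landmarks
    K_nm = gaussian_gram(X, X[idx], sigma)
    K_mm = gaussian_gram(X[idx], sigma=sigma) + 1e-8 * np.eye(m)  # jitter
    w, V = np.linalg.eigh(K_mm)                  # inverse square root of K_mm
    return K_nm @ (V * w**-0.5) @ V.T

def hsic_nystrom(X, Y, m=50, sigma=1.0):
    """Low-rank HSIC: (1/n^2) ||(H Px)^T (H Py)||_F^2, O(n m^2 + m^3) time."""
    n = X.shape[0]
    Px = nystrom_features(X, m, sigma)
    Py = nystrom_features(Y, m, sigma)
    Px = Px - Px.mean(axis=0)                    # column centering = H @ Px
    Py = Py - Py.mean(axis=0)
    return np.linalg.norm(Px.T @ Py, "fro") ** 2 / n**2
```

The low-rank version relies on the identity $\operatorname{tr}(KHLH) = \|(H\Phi_x)^\top (H\Phi_y)\|_F^2$ whenever $K = \Phi_x\Phi_x^\top$ and $L = \Phi_y\Phi_y^\top$, so the $n \times n$ Gram matrices are never formed.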
Related papers
- Nyström Kernel Stein Discrepancy
We propose a Nystr\"om-based KSD acceleration with runtime $\mathcal{O}\left(mn+m^3\right)$ for $n$ samples and $m\ll n$ Nystr\"om points.
We show its $\sqrt{n}$-consistency under a classical sub-Gaussian assumption, and demonstrate its applicability for goodness-of-fit testing on a suite of benchmarks.
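For reference, the squared kernel Stein discrepancy being accelerated has the standard closed form below; this is the textbook definition with score function $s_p = \nabla \log p$ and base kernel $k$, not a formula specific to the cited paper.

```latex
% Squared KSD as a double expectation of the Stein kernel h_p:
\[
  \mathrm{KSD}^2(p, q) = \mathbb{E}_{x, x' \sim q}\big[ h_p(x, x') \big],
\]
\[
  h_p(x, x') = s_p(x)^\top s_p(x')\, k(x, x')
  + s_p(x)^\top \nabla_{x'} k(x, x')
  + s_p(x')^\top \nabla_{x} k(x, x')
  + \operatorname{tr}\!\big( \nabla_{x} \nabla_{x'} k(x, x') \big).
\]
```

The plug-in V-statistic averages $h_p$ over all $n^2$ sample pairs; roughly speaking, anchoring the computation at $m \ll n$ Nystr\"om points is what yields the quoted $\mathcal{O}(mn+m^3)$ runtime.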
arXiv Detail & Related papers (2024-06-12T16:50:12Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - The Minimax Rate of HSIC Estimation for Translation-Invariant Kernels
We prove that the minimax optimal rate of HSIC estimation on $\mathbb{R}^d$ for Borel measures containing the Gaussians, with continuous bounded translation-invariant characteristic kernels, is $\mathcal{O}\!\left(n^{-1/2}\right)$.
arXiv Detail & Related papers (2024-03-12T15:13:21Z) - Computational-Statistical Gaps in Gaussian Single-Index Models
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z) - Kernelized Cumulants: Beyond Kernel Mean Embeddings
We extend cumulants to reproducing kernel Hilbert spaces (RKHS) using tools from tensor algebras.
We argue that going beyond degree one has several advantages and can be achieved with the same computational complexity and minimal overhead.
arXiv Detail & Related papers (2023-01-29T15:31:06Z) - A Permutation-Free Kernel Independence Test
In nonparametric independence testing, we observe i.i.d. data $(X_i,Y_i)_{i=1}^n$, where $X \in \mathcal{X}$ and $Y \in \mathcal{Y}$ lie in general spaces.
Modern test statistics such as the kernel Hilbert-Schmidt Independence Criterion (HSIC) and Distance Covariance (dCov) have intractable null distributions due to the degeneracy of the underlying U-statistics.
This paper provides a simple but nontrivial modification of HSIC so that the resulting statistic has a tractable limiting null distribution, removing the need for permutations.
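For contrast, the permutation baseline that such a modification removes looks roughly like the sketch below (our illustration with hypothetical names; `stat` stands for any HSIC-type statistic). Since every permutation recomputes the statistic, a quadratic-time HSIC costs about $O(B n^2)$ for $B$ permutations, which is exactly the overhead a permutation-free test avoids.

```python
import numpy as np

def permutation_pvalue(X, Y, stat, n_perm=200, seed=0):
    """Monte Carlo permutation p-value for an independence statistic.

    Permuting Y breaks the pairing with X and thus simulates the null;
    each of the n_perm permutations recomputes `stat` from scratch.
    """
    rng = np.random.default_rng(seed)
    t_obs = stat(X, Y)
    null = [stat(X, Y[rng.permutation(len(Y))]) for _ in range(n_perm)]
    # add-one correction keeps the p-value valid for finite n_perm
    return (1 + sum(t >= t_obs for t in null)) / (n_perm + 1)
```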
arXiv Detail & Related papers (2022-12-18T15:28:16Z) - Nonparametric extensions of randomized response for private confidence sets
This work derives methods for performing nonparametric, nonasymptotic statistical inference for population means under the constraint of local differential privacy (LDP).
We present confidence intervals (CI) and time-uniform confidence sequences (CS) for $\mu^\star$ when only given access to the privatized data.
arXiv Detail & Related papers (2022-02-17T16:04:49Z) - Nystr\"om Kernel Mean Embeddings
We propose an efficient approximation procedure based on the Nystr\"om method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
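As an illustration of the idea (a sketch under our own assumptions of a Gaussian kernel and uniform subsampling; not the paper's exact procedure or analysis), the empirical mean embedding $\frac{1}{n}\sum_i k(\cdot, x_i)$ can be projected onto the span of $m$ landmark functions $k(\cdot, z_j)$, after which a quantity like the squared maximum mean discrepancy reduces to $m \times m$ linear algebra:

```python
import numpy as np

def rbf(X, Z, sigma=1.0):
    """Gaussian kernel matrix between the rows of X and Z."""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2 * X @ Z.T
    return np.exp(-sq / (2 * sigma**2))

def nystrom_mmd2(X, Y, m=50, sigma=1.0, seed=0):
    """Approximate MMD^2 between samples X and Y via m shared landmarks.

    Each empirical mean embedding is projected onto
    span{k(., z_1), ..., k(., z_m)}; the squared RKHS distance between
    the two projections is then a small m x m quadratic form.
    """
    rng = np.random.default_rng(seed)
    Z = np.concatenate([X, Y])                       # same feature dimension
    Z = Z[rng.choice(Z.shape[0], size=m, replace=False)]
    K_mm = rbf(Z, Z, sigma) + 1e-8 * np.eye(m)       # jitter for stability
    # projection coefficients: K_mm @ alpha = (1/n) K_mn @ 1
    a = np.linalg.solve(K_mm, rbf(Z, X, sigma).mean(axis=1))
    b = np.linalg.solve(K_mm, rbf(Z, Y, sigma).mean(axis=1))
    d = a - b
    return float(d @ K_mm @ d)   # ||mu_X - mu_Y||^2 within the landmark span
```

The subsample size $m$ controls the trade-off the paper quantifies: its sufficient conditions say how large $m$ must be for the approximation to retain the standard $n^{-1/2}$ rate.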
arXiv Detail & Related papers (2022-01-31T08:26:06Z) - Optimal policy evaluation using kernel-based temporal difference methods
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Nonparametric approximation of conditional expectation operators
We investigate the approximation of the $L^2$-operator defined by $[Pf](x) := \mathbb{E}[f(Y) \mid X = x]$ under minimal assumptions.
We prove that $P$ can be arbitrarily well approximated in operator norm by Hilbert-Schmidt operators acting on a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2020-12-23T19:06:12Z)
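A standard empirical counterpart of such conditional expectation operators (a minimal sketch assuming a Gaussian kernel and ridge regularization; the paper's contribution is the operator-norm approximability result, not this particular estimator) is kernel ridge regression of $f(Y)$ on $X$:

```python
import numpy as np

def rbf(X, Z, sigma=1.0):
    """Gaussian kernel matrix between the rows of X and Z."""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2 * X @ Z.T
    return np.exp(-sq / (2 * sigma**2))

def conditional_expectation(X, Y, f, lam=1e-3, sigma=1.0):
    """Return x -> estimate of [Pf](x) = E[f(Y) | X = x].

    Kernel ridge regression of the targets f(Y) on X: the finite-sample
    surrogate of approximating P by an operator on an RKHS.
    """
    n = X.shape[0]
    K = rbf(X, X, sigma)
    coef = np.linalg.solve(K + n * lam * np.eye(n), f(Y))  # ridge weights
    return lambda x: rbf(np.atleast_2d(x), X, sigma) @ coef
```

For instance, `conditional_expectation(X, Y, lambda y: y[:, 0])` returns a function estimating $x \mapsto \mathbb{E}[Y_1 \mid X = x]$; the regularizer `lam` and bandwidth `sigma` are hypothetical defaults.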
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.