Related papers: Kernel Embeddings and the Separation of Measure Phenomenon

URL: http://arxiv.org/abs/2505.04613v2
Date: Mon, 15 Sep 2025 09:35:15 GMT
Title: Kernel Embeddings and the Separation of Measure Phenomenon
Authors: Leonardo V. Santoro, Kartik G. Waghmare, Victor M. Panaretos
Abstract summary: We prove that kernel covariance embeddings lead to information-theoretically perfect separation of distinct probability distributions. This phenomenon appears to be a blessing of infinite dimensionality, by means of embedding, with the potential to inform the design of efficient inference tools.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We prove that kernel covariance embeddings lead to information-theoretically perfect separation of distinct probability distributions. In statistical terms, we establish that testing for the equality of two probability measures on a compact and separable metric space is equivalent to testing for the singularity between two centered Gaussian measures on a reproducing kernel Hilbert space. The corresponding Gaussians are defined via the notion of kernel covariance embedding of a probability measure, and the Hilbert space is that generated by the embedding kernel. Distinguishing singular Gaussians is fundamentally simpler from an information-theoretic perspective than non-parametric two-sample testing, particularly in complex or high-dimensional domains. This is because singular Gaussians are supported on essentially separate and affine subspaces. Our proof leverages the classical Feldman-Hajek dichotomy, and shows that even a small perturbation of a distribution will be maximally magnified through its Gaussian embedding. This "separation of measure phenomenon" appears to be a blessing of infinite dimensionality, by means of embedding, with the potential to inform the design of efficient inference tools in considerable generality. The elicitation of this phenomenon also appears to crystallize, in a precise and simple mathematical statement, the outstanding empirical effectiveness of the so-called "kernel trick".
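Illustration: the abstract's central object, the kernel covariance embedding, can be approximated from a finite sample. The sketch below is not the authors' code. It assumes the embedding is the uncentered second-moment operator C_P = E[k(X, .) (x) k(X, .)], uses a Gaussian kernel, and relies on the standard fact that the nonzero eigenvalues of the empirical operator equal those of the Gram matrix divided by n. The function names, the bandwidth, and the two sampled distributions are hypothetical choices made for this example. A finite sample cannot exhibit mutual singularity itself; the code only shows how the embedded covariance spectra of two nearby distributions can be computed and compared.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth=1.0):
    """Gram matrix with entries exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def covariance_embedding_spectrum(x, bandwidth=1.0):
    """Eigenvalues of the empirical operator (1/n) sum_i k(x_i, .) (x) k(x_i, .).

    Assumption: the embedding is the uncentered second-moment operator.
    Its nonzero eigenvalues coincide with the eigenvalues of K / n.
    """
    n = x.shape[0]
    gram = gaussian_gram(x, x, bandwidth)
    eigvals = np.linalg.eigvalsh(gram / n)
    # Clip round-off negatives and sort in descending order.
    return np.sort(np.clip(eigvals, 0.0, None))[::-1]


rng = np.random.default_rng(0)
sample_p = rng.normal(0.0, 1.0, size=(200, 1))  # hypothetical sample from P
sample_q = rng.normal(0.0, 1.5, size=(200, 1))  # hypothetical sample from Q

spec_p = covariance_embedding_spectrum(sample_p)
spec_q = covariance_embedding_spectrum(sample_q)

# Compare the leading eigenvalues of the two embedded covariance operators.
print(spec_p[:5])
print(spec_q[:5])
```

Because the Gaussian kernel satisfies k(x, x) = 1, each spectrum sums to 1 up to round-off, so the two spectra can be compared on a common scale.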

Related papers

Likelihood Ratio Tests by Kernel Gaussian Embedding [0.0]
We propose a novel kernel-based nonparametric two-sample test, employing the combined use of kernel mean and kernel covariance embedding. Our test builds on recent results showing how such combined embeddings map distinct probability measures to mutually singular Gaussian measures on the kernel's RKHS. We construct a test statistic based on the relative entropy between the Gaussian embeddings, in effect the likelihood ratio.
arXiv Detail & Related papers (2025-08-11T13:41:38Z)
GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction [55.60972844777044]
3D semantic occupancy prediction is an important task for robust vision-centric autonomous driving. Most existing methods leverage dense grid-based scene representations, overlooking the spatial sparsity of the driving scenes. We propose a probabilistic Gaussian superposition model which interprets each Gaussian as a probability distribution of its neighborhood being occupied.
arXiv Detail & Related papers (2024-12-05T17:59:58Z)
Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks. In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $\mu_\phi$ based on the discretized Markov process are considered. Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
arXiv Detail & Related papers (2023-12-22T18:01:11Z)
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces [43.877478563933316]
Invariance to symmetries is one of the most fundamental forms of prior information one can consider. In this work, we develop constructive and practical techniques for building stationary Gaussian processes on a very large class of non-Euclidean spaces.
arXiv Detail & Related papers (2023-01-30T17:27:12Z)
Dilute neutron star matter from neural-network quantum states [58.720142291102135]
Low-density neutron matter is characterized by the formation of Cooper pairs and the onset of superfluidity. We model this density regime by capitalizing on the expressivity of the hidden-nucleon neural-network quantum states combined with variational Monte Carlo and reconfiguration techniques.
arXiv Detail & Related papers (2022-12-08T17:55:25Z)
Gaussian Processes on Distributions based on Regularized Optimal Transport [2.905751301655124]
We present a novel kernel over the space of probability measures based on the dual formulation of optimal regularized transport. We prove that this construction yields a valid kernel by using Hilbert norms. We provide theoretical guarantees on the behaviour of a Gaussian process based on this kernel.
arXiv Detail & Related papers (2022-10-12T20:30:23Z)
Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z)
Page curves and typical entanglement in linear optics [0.0]
We study entanglement within a set of squeezed modes that have been evolved by a random linear optical unitary. We prove various results on the typicality of entanglement as measured by the Rényi-2 entropy. Our main results make use of a symmetry property obeyed by the average and the variance of the entropy that dramatically simplifies the averaging over unitaries.
arXiv Detail & Related papers (2022-09-14T18:00:03Z)
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case [43.877478563933316]
Invariance to symmetries is one of the most fundamental forms of prior information one can consider. In this work, we develop constructive and practical techniques for building stationary Gaussian processes on a very large class of non-Euclidean spaces.
arXiv Detail & Related papers (2022-08-31T16:40:40Z)
Kullback-Leibler and Renyi divergences in reproducing kernel Hilbert space and Gaussian process settings [0.0]
We present formulations for regularized Kullback-Leibler and Rényi divergences via the Alpha Log-Determinant (Log-Det) divergences. For characteristic kernels, the first setting leads to divergences between arbitrary Borel probability measures on a complete, separable metric space. We show that the Alpha Log-Det divergences are continuous in the Hilbert-Schmidt norm, which enables us to apply laws of large numbers for Hilbert space-valued random variables.
arXiv Detail & Related papers (2022-07-18T06:40:46Z)
On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that a phenomenon can be precisely characterized in the context of kernel methods. We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
Pure Exploration in Kernel and Neural Bandits [90.23165420559664]
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms. To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space.
arXiv Detail & Related papers (2021-06-22T19:51:59Z)
A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space. We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense. We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree. In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
Convergence and finite sample approximations of entropic regularized Wasserstein distances in Gaussian and RKHS settings [0.0]
We study the convergence and finite sample approximations of entropic regularized Wasserstein distances in the Hilbert space setting. For Gaussian measures on an infinite-dimensional Hilbert space, convergence in the 2-Sinkhorn divergence is weaker than convergence in the exact 2-Wasserstein distance.
arXiv Detail & Related papers (2021-01-05T09:46:58Z)
Entanglement and Complexity of Purification in (1+1)-dimensional free Conformal Field Theories [55.53519491066413]
We find pure states in an enlarged Hilbert space that encode the mixed state of a quantum field theory as a partial trace. We analyze these quantities for two intervals in the vacuum of free bosonic and Ising conformal field theories.
arXiv Detail & Related papers (2020-09-24T18:00:13Z)
Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function. We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z)
Schoenberg-Rao distances: Entropy-based and geometry-aware statistical Hilbert distances [12.729120803225065]
We study a class of statistical Hilbert distances that we term the Schoenberg-Rao distances. We derive novel closed-form distances between mixtures of Gaussian distributions. Our method constitutes a practical alternative to Wasserstein distances and we illustrate its efficiency on a broad range of machine learning tasks.
arXiv Detail & Related papers (2020-02-19T18:48:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.