Scalable and consistent embedding of probability measures into Hilbert spaces via measure quantization
- URL: http://arxiv.org/abs/2502.04907v2
- Date: Tue, 11 Feb 2025 15:17:43 GMT
- Title: Scalable and consistent embedding of probability measures into Hilbert spaces via measure quantization
- Authors: Erell Gachon, Elsa Cazelles, Jérémie Bigot
- Abstract summary: We study two methods based on measure quantization for approximating input probability measures with discrete measures of small support size.
We study the consistency of such approximations, and their implications for scalable embeddings of probability measures into a Hilbert space at a low computational cost.
- Score: 1.6385815610837167
- Abstract: This paper is focused on statistical learning from data that come as probability measures. In this setting, popular approaches consist in embedding such data into a Hilbert space with either Linearized Optimal Transport or Kernel Mean Embedding. However, the cost of computing such embeddings prohibits their direct use in large-scale settings. We study two methods based on measure quantization for approximating input probability measures with discrete measures of small support size. The first one is based on optimal quantization of each input measure, while the second one relies on mean-measure quantization. We study the consistency of such approximations, and their implications for scalable embeddings of probability measures into a Hilbert space at a low computational cost. We finally illustrate our findings with various numerical experiments.
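To make the pipeline concrete, here is a minimal sketch of the per-measure quantization route: each empirical measure is compressed by k-means (used here as a convenient proxy for optimal quantization), and the quantized measures are then compared through their kernel mean embeddings via the MMD. The Gaussian kernel, the support size k = 20, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize(X, k, seed=0):
    """Compress an empirical measure: approximate the point cloud X by k
    centroids, weighted by the fraction of points in each Voronoi cell."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    support = km.cluster_centers_
    weights = np.bincount(km.labels_, minlength=k) / len(X)
    return support, weights

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(sup_a, w_a, sup_b, w_b, sigma=1.0):
    """Squared MMD between two discrete measures, i.e. the squared RKHS
    distance between their kernel mean embeddings."""
    return (w_a @ gaussian_kernel(sup_a, sup_a, sigma) @ w_a
            + w_b @ gaussian_kernel(sup_b, sup_b, sigma) @ w_b
            - 2 * w_a @ gaussian_kernel(sup_a, sup_b, sigma) @ w_b)

rng = np.random.default_rng(0)
P = rng.normal(0.0, 1.0, size=(5000, 2))   # large sample from measure P
Q = rng.normal(0.5, 1.0, size=(5000, 2))   # large sample from measure Q
(sa, wa), (sb, wb) = quantize(P, 20), quantize(Q, 20)
print(mmd2(sa, wa, sb, wb))
```

The point of the compression is visible in the last line: the kernel computations scale in the support size k rather than in the raw sample size n.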
Related papers
- Enhanced observable estimation through classical optimization of informationally over-complete measurement data -- beyond classical shadows [0.0]
We propose a method to optimize the dual POVM operators after the measurements have been carried out.
We show that it can significantly reduce statistical errors relative to canonical duals when estimating multiple observables.
arXiv Detail & Related papers (2024-01-31T18:13:42Z) - Tight conic approximation of testing regions for quantum statistical
models and measurements [5.801621787540268]
We provide an implicit outer approximation of the testing region of any given quantum statistical model or measurement.
We also apply our approximation formulas to characterize the ability to transform one quantum statistical model or measurement into another.
arXiv Detail & Related papers (2023-09-28T04:02:55Z) - High Dimensional Statistical Estimation under One-bit Quantization [27.718986773043643]
One-bit (binary) data are preferable in many applications because of their efficiency in signal storage, processing, and transmission, and because they enhance privacy.
In this paper, we study three fundamental statistical estimation problems.
Under both sub-Gaussian and heavy-tailed regimes, new estimators that handle high-dimensional scaling are proposed.
arXiv Detail & Related papers (2022-02-26T15:13:04Z) - Nystr\"om Kernel Mean Embeddings [92.10208929236826]
- Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
arXiv Detail & Related papers (2022-01-31T08:26:06Z) - Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and to estimating the dimension of the underlying manifold.
Our geometrical method adapts the well-known box-counting algorithm for Minkowski dimension calculation to sparse data.
Experiments on real datasets show that the suggested approach, which combines the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z) - Featurized Density Ratio Estimation [82.40706152910292]
- Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
arXiv Detail & Related papers (2021-07-05T18:30:26Z) - A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z) - Continuous Wasserstein-2 Barycenter Estimation without Minimax
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z) - Direct estimation of quantum coherence by collective measurements [54.97898890263183]
- Direct estimation of quantum coherence by collective measurements [54.97898890263183]
We introduce a collective measurement scheme for estimating the amount of coherence in quantum states.
Our scheme outperforms other estimation methods based on tomography or adaptive measurements.
We show that our method is accessible with today's technology by implementing it experimentally with photons.
arXiv Detail & Related papers (2020-01-06T03:50:42Z)