When fractional quasi p-norms concentrate
- URL: http://arxiv.org/abs/2505.19635v1
- Date: Mon, 26 May 2025 07:53:51 GMT
- Title: When fractional quasi p-norms concentrate
- Authors: Ivan Y. Tyukin, Bogdan Grechuk, Evgeny M. Mirkes, Alexander N. Gorban
- Abstract summary: Concentration of distances in high dimension is an important factor for the development and design of stable and reliable data analysis algorithms. We identify conditions when fractional quasi $p$-norms concentrate and when they don't.
- Score: 44.83880683386964
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concentration of distances in high dimension is an important factor for the development and design of stable and reliable data analysis algorithms. In this paper, we address the fundamental long-standing question about the concentration of distances in high dimension for fractional quasi $p$-norms, $p\in(0,1)$. The topic has been at the centre of various theoretical and empirical controversies. Here we, for the first time, identify conditions when fractional quasi $p$-norms concentrate and when they don't. We show that contrary to some earlier suggestions, for broad classes of distributions, fractional quasi $p$-norms admit exponential and uniform in $p$ concentration bounds. For these distributions, the results effectively rule out previously proposed approaches to alleviate concentration by "optimally" setting the value of $p$ in $(0,1)$. At the same time, we specify conditions and the corresponding families of distributions for which one can still control concentration rates by appropriate choices of $p$. We also show that in an arbitrarily small vicinity of a distribution from a large class of distributions for which uniform concentration occurs, there are uncountably many other distributions featuring anti-concentration properties. Importantly, this behavior enables devising relevant data encoding or representation schemes favouring or discouraging distance concentration. The results shed new light on this long-standing problem and resolve the tension between the theory and the empirical evidence reported in the literature.
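The phenomenon at stake is easy to probe numerically. Below is a minimal sketch (our illustration, not the authors' method; the uniform distribution, sample sizes, and values of $p$ are arbitrary choices) that estimates the relative contrast $(D_{\max}-D_{\min})/D_{\min}$ of quasi $p$-norm distances as the dimension grows. For i.i.d. coordinates, which the paper's uniform concentration bounds cover, the contrast shrinks with dimension for every $p$, including the fractional range $p\in(0,1)$:

```python
# A minimal numerical sketch (our illustration, not the paper's proofs): the
# uniform distribution, sample sizes, and dimensions below are arbitrary
# choices. For i.i.d. coordinates the relative contrast shrinks with the
# dimension for every p, including the fractional range p in (0, 1).
import numpy as np

rng = np.random.default_rng(0)

def quasi_p_dist(X, q, p):
    """Quasi p-norm distance ||x - q||_p = (sum_i |x_i - q_i|^p)^(1/p)."""
    return np.sum(np.abs(X - q) ** p, axis=1) ** (1.0 / p)

def relative_contrast(d, p, n=1000):
    """(D_max - D_min) / D_min for n uniform points against a uniform query."""
    X = rng.uniform(size=(n, d))
    q = rng.uniform(size=d)
    D = quasi_p_dist(X, q, p)
    return (D.max() - D.min()) / D.min()

for p in (0.25, 0.5, 1.0, 2.0):
    row = "  ".join(f"d={d}: {relative_contrast(d, p):.3f}" for d in (10, 100, 1000))
    print(f"p={p:<4} {row}")
```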
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Flow matching achieves almost minimax optimal convergence [50.38891696297888]
Flow matching (FM) has gained significant attention as a simulation-free generative model.
This paper discusses the convergence properties of FM for large sample size under the $p$-Wasserstein distance.
We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models.
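For reference, the metric in which this rate is stated is the standard $p$-Wasserstein distance between distributions $\mu$ and $\nu$:

$$ W_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int \|x - y\|^p \, \mathrm{d}\pi(x, y) \right)^{1/p}, $$

where $\Pi(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$.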
arXiv Detail & Related papers (2024-05-31T14:54:51Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that the efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
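A minimal sketch of this tuning idea (a generic random-walk illustration under our own assumptions, not the paper's locally balanced proposals; the target rate below is a placeholder): adapt the proposal's neighborhood half-width by stochastic approximation so that the empirical acceptance rate matches a prescribed target.

```python
# A generic illustration (our assumption, not the paper's locally balanced
# construction): symmetric random-walk Metropolis-Hastings on the integers,
# with the neighborhood half-width adapted so the acceptance rate matches a
# target -- the discrete analogue of step-size control in continuous spaces.
import numpy as np

rng = np.random.default_rng(1)
log_pi = lambda x: -0.5 * (x / 20.0) ** 2   # unnormalized discretized Gaussian

target_rate = 0.574   # illustrative target; the paper derives the optimal value
x, log_w = 0, np.log(5.0)
accepts = 0
for t in range(1, 20001):
    w = max(1, int(round(np.exp(log_w))))
    y = x + rng.integers(-w, w + 1)          # uniform proposal in [x-w, x+w]
    if np.log(rng.uniform()) < log_pi(y) - log_pi(x):  # symmetric, so M-H ratio
        x, acc = y, 1.0                                # is just pi(y)/pi(x)
    else:
        acc = 0.0
    accepts += acc
    log_w += (acc - target_rate) / t**0.6    # Robbins-Monro adaptation
print(f"final half-width ~ {np.exp(log_w):.1f}, acceptance rate {accepts / t:.3f}")
```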
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime [34.34747805050984]
We show that the concentration of eigenvalues near the edges of the expected spectral distribution determines a problem's average complexity.
We show that in the average-case context, Nesterov's method is universally near-optimal.
arXiv Detail & Related papers (2022-06-20T17:16:52Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
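A minimal sketch of the binary case that such frameworks generalize (our illustration; the Gaussian densities are arbitrary choices): a probabilistic classifier trained to separate samples from $p$ and $q$ recovers the density ratio, since with equal sample sizes $p(x)/q(x) = P(\text{label}=1 \mid x)/P(\text{label}=0 \mid x)$; the logistic loss used here corresponds to one particular Bregman-divergence choice.

```python
# Binary density ratio estimation via the classifier trick (our illustration;
# the two Gaussians are arbitrary). With equally sized samples, the posterior
# odds of a well-calibrated classifier equal the density ratio p(x)/q(x).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 5000
Xp = rng.normal(loc=0.0, scale=1.0, size=(n, 1))   # sample from p = N(0, 1)
Xq = rng.normal(loc=1.0, scale=1.0, size=(n, 1))   # sample from q = N(1, 1)

X = np.vstack([Xp, Xq])
y = np.concatenate([np.ones(n), np.zeros(n)])      # label 1 = "drawn from p"
clf = LogisticRegression().fit(X, y)

prob = clf.predict_proba(np.array([[0.0]]))[0, 1]
est = prob / (1.0 - prob)                          # estimated p(0)/q(0)
true = np.exp(-0.5 * 0.0**2) / np.exp(-0.5 * (0.0 - 1.0) ** 2)
print(f"estimated ratio at 0: {est:.2f}, true ratio: {true:.2f}")
```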
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Uniform Concentration Bounds toward a Unified Framework for Robust Clustering [21.789311405437573]
Recent advances in center-based clustering continue to address the drawbacks of Lloyd's celebrated $k$-means algorithm.
Various methods seek to address poor local minima, sensitivity to outliers, and data that are not well-suited to Euclidean measures of fit.
This paper proposes a cohesive robust framework for center-based clustering under a general class of dissimilarity measures.
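As a concrete instance of center-based clustering under a non-Euclidean dissimilarity (a sketch under our own assumptions, not the paper's framework), the $k$-medians iteration below replaces the squared-Euclidean fit of $k$-means with the $\ell_1$ distance, whose center update, the coordinate-wise median, is insensitive to the injected outliers:

```python
# k-medians: Lloyd-style alternation under the l1 dissimilarity (our sketch,
# not the paper's framework). The median center update resists gross outliers.
import numpy as np

rng = np.random.default_rng(3)

def k_medians(X, k, iters=50):
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step under the l1 dissimilarity
        labels = np.abs(X[:, None, :] - centers[None, :, :]).sum(-1).argmin(1)
        # center update: coordinate-wise median of each non-empty cluster
        centers = np.stack([np.median(X[labels == j], axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return centers, labels

X = np.vstack([rng.normal(0, 1, (200, 2)),
               rng.normal(6, 1, (200, 2)),
               rng.uniform(-50, 50, (10, 2))])   # a few gross outliers
centers, _ = k_medians(X, k=2)
print(np.round(centers, 2))                      # stays near (0,0) and (6,6)
```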
arXiv Detail & Related papers (2021-10-27T03:43:44Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains whose problem data are heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the setting of fully decentralized computation.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
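A single-node sketch of the extra-gradient step at the core of such methods (the decentralization and stochasticity are omitted; the bilinear saddle point is our own toy choice): on the monotone problem $\min_x \max_y xy$, plain gradient descent-ascent spirals away from the solution at the origin, while the extrapolation step converges.

```python
# Extra-gradient vs. plain gradient descent-ascent on min_x max_y x*y
# (our single-node toy; the paper's decentralized machinery is omitted).
import numpy as np

def F(z):                        # VI operator for min_x max_y x*y
    x, y = z
    return np.array([y, -x])

gamma = 0.1
z_gda = np.array([1.0, 1.0])     # plain gradient descent-ascent
z_eg = np.array([1.0, 1.0])      # extra-gradient
for _ in range(200):
    z_gda = z_gda - gamma * F(z_gda)
    z_half = z_eg - gamma * F(z_eg)   # extrapolation (look-ahead) step
    z_eg = z_eg - gamma * F(z_half)   # update using the look-ahead operator
print(f"GDA distance to solution: {np.linalg.norm(z_gda):.3f}")  # grows
print(f"EG  distance to solution: {np.linalg.norm(z_eg):.3f}")   # shrinks
```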
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces [14.947511752748005]
Concentration of measure has been argued to be the fundamental cause of adversarial vulnerability.
We propose a method to estimate the concentration of any empirical dataset under $\ell_p$-norm distance metrics.
Our proposed algorithm is more efficient than Mahloujifar et al.'s, and our experiments on synthetic datasets and image benchmarks demonstrate that it is able to find much tighter intrinsic robustness bounds.
arXiv Detail & Related papers (2021-03-24T01:16:28Z) - Concentration Inequalities for Statistical Inference [3.10770247120758]
This paper gives a review of concentration inequalities which are widely employed in non-asymptotic analyses in mathematical statistics. We aim to illustrate the concentration inequalities with known constants and to improve existing bounds with sharper constants.
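A representative example of such a bound with explicit constants is Hoeffding's inequality: for independent $X_1, \dots, X_n$ with $X_i \in [a_i, b_i]$ and $S_n = \sum_{i=1}^n X_i$,

$$ \Pr\left( |S_n - \mathbb{E} S_n| \ge t \right) \le 2 \exp\left( -\frac{2 t^2}{\sum_{i=1}^n (b_i - a_i)^2} \right). $$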
arXiv Detail & Related papers (2020-11-04T12:54:06Z) - Concentration of solutions to random equations with concentration of measure hypotheses [45.24358490877106]
We study the concentration of random objects that are implicitly formulated as fixed points of equations $Y = f(X)$, where $f$ is a random mapping.
arXiv Detail & Related papers (2020-10-19T21:26:30Z)