Improved Sample Complexity for Full Coverage in Compact and Continuous Spaces
- URL: http://arxiv.org/abs/2511.17784v1
- Date: Fri, 21 Nov 2025 21:06:14 GMT
- Title: Improved Sample Complexity for Full Coverage in Compact and Continuous Spaces
- Authors: Lyu Yuhuan
- Abstract summary: We study uniform random sampling on the $d$-dimensional unit hypercube. We derive a sample complexity bound with a logarithmic dependence on the failure probability. Our findings offer a sharper theoretical tool for algorithms that rely on grid-based coverage guarantees.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Verifying uniform conditions over continuous spaces through random sampling is fundamental in machine learning and control theory, yet classical coverage analyses often yield conservative bounds, particularly at small failure probabilities. We study uniform random sampling on the $d$-dimensional unit hypercube and analyze the number of uncovered subcubes after discretization. By applying a concentration inequality to the uncovered-count statistic, we derive a sample complexity bound with a logarithmic dependence on the failure probability ($\delta$), i.e., $M = O(\tilde{C}\ln(\frac{2\tilde{C}}{\delta}))$, which contrasts sharply with the classical linear $1/\delta$ dependence. Under standard Lipschitz and uniformity assumptions, we present a self-contained derivation and compare our result with classical coupon-collector rates. Numerical studies across dimensions, precision levels, and confidence targets indicate that our bound tracks practical coverage requirements more tightly and scales favorably as $\delta \to 0$. Our findings offer a sharper theoretical tool for algorithms that rely on grid-based coverage guarantees, enabling more efficient sampling, especially in high-confidence regimes.
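The bound lends itself to a quick numerical sanity check. The following minimal sketch (not the authors' code) interprets $\tilde{C}$ as the number of subcubes $k^d$ of a $k$-per-side grid on $[0,1]^d$, takes the hidden constant in the $O(\cdot)$ to be 1 as an assumption, and estimates the empirical failure rate of full coverage by Monte Carlo.

```python
# Minimal numerical sketch (not the authors' code). Assumptions: C = k^d is the
# number of grid subcubes, and the hidden constant in M = O(C ln(2C/delta)) is 1.
import math
import random

def uncovered_cells(num_samples: int, k: int, d: int, rng: random.Random) -> int:
    """Count subcubes of the k^d grid on [0,1]^d that no sample lands in."""
    covered = set()
    for _ in range(num_samples):
        # Index of the subcube containing a fresh uniform point.
        cell = tuple(min(int(rng.random() * k), k - 1) for _ in range(d))
        covered.add(cell)
    return k ** d - len(covered)

def sample_budget(k: int, d: int, delta: float) -> int:
    """M = C * ln(2C / delta) with C = k^d (hidden constant assumed to be 1)."""
    c = k ** d
    return math.ceil(c * math.log(2 * c / delta))

if __name__ == "__main__":
    rng = random.Random(0)
    k, d, delta = 8, 2, 1e-6                 # 64 subcubes, high-confidence target
    m = sample_budget(k, d, delta)           # ~1195 samples for these settings
    trials = 1000
    misses = sum(uncovered_cells(m, k, d, rng) > 0 for _ in range(trials))
    # At delta = 1e-6, 1000 trials should show no coverage failures at all.
    print(f"M = {m}, empirical failure rate = {misses / trials:.4f}")
```

With $k = 8$, $d = 2$, and $\delta = 10^{-6}$, the formula prescribes roughly 1195 samples, and a per-cell union bound confirms the failure probability is below the target, consistent with the abstract's claim that the budget stays modest even at very small $\delta$.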
Related papers
- Active Bipartite Ranking with Smooth Posterior Distributions [1.9838140219494644]
Bipartite ranking is a statistical learning problem involved in many applications and widely studied in the passive context. We propose a novel algorithm, referred to as smooth-rank and designed for the continuous setting, which aims to minimise the distance between the ROC curve of the estimated ranking rule and the optimal one with respect to the $\sup$-norm. We provide a problem-dependent upper bound on the expected sampling time of smooth-rank and establish a problem-dependent lower bound on the expected sampling time of any PAC algorithm.
arXiv Detail & Related papers (2026-02-27T18:32:08Z) - Efficient Sampling with Discrete Diffusion Models: Sharp and Adaptive Guarantees [9.180350432640912]
We study the sampling efficiency of score-based discrete diffusion models under a continuous-time Markov chain (CTMC) formulation. For uniform discrete diffusion, we show that the $\tau$-leaping algorithm achieves a complexity of order $\tilde{O}(d/\varepsilon)$. For masking discrete diffusion, we introduce a modified $\tau$-leaping sampler whose convergence rate is governed by an intrinsic information-theoretic quantity.
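For context, a generic form of $\tau$-leaping for a CTMC (illustrative only; the paper's sampler for masking diffusion is a modified variant with different details) freezes the jump rates over each step of length $\tau$ and draws event counts from Poisson distributions, as in this sketch with a toy birth-death chain:

```python
# Generic tau-leaping for a CTMC (illustrative; the paper's sampler for masking
# discrete diffusion is a modified variant). Rates are frozen over each step of
# length tau and event counts are drawn from Poisson distributions.
import numpy as np

def tau_leap_birth_death(x0, birth, death, T, tau, cap, rng):
    """Simulate a birth-death chain on {0, ..., cap} by tau-leaping."""
    x, t = x0, 0.0
    while t < T:
        n_birth = rng.poisson(birth(x) * tau)   # events with frozen rate birth(x)
        n_death = rng.poisson(death(x) * tau)   # events with frozen rate death(x)
        x = min(max(x + n_birth - n_death, 0), cap)
        t += tau
    return x

rng = np.random.default_rng(0)
# Constant birth rate 2.0, linear death rate 0.1*x: stationary mean is 20.
samples = [tau_leap_birth_death(10, lambda x: 2.0, lambda x: 0.1 * x,
                                T=50.0, tau=0.1, cap=1000, rng=rng)
           for _ in range(200)]
print("empirical mean state:", np.mean(samples))
```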
arXiv Detail & Related papers (2026-02-16T18:48:17Z) - Harmful Overfitting in Sobolev Spaces [49.47221415754556]
We study the generalization behavior of functions in Sobolev spaces $W^{k,p}(\mathbb{R}^d)$ that perfectly fit a noisy training data set. We show that approximately norm-minimizing interpolators, which are canonical solutions selected by smoothness bias, exhibit harmful overfitting. Our proof uses a geometric argument which identifies harmful neighborhoods of the training data using Sobolev inequalities.
arXiv Detail & Related papers (2026-01-31T17:40:56Z) - Convergence, design and training of continuous-time dropout as a random batch method [0.0]
We study dropout regularization in continuous-time models through the lens of random-batch methods. We construct an unbiased, well-posed estimator that mimics dropout by sampling neuron batches over time intervals of length $h$. We then specialize to a single-layer neural ODE and validate the theory on classification and flow matching.
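As a hedged illustration of the sampling idea (an assumed form, not the paper's estimator), an Euler discretization of a single-layer neural ODE can keep a random subset of neurons on each interval of length $h$ and rescale by $1/p$ so the drift stays unbiased in expectation:

```python
# Hedged sketch (assumed form, not the paper's estimator): Euler integration of
# a single-layer neural ODE where, on each interval of length h, a random subset
# of neurons is kept and rescaled by 1/p so the drift is unbiased in expectation.
import numpy as np

def neural_ode_with_dropout(x0, W, b, T=1.0, h=0.05, p=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    x, t = x0.copy(), 0.0
    while t < T:
        mask = (rng.random(len(x)) < p) / p      # E[mask] = 1, entrywise
        x = x + h * mask * np.tanh(W @ x + b)    # Euler step with masked neurons
        t += h
    return x

rng = np.random.default_rng(1)
W, b = rng.normal(size=(8, 8)) / 8.0, np.zeros(8)
print(neural_ode_with_dropout(np.ones(8), W, b, rng=rng))
```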
arXiv Detail & Related papers (2025-10-15T04:19:01Z) - Assessing One-Dimensional Cluster Stability by Extreme-Point Trimming [0.0]
We develop a probabilistic method for assessing the tail behavior and geometric stability of one-dimensional i.i.d. samples. We derive analytical expressions, including finite-sample corrections, for the expected shrinkage under both the uniform and Gaussian hypotheses. We further integrate our criterion into a clustering pipeline (e.g., DBSCAN), demonstrating its ability to validate one-dimensional clusters without any density estimation or parameter tuning.
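To convey the flavor of extreme-point trimming (a hedged toy version, not the paper's exact criterion), one can compare the range shrinkage after removing the sample minimum and maximum against the reference implied by uniform order statistics, $\mathbb{E}[X_{(k)}] = k/(n+1)$ on $[0,1]$:

```python
# Hedged toy version of extreme-point trimming (not the paper's exact criterion).
import random

def trimmed_range_ratio(xs):
    """Range after dropping the min and the max, divided by the full range."""
    s = sorted(xs)
    return (s[-2] - s[1]) / (s[-1] - s[0])

def uniform_reference(n):
    """Expected shrinkage factor (n - 3)/(n - 1) from uniform order statistics."""
    return (n - 3) / (n - 1)

rng = random.Random(1)
n = 200
uniform_sample = [rng.random() for _ in range(n)]
gauss_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
print("uniform sample ratio: ", round(trimmed_range_ratio(uniform_sample), 3))
print("gaussian sample ratio:", round(trimmed_range_ratio(gauss_sample), 3))
print("uniform reference:    ", round(uniform_reference(n), 3))
```

Samples with heavier tails than the uniform, such as the Gaussian one here, shrink noticeably more than the uniform reference $(n-3)/(n-1)$, which is the kind of signal such a criterion exploits.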
arXiv Detail & Related papers (2025-08-29T21:52:15Z) - Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability. We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z) - Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Universal generalization guarantees for Wasserstein distributionally robust models [10.036727981085223]
Distributionally robust optimization has emerged as an attractive way to train robust machine learning models. Recent statistical analyses have proved that robust models based on the Wasserstein distance enjoy generalization guarantees that do not suffer from the curse of dimensionality. We establish exact generalization guarantees that cover a wide range of cases, with arbitrary transport costs and parametric loss functions.
arXiv Detail & Related papers (2024-02-19T09:27:47Z) - Convergence for score-based generative modeling with polynomial complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector gives better convergence than using either portion alone.
arXiv Detail & Related papers (2022-06-13T14:57:35Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the PAGE algorithm so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bounds in the smooth nonconvex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
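A minimal sketch of the underlying idea (with assumed details such as an RBF kernel and a least-squares projection onto the landmark span; the paper's procedure and guarantees are more refined) approximates each kernel mean embedding with $m \ll n$ landmarks and plugs the approximations into an MMD estimate:

```python
# Minimal sketch with assumed details (RBF kernel, least-squares projection onto
# the landmark span); the paper's procedure and guarantees are more refined.
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_kme_weights(x, landmarks):
    """Coefficients alpha of mu_hat = sum_j alpha_j k(z_j, .) fitted to the
    empirical embedding mu = mean_i k(x_i, .) by least squares in the RKHS."""
    K_zz = rbf(landmarks, landmarks)
    rhs = rbf(landmarks, x).mean(axis=1)
    return np.linalg.solve(K_zz + 1e-8 * np.eye(len(landmarks)), rhs)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(2000, 2))
y = rng.normal(0.5, 1.0, size=(2000, 2))
z = x[rng.choice(len(x), size=50, replace=False)]   # m = 50 landmarks
alpha_x, alpha_y = nystrom_kme_weights(x, z), nystrom_kme_weights(y, z)
diff = alpha_x - alpha_y
mmd2 = diff @ rbf(z, z) @ diff   # squared RKHS distance of the approximations
print("approximate MMD^2:", float(mmd2))
```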
arXiv Detail & Related papers (2022-01-31T08:26:06Z) - Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
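As a toy point of reference (plain PCA clustering, not the paper's computationally efficient sparse variant), projecting a symmetric two-component mixture onto the top principal component and thresholding at zero already separates the clusters when the signal $\theta$ is strong:

```python
# Toy reference (plain PCA clustering, not the paper's sparse algorithm).
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 50
labels = rng.integers(0, 2, size=n) * 2 - 1        # hidden +/-1 cluster labels
theta = np.zeros(d)
theta[:5] = 2.0                                    # sparse mean-separation vector
x = labels[:, None] * theta[None, :] + rng.normal(size=(n, d))

# The top right-singular vector of the centered data approximates theta's direction.
_, _, vt = np.linalg.svd(x - x.mean(axis=0), full_matrices=False)
pred = np.sign(x @ vt[0])
# Account for the global sign ambiguity of the principal direction.
err = min(np.mean(pred != labels), np.mean(pred != -labels))
print(f"misclustering rate: {err:.3f}")
```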
arXiv Detail & Related papers (2020-05-21T17:51:30Z)