seMCD: Sequentially implemented Monte Carlo depth computation with statistical guarantees
- URL: http://arxiv.org/abs/2507.06227v1
- Date: Tue, 08 Jul 2025 17:59:07 GMT
- Title: seMCD: Sequentially implemented Monte Carlo depth computation with statistical guarantees
- Authors: Felix Gnettner, Claudia Kirch, Alicia Nieto-Reyes,
- Abstract summary: Statistical depth functions provide center-outward orderings in spaces of dimension larger than one.<n>The numerical evaluation of such depth functions can be computationally prohibitive, even for relatively low dimensions.<n>We present a novel sequentially implemented Monte Carlo methodology for the computation of, theoretical and empirical, depth functions and related quantities.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical depth functions provide center-outward orderings in spaces of dimension larger than one, where a natural ordering does not exist. The numerical evaluation of such depth functions can be computationally prohibitive, even for relatively low dimensions. We present a novel sequentially implemented Monte Carlo methodology for the computation of, theoretical and empirical, depth functions and related quantities (seMCD), that outputs an interval, a so-called seMCD-bucket, to which the quantity of interest belongs with a high probability prespecified by the user. For specific classes of depth functions, we adapt algorithms from sequential testing, providing finite-sample guarantees. For depth functions dependent on unknown distributions, we offer asymptotic guarantees using non-parametric statistical methods. In contrast to plain-vanilla Monte Carlo methodology the number of samples required in the algorithm is random but typically much smaller than standard choices suggested in the literature. The seMCD method can be applied to various depth functions, covering multivariate and functional spaces. We demonstrate the efficiency and reliability of our approach through empirical studies, highlighting its applicability in outlier or anomaly detection, classification, and depth region computation. In conclusion, the seMCD-algorithm can achieve accurate depth approximations with few Monte Carlo samples while maintaining rigorous statistical guarantees.
Related papers
- On Policy Evaluation Algorithms in Distributional Reinforcement Learning [0.0]
We introduce a novel class of algorithms to efficiently approximate the unknown return distributions in policy evaluation problems from distributional reinforcement learning (DRL)
For a plain instance of our proposed class of algorithms we prove error bounds, both within Wasserstein and Kolmogorov--Smirnov distances.
For return distributions having probability density functions the algorithms yield approximations for these densities; error bounds are given within supremum norm.
arXiv Detail & Related papers (2024-07-19T10:06:01Z) - Bayesian Frequency Estimation Under Local Differential Privacy With an Adaptive Randomized Response Mechanism [0.4604003661048266]
We propose AdOBEst-LDP, a new algorithm for adaptive, online Bayesian estimation of categorical distributions under local differential privacy.<n>By adapting the subset selection process to the past privatized data via Bayesian estimation, the algorithm improves the utility of future privatized data.
arXiv Detail & Related papers (2024-05-11T13:59:52Z) - Inference in Randomized Least Squares and PCA via Normality of Quadratic Forms [19.616162116973637]
We develop a unified methodology for statistical inference via randomized sketching or projections.
The methodology applies to fixed datasets -- i.e., is data-conditional -- and the only randomness is due to the randomized algorithm.
arXiv Detail & Related papers (2024-04-01T04:35:44Z) - Boosting the Power of Kernel Two-Sample Tests [4.07125466598411]
A kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces.
We propose a method to boost the power of the kernel test by combining MMD estimates over multiple kernels using their Mahalanobis distance.
arXiv Detail & Related papers (2023-02-21T14:14:30Z) - Linearized Wasserstein dimensionality reduction with approximation
guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]
We propose a Monte Carlo PDE solver for training unsupervised neural solvers.<n>We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles.<n>Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
arXiv Detail & Related papers (2023-02-10T08:05:19Z) - The Normalized Cross Density Functional: A Framework to Quantify
Statistical Dependence for Random Processes [6.625320950808605]
We present a novel approach to measuring statistical dependence between two random processes (r.p.) using a positive-definite function called the Normalized Cross Density (NCD)
NCD is derived directly from the probability density functions of two r.p. and constructs a data-dependent Hilbert space, the Normalized Cross-Density Hilbert Space (NCD-HS)
We mathematically prove that FMCA learns the dominant eigenvalues and eigenfunctions of NCD directly from realizations.
arXiv Detail & Related papers (2022-12-09T02:12:41Z) - Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency [111.83670279016599]
We study reinforcement learning for partially observed decision processes (POMDPs) with infinite observation and state spaces.
We make the first attempt at partial observability and function approximation for a class of POMDPs with a linear structure.
arXiv Detail & Related papers (2022-04-20T21:15:38Z) - Uniform-PAC Bounds for Reinforcement Learning with Linear Function
Approximation [92.3161051419884]
We study reinforcement learning with linear function approximation.
Existing algorithms only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees.
We propose a new algorithm called FLUTE, which enjoys uniform-PAC convergence to the optimal policy with high probability.
arXiv Detail & Related papers (2021-06-22T08:48:56Z) - Parallel Stochastic Mirror Descent for MDPs [72.75921150912556]
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs)
Some variant of Mirror Descent is proposed for convex programming problems with Lipschitz-continuous functionals.
We analyze this algorithm in a general case and obtain an estimate of the convergence rate that does not accumulate errors during the operation of the method.
arXiv Detail & Related papers (2021-02-27T19:28:39Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the populationimation barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z) - Approximate MMAP by Marginal Search [78.50747042819503]
We present a strategy for marginal MAP queries in graphical models.
The proposed confidence measure is properly detecting instances for which the algorithm is accurate.
For sufficiently high confidence levels, the algorithm gives the exact solution or an approximation whose Hamming distance from the exact one is small.
arXiv Detail & Related papers (2020-02-12T07:41:13Z) - Mean shift cluster recognition method implementation in the nested
sampling algorithm [0.0]
Nested sampling is an efficient algorithm for the calculation of the Bayesian evidence and posterior parameter probability distributions.
Here we present a new solution based on the mean shift cluster recognition method implemented in a random walk search algorithm.
arXiv Detail & Related papers (2020-01-31T15:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.