A Faster Sampler for Discrete Determinantal Point Processes
- URL: http://arxiv.org/abs/2210.17358v1
- Date: Mon, 31 Oct 2022 14:37:54 GMT
- Title: A Faster Sampler for Discrete Determinantal Point Processes
- Authors: Simon Barthelm\'e, Nicolas Tremblay and Pierre-Olivier Amblard
- Abstract summary: Discrete Determinantal Point Processes (DPPs) have a wide array of potential applications for subsampling datasets.
In the worst-case scenario, the sampling cost scales as $O(n3)$ where n is the number of elements of the ground set.
A popular workaround to this prohibitive cost is to sample DPPs defined by low-rank kernels.
- Score: 10.355938901584567
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discrete Determinantal Point Processes (DPPs) have a wide array of potential
applications for subsampling datasets. They are however held back in some cases
by the high cost of sampling. In the worst-case scenario, the sampling cost
scales as $O(n^3)$ where n is the number of elements of the ground set. A
popular workaround to this prohibitive cost is to sample DPPs defined by
low-rank kernels. In such cases, the cost of standard sampling algorithms
scales as $O(np^2 + nm^2)$ where m is the (average) number of samples of the
DPP (usually $m \ll n$) and p ($m \leq p \leq n$) the rank of the kernel used
to define the DPP. The first term, $O(np^2)$, comes from a SVD-like step. We
focus here on the second term of this cost, $O(nm^2)$, and show that it can be
brought down to $O(nm + m^3 log m)$ without loss on the sampling's exactness.
In practice, we observe extremely substantial speedups compared to the
classical algorithm as soon as $n > 1, 000$. The algorithm described here is a
close variant of the standard algorithm for sampling continuous DPPs, and uses
rejection sampling. In the specific case of projection DPPs, we also show that
any additional sample can be drawn in time $O(m^3 log m)$. Finally, an
interesting by-product of the analysis is that a realisation from a DPP is
typically contained in a subset of size $O(m log m)$ formed using leverage
score i.i.d. sampling.
Related papers
- Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $otimes_i=1N mathbbRd$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z) - Quantum (Inspired) $D^2$-sampling with Applications [0.138120109831448]
We show a quantum implementation of $k$-means++ that runs in time $tildeO(zeta2 k2)$.
It can be shown through a robust approximation analysis of $k$-means++ that the quantum version preserves its $O(logk)$ approximation guarantee.
This results in a fast quantum-inspired classical implementation of $k$-means++, which we call QI-$k$-means++, with a running time $O(Nd) + tildeO(zeta
arXiv Detail & Related papers (2024-05-22T05:13:28Z) - Weighted least-squares approximation with determinantal point processes and generalized volume sampling [33.33724208084121]
We consider the problem of approximating a function from $L2$ by an element of a given $m$-dimensional space $V_m$.
We show that the approximation is almost surely bounded by the best approximation error measured in the $H$-norm.
arXiv Detail & Related papers (2023-12-21T17:34:18Z) - On sampling determinantal and Pfaffian point processes on a quantum
computer [49.1574468325115]
DPPs were introduced by Macchi as a model in quantum optics the 1970s.
Most applications require sampling from a DPP, and given their quantum origin, it is natural to wonder whether sampling a DPP on a classical computer is easier than on a classical one.
Vanilla sampling consists in two steps, of respective costs $mathcalO(N3)$ and $mathcalO(Nr2)$ operations on a classical computer, where $r$ is the rank of the kernel matrix.
arXiv Detail & Related papers (2023-05-25T08:43:11Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP)
We prove an upper bound of $widetilde O(H varepsilon-3 ln frac1delta)$ samples per state-action pair, where $H := sp(h*)$ is the span of bias of any optimal policy, $varepsilon$ is the accuracy and $delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes [45.40646054226403]
A determinantal point process (DPP) assigns a probability to every subset of $n$ items.
Recent work has studied an approximate chain Monte Carlo sampling algorithm for NDPPs restricted to size-$k$ subsets.
We develop a scalable MCMC sampling algorithm for $k$-NDPPs with low-rank kernels.
arXiv Detail & Related papers (2022-07-01T15:22:12Z) - Sampling from a $k$-DPP without looking at all items [58.30573872035083]
Given a kernel function and a subset size $k$, our goal is to sample $k$ out of $n$ items with probability proportional to the determinant of the kernel matrix induced by the subset (a.k.a. $k$-DPP)
Existing $k$-DPP sampling algorithms require an expensive preprocessing step which involves multiple passes over all $n$ items, making it infeasible for large datasets.
We develop an algorithm which adaptively builds a sufficiently large uniform sample of data that is then used to efficiently generate a smaller set of $k$ items.
arXiv Detail & Related papers (2020-06-30T16:40:44Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.