Parallel Sampling via Counting
- URL: http://arxiv.org/abs/2408.09442v1
- Date: Sun, 18 Aug 2024 11:42:54 GMT
- Title: Parallel Sampling via Counting
- Authors: Nima Anari, Ruiquan Gao, Aviad Rubinstein
- Abstract summary: We show how to use parallelization to speed up sampling from an arbitrary distribution.
Our algorithm takes $O(n^{2/3}\cdot \operatorname{polylog}(n,q))$ parallel time.
Our results have implications for sampling in autoregressive models.
- Score: 3.6854430278041264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show how to use parallelization to speed up sampling from an arbitrary distribution $\mu$ on a product space $[q]^n$, given oracle access to counting queries: $\mathbb{P}_{X\sim \mu}[X_S=\sigma_S]$ for any $S\subseteq [n]$ and $\sigma_S \in [q]^S$. Our algorithm takes $O({n^{2/3}\cdot \operatorname{polylog}(n,q)})$ parallel time, to the best of our knowledge, the first sublinear in $n$ runtime for arbitrary distributions. Our results have implications for sampling in autoregressive models. Our algorithm directly works with an equivalent oracle that answers conditional marginal queries $\mathbb{P}_{X\sim \mu}[X_i=\sigma_i\;\vert\; X_S=\sigma_S]$, whose role is played by a trained neural network in autoregressive models. This suggests a roughly $n^{1/3}$-factor speedup is possible for sampling in any-order autoregressive models. We complement our positive result by showing a lower bound of $\widetilde{\Omega}(n^{1/3})$ for the runtime of any parallel sampling algorithm making at most $\operatorname{poly}(n)$ queries to the counting oracle, even for $q=2$.
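To make the oracle model concrete, here is a minimal Python sketch (all names are illustrative assumptions, not from the paper) of the conditional-marginal interface described above, together with the standard sequential any-order autoregressive baseline that spends one adaptive oracle round per coordinate, i.e. $n$ rounds in total; the paper's result is that this round complexity can be driven down to roughly $n^{2/3}$. The oracle is backed by an explicit probability table only so the sketch runs for tiny $n$ and $q$.

```python
import numpy as np

# Hypothetical interface; class and function names are illustrative, not from the paper.
class ConditionalMarginalOracle:
    """Answers P[X_i = a | X_S = sigma_S] for a distribution mu on [q]^n.

    In any-order autoregressive models this role is played by a trained neural
    network; here the oracle is backed by an explicit probability table so the
    sketch is runnable for tiny n and q.
    """

    def __init__(self, probs):
        self.probs = np.asarray(probs, dtype=float)  # shape (q,)*n, entries sum to 1
        self.n = self.probs.ndim
        self.q = self.probs.shape[0]

    def conditional(self, i, assignment):
        """Return the length-q vector P[X_i = . | X_j = assignment[j] for assigned j]."""
        idx = tuple(assignment[j] if j in assignment else slice(None) for j in range(self.n))
        table = self.probs[idx]                      # joint table over the unassigned coordinates
        unassigned = [j for j in range(self.n) if j not in assignment]
        axis_i = unassigned.index(i)
        marg = table.sum(axis=tuple(a for a in range(table.ndim) if a != axis_i))
        return marg / marg.sum()

def sequential_sample(oracle, order, rng):
    """Baseline any-order autoregressive sampler: one coordinate per round, hence
    n adaptive oracle rounds (the quantity the paper reduces to roughly n^{2/3})."""
    assignment = {}
    for i in order:
        p = oracle.conditional(i, assignment)
        assignment[i] = int(rng.choice(oracle.q, p=p))
    return [assignment[i] for i in range(oracle.n)]

# Usage on a toy distribution with n = 3, q = 2.
rng = np.random.default_rng(0)
probs = rng.random((2, 2, 2))
probs /= probs.sum()
oracle = ConditionalMarginalOracle(probs)
print(sequential_sample(oracle, order=[2, 0, 1], rng=rng))
```

The point of the interface is that each call may condition on any subset of already-fixed coordinates, which is exactly the query answered by an any-order autoregressive model; the baseline above is inherently sequential because each query depends on the previous answer.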
Related papers
- Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk [12.842909157175582]
We consider the problem of sampling from a $d$-dimensional log-concave distribution $\pi(\theta) \propto \exp(-f(\theta))$ for $L$-Lipschitz $f$.
We propose a robust sampling framework that computes spectral approximations to the Hessian of the barrier functions in each iteration; a generic Dikin-walk step is sketched after this list.
arXiv Detail & Related papers (2024-10-08T05:32:51Z) - Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $\otimes_{i=1}^{N} \mathbb{R}^{d}$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z) - Outlier Robust Multivariate Polynomial Regression [27.03423421704806]
We are given a set of random samples $(\mathbf{x}_i,y_i) \in [-1,1]^n \times \mathbb{R}$ that are noisy versions of $(\mathbf{x}_i,p(\mathbf{x}_i))$.
The goal is to output a polynomial $\hat{p}$ within an $\ell_{\text{in}}$-distance of at most $O(\sigma)$ from $p$.
arXiv Detail & Related papers (2024-03-14T15:04:45Z) - Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity [6.812247730094931]
We show, for any $n\geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$.
We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$.
arXiv Detail & Related papers (2023-09-25T09:50:15Z) - Reward-Mixing MDPs with a Few Latent Contexts are Learnable [75.17357040707347]
We consider episodic reinforcement learning in reward-mixing Markov decision processes (RMMDPs).
Our goal is to learn a near-optimal policy that nearly maximizes the $H$ time-step cumulative rewards in such a model.
arXiv Detail & Related papers (2022-10-05T22:52:00Z) - Online Lewis Weight Sampling [62.38157566916501]
Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community.
Several works have extended this important primitive to other settings, including the online coreset, sliding window, and adversarial streaming models.
We design the first nearly optimal $\ell_p$ subspace embeddings for all $p\in(0,\infty)$ in the online coreset, sliding window, and adversarial streaming models.
arXiv Detail & Related papers (2022-07-17T19:40:51Z) - Chi-square and normal inference in high-dimensional multi-task regression [7.310043452300736]
The paper proposes chi-square and normal methodologies for the unknown coefficient matrix $B^*$ of size $p\times T$ in a Multi-Task (MT) linear model.
arXiv Detail & Related papers (2021-07-16T11:19:49Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K\subset\mathbb{R}^{d\times n}$, given access to $A\in\mathbb{R}^{d\times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N\gg n$ regime.
arXiv Detail & Related papers (2020-06-02T16:38:36Z) - Query complexity of heavy hitter estimation [6.373263986460191]
We consider the problem of identifying the subset $\mathcal{S}^{\gamma}_{\mathcal{P}}$ of elements in the support of an underlying distribution $\mathcal{P}$ whose probability exceeds a given threshold $\gamma$.
We consider two query models: $(a)$ each query is an index $i$ and the oracle returns the value $X_i$, and $(b)$ each query is a pair $(i,j)$.
For each of these query models, we design sequential estimation algorithms which, at each round, decide what query to send to the oracle depending on the entire history of responses so far.
arXiv Detail & Related papers (2020-05-29T07:15:46Z)
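The Dikin-walk entry above ("Log-concave Sampling from a Convex Body with a Barrier") revolves around proposals shaped by the Hessian of a barrier function at each iteration. As a point of reference, here is a minimal sketch of a single plain Dikin-walk step with a Metropolis filter for a log-concave target restricted to a polytope, using the exact log-barrier Hessian. All names and parameter choices are assumptions for illustration; this is the textbook step, not that paper's robust spectral-approximation framework.

```python
import numpy as np

def log_barrier_hessian(A, b, x):
    """Hessian of the log-barrier phi(x) = -sum_i log(b_i - a_i^T x)."""
    s = b - A @ x                        # slacks, strictly positive inside {Ax <= b}
    return A.T @ ((1.0 / s**2)[:, None] * A)

def dikin_walk_step(x, f, A, b, r, rng):
    """One Metropolis-filtered Dikin step targeting pi(x) ∝ exp(-f(x)) on {Ax <= b}."""
    d = x.size
    Hx = log_barrier_hessian(A, b, x)
    cov_x = (r**2 / d) * np.linalg.inv(Hx)   # propose y ~ N(x, (r^2/d) Hx^{-1})
    y = rng.multivariate_normal(x, cov_x)
    if np.any(A @ y >= b):                   # proposal left the polytope: reject
        return x
    Hy = log_barrier_hessian(A, b, y)

    def log_prop(h, frm, to):
        # Log density (up to constants) of N(frm, (r^2/d) h^{-1}) evaluated at `to`.
        diff = to - frm
        _, logdet = np.linalg.slogdet(h)
        return 0.5 * logdet - (d / (2 * r**2)) * diff @ h @ diff

    # Metropolis-Hastings ratio with state-dependent Gaussian proposals.
    log_acc = (-f(y) + f(x)) + log_prop(Hy, y, x) - log_prop(Hx, x, y)
    return y if np.log(rng.random()) < log_acc else x

# Usage: sample from exp(-||x||^2 / 2) restricted to the box [-1, 1]^2.
rng = np.random.default_rng(1)
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
x = np.zeros(2)
for _ in range(1000):
    x = dikin_walk_step(x, lambda z: 0.5 * z @ z, A, b, r=0.5, rng=rng)
print(x)
```

Because the Gaussian proposal covariance depends on the current state, the acceptance test must include the ratio of proposal densities in both directions, which is what the `log_prop` terms compute.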