Product distribution learning with imperfect advice
- URL: http://arxiv.org/abs/2511.10366v1
- Date: Fri, 14 Nov 2025 01:47:30 GMT
- Title: Product distribution learning with imperfect advice
- Authors: Arnab Bhattacharyya, Davin Choo, Philips George John, Themis Gouleakis
- Abstract summary: Given i.i.d. samples from an unknown distribution $P$, the goal of distribution learning is to recover the parameters of a distribution that is close to $P$. We show that there is an efficient algorithm to learn $P$ within TV distance $\varepsilon$ that has sample complexity $\tilde{O}(d^{1-\eta}/\varepsilon^2)$.
- Score: 16.179400847403446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given i.i.d. samples from an unknown distribution $P$, the goal of distribution learning is to recover the parameters of a distribution that is close to $P$. When $P$ belongs to the class of product distributions on the Boolean hypercube $\{0,1\}^d$, it is known that $\Omega(d/\varepsilon^2)$ samples are necessary to learn $P$ within total variation (TV) distance $\varepsilon$. We revisit this problem when the learner is also given as advice the parameters of a product distribution $Q$. We show that there is an efficient algorithm to learn $P$ within TV distance $\varepsilon$ that has sample complexity $\tilde{O}(d^{1-\eta}/\varepsilon^2)$, if $\|\mathbf{p} - \mathbf{q}\|_1 < \varepsilon d^{0.5 - \Omega(\eta)}$. Here, $\mathbf{p}$ and $\mathbf{q}$ are the mean vectors of $P$ and $Q$ respectively, and no bound on $\|\mathbf{p} - \mathbf{q}\|_1$ is known to the algorithm a priori.
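For orientation, the sketch below shows the standard advice-free baseline that the abstract compares against: estimating the mean vector $\mathbf{p}$ of a product distribution on $\{0,1\}^d$ by coordinate-wise empirical means, which needs on the order of $d/\varepsilon^2$ samples. This is not the paper's advice-aware algorithm; the function name and toy parameters are illustrative assumptions.

```python
import numpy as np

def learn_product_distribution(samples: np.ndarray) -> np.ndarray:
    """Advice-free baseline: estimate the mean vector p of a product
    distribution on {0,1}^d via coordinate-wise empirical means."""
    # samples has shape (n, d) with entries in {0, 1}
    return samples.mean(axis=0)

# Toy usage: d coordinates, n i.i.d. samples from an unknown product distribution P.
rng = np.random.default_rng(0)
d, n = 1000, 5000
p_true = rng.uniform(0.1, 0.9, size=d)                 # unknown mean vector of P
X = (rng.random((n, d)) < p_true).astype(np.int8)      # i.i.d. samples from P
p_hat = learn_product_distribution(X)
print("mean absolute error per coordinate:", np.abs(p_hat - p_true).mean())
```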
Related papers
- High-accuracy sampling for diffusion models and log-concave distributions [70.90863485771405]
We present algorithms for diffusion model sampling which obtain $\varepsilon$-error in $\mathrm{polylog}(1/\varepsilon)$ steps. Our approach also yields the first $\mathrm{polylog}(1/\varepsilon)$-complexity sampler for general log-concave distributions.
arXiv Detail & Related papers (2026-02-01T17:05:31Z) - Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination [65.37519531362157]
We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $\tilde{\Omega}(d^{1/2}/\alpha^2)$.
arXiv Detail & Related papers (2025-10-12T15:42:44Z) - New Algorithmic Directions in Optimal Transport and Applications for Product Spaces [5.9725566031600925]
We study optimal transport between two high-dimensional distributions $\mu,\nu$ in $\mathbb{R}^n$ from an algorithmic perspective. The running time depends on the dimension rather than the full representation size of $\mu,\nu$. For any set $\mathcal{S}$ of Gaussian measure $\varepsilon$, most $\Phi^n$ samples can be mapped to $\mathcal{S}$ within distance $O(\sqrt{\log 1/\varepsilon})$ in $\mathrm{poly}(n/\varepsilon)$ time.
arXiv Detail & Related papers (2025-09-25T19:58:06Z) - Mean and Variance Estimation Complexity in Arbitrary Distributions via Wasserstein Minimization [0.0]
This paper focuses on the complexity of estimating translation $\boldsymbol{\mu} \in \mathbb{R}^l$ and shrinkage $\sigma \in \mathbb{R}_{++}$ parameters. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $\varepsilon$-approximations for arbitrary $\varepsilon > 0$ within $\mathrm{poly}\left(\frac{1}{\varepsilon}\right)$ time.
arXiv Detail & Related papers (2025-01-17T13:07:52Z) - Learning multivariate Gaussians with imperfect advice [14.459107410136486]
We revisit the problem of distribution learning within the framework of learning-augmented algorithms. Our objective is to develop learning algorithms whose sample complexity decreases as the quality of the advice improves.
arXiv Detail & Related papers (2024-11-19T18:08:01Z) - On Learning for Ambiguous Chance Constrained Problems [2.7152798636894193]
We show that in this case the original problem can be "well-approximated" by a sampled problem in which $N$ i.i.d. samples of $\theta$ are drawn from $\nu$.
We also derive the sample complexity associated with this approximation, i.e., for $\epsilon,\delta>0$, the number of samples which must be drawn from $\nu$.
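To make the sampled approximation concrete, here is a minimal sketch of the generic scenario approach (not necessarily the construction in the paper): the chance constraint is replaced by requiring feasibility on every one of $N$ i.i.d. draws of $\theta$ from $\nu$. The function names, the toy problem, and the sample size below are illustrative assumptions.

```python
import numpy as np

def scenario_threshold(sample_theta, g, n_samples: int) -> float:
    """Sampled (scenario) surrogate for a chance constraint of the form
    P_theta[ g(theta) <= x ] >= 1 - epsilon: return the smallest x that
    satisfies g(theta_i) <= x for all N sampled scenarios."""
    thetas = sample_theta(n_samples)          # N i.i.d. samples of theta from nu
    return max(g(t) for t in thetas)          # feasible for every sampled scenario

# Toy usage: nu is a standard normal and g(theta) = theta, so the scenario
# threshold approximates a high quantile of nu.
rng = np.random.default_rng(0)
x = scenario_threshold(lambda n: rng.normal(size=n), lambda t: float(t), n_samples=500)
print("scenario-approximate threshold:", x)
```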
arXiv Detail & Related papers (2023-12-31T17:25:43Z) - Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Self-training Converts Weak Learners to Strong Learners in Mixture
Models [86.7137362125503]
We show that a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only $O(d)$ labeled examples.
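As a rough illustration of the pseudolabeling pipeline referred to above (not the paper's analysis), the sketch below fits a weak classifier by gradient descent on the logistic loss using a small labeled sample from a two-component Gaussian mixture, uses it to pseudolabel a large unlabeled pool, and reruns gradient descent on the pseudolabels. All names and constants are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, steps=500, lr=0.5):
    """Plain gradient descent on the average logistic loss; labels y are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # gradient of mean log(1 + exp(-y * <w, x>))
        grad = -(X * (y * sigmoid(-y * (X @ w)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

# Two-component Gaussian mixture: x = y * mu + standard Gaussian noise.
rng = np.random.default_rng(0)
d, n_lab, n_unlab = 50, 40, 20000
mu = 2.0 * np.ones(d) / np.sqrt(d)

def draw(n):
    y = rng.choice([-1.0, 1.0], size=n)
    return y[:, None] * mu + rng.normal(size=(n, d)), y

X_lab, y_lab = draw(n_lab)
X_unlab, y_unlab = draw(n_unlab)

w_weak = logistic_gd(X_lab, y_lab)          # weak learner from few labeled examples
pseudo = np.sign(X_unlab @ w_weak)          # pseudolabel the unlabeled pool
w_self = logistic_gd(X_unlab, pseudo)       # self-training step on pseudolabels

err = lambda w: float(np.mean(np.sign(X_unlab @ w) != y_unlab))
print("error of weak learner:", err(w_weak))
print("error after self-training:", err(w_self))
```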
arXiv Detail & Related papers (2021-06-25T17:59:16Z) - Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu [14.298220510927695]
We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968).
We show that for a specific tree $T$, with $\widetilde{O}(|\Sigma|^2 n \varepsilon^{-1})$ samples from a distribution $P$ over $\Sigma^n$, one can efficiently learn the $T$-structured distribution closest to $P$ in KL divergence.
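The Chow-Liu algorithm itself is classical: estimate all pairwise mutual informations from samples and return a maximum-weight spanning tree over them. The sketch below implements that recipe on a toy binary Markov chain; it does not reproduce the paper's finite-sample analysis, and the helper names are illustrative.

```python
import numpy as np
from itertools import combinations

def empirical_mutual_information(x, y, k):
    """Plug-in estimate of I(X;Y) for samples over the alphabet {0, ..., k-1}."""
    joint = np.zeros((k, k))
    for a, b in zip(x, y):
        joint[a, b] += 1.0 / len(x)
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / np.outer(px, py)[mask])).sum())

def chow_liu_tree(samples, k):
    """Chow-Liu: maximum-weight spanning tree under pairwise empirical
    mutual information.  Returns the list of tree edges (i, j)."""
    d = samples.shape[1]
    mi = np.zeros((d, d))
    for i, j in combinations(range(d), 2):
        mi[i, j] = mi[j, i] = empirical_mutual_information(samples[:, i], samples[:, j], k)
    in_tree, edges = {0}, []            # Prim's algorithm on the dense MI matrix
    while len(in_tree) < d:
        best = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Toy usage: a binary Markov chain X0 -> X1 -> X2; Chow-Liu should recover
# the path edges (0, 1) and (1, 2) from enough samples.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.integers(0, 2, size=n)
x1 = np.where(rng.random(n) < 0.9, x0, 1 - x0)   # X1 copies X0 with prob 0.9
x2 = np.where(rng.random(n) < 0.9, x1, 1 - x1)   # X2 copies X1 with prob 0.9
print(chow_liu_tree(np.column_stack([x0, x1, x2]), k=2))
```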
arXiv Detail & Related papers (2020-11-09T02:08:56Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we give an improved algorithm with lower sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Learning Mixtures of Spherical Gaussians via Fourier Analysis [0.5381004207943596]
We find that a bound on the sample and computational complexity was previously unknown when $\omega(1) \leq d \leq O(\log k)$.
These authors also bound the sample complexity of learning a random mixture of Gaussians in a ball of radius $\Theta(\sqrt{d})$ in $d$ dimensions, when $d$ is at least $\mathrm{poly}(k, \frac{1}{\delta})$.
arXiv Detail & Related papers (2020-04-13T08:06:29Z)