Tree density estimation
- URL: http://arxiv.org/abs/2111.11971v1
- Date: Tue, 23 Nov 2021 16:05:59 GMT
- Title: Tree density estimation
- Authors: László Györfi and Aryeh Kontorovich and Roi Weiss
- Abstract summary: We study density estimation for a random vector ${\boldsymbol X}$ in $\mathbb{R}^d$ with probability density $f(\boldsymbol x)$. Without any regularity conditions on $f$, $\lim_{n\to\infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|\,d\boldsymbol x=0$ a.s. For Lipschitz continuous $f$ with bounded support, $\mathbb{E}\{\int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|\,d\boldsymbol x\}=O(n^{-1/4})$.
- Score: 12.831051269764115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of density estimation for a random vector ${\boldsymbol
X}$ in $\mathbb R^d$ with probability density $f(\boldsymbol x)$. For a
spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density
$f_{T}$ is a product of bivariate conditional densities. The optimal spanning
tree $T^*$ is the spanning tree $T$ for which the Kullback-Leibler divergence
between $f$ and $f_{T}$ is smallest. From i.i.d. data we identify the optimal
tree $T^*$ and computationally efficiently construct a tree density estimate
$f_n$ such that, without any regularity conditions on the density $f$, one has
that $\lim_{n\to \infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol
x)|d\boldsymbol x=0$ a.s. For Lipschitz continuous $f$ with bounded support,
$\mathbb E\{ \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol
x\}=O(n^{-1/4})$.
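To make the pipeline concrete, here is a minimal plug-in sketch in the spirit of the abstract: estimate pairwise mutual informations from the sample, take a maximum-weight spanning tree (the classical Chow-Liu step), and assemble the tree density from bivariate and univariate kernel density estimates. The function names, the histogram-based mutual information, and the KDE choice are illustrative assumptions, not the authors' estimator.

```python
import numpy as np
from itertools import combinations
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.stats import gaussian_kde

def chow_liu_edges(X, bins=10):
    """Maximum-weight spanning tree on empirical pairwise mutual informations."""
    n, d = X.shape
    W = np.zeros((d, d))
    for i, j in combinations(range(d), 2):
        pij, _, _ = np.histogram2d(X[:, i], X[:, j], bins=bins)
        pij /= n
        pi, pj = pij.sum(axis=1), pij.sum(axis=0)
        mask = pij > 0
        W[i, j] = np.sum(pij[mask] * np.log(pij[mask] / np.outer(pi, pj)[mask]))
    # Maximum-weight tree = minimum spanning tree on negated weights.
    mst = minimum_spanning_tree(-W)
    return [tuple(sorted(e)) for e in zip(*mst.nonzero())]

def tree_density(X, edges):
    """f_T(x) = prod_{(i,j) in T} f(x_i, x_j) / prod_i f(x_i)^(deg(i)-1)."""
    d = X.shape[1]
    kde1 = [gaussian_kde(X[:, i]) for i in range(d)]
    kde2 = {e: gaussian_kde(X[:, list(e)].T) for e in edges}
    deg = np.zeros(d, dtype=int)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    def f_n(x):
        val = 1.0
        for i, j in edges:
            val *= kde2[(i, j)](np.array([[x[i]], [x[j]]]))[0]
        for i in range(d):
            val /= kde1[i](x[i])[0] ** (deg[i] - 1)
        return val
    return f_n
```

With `f_n = tree_density(X, chow_liu_edges(X))` the estimate can be evaluated at any point; the paper's guarantees concern the $L_1$ distance between such an $f_n$ and $f_{T^*}$.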
Related papers
- A note on estimating the dimension from a random geometric graph [2.3020018305241337]
We study the problem of estimating the dimension $d$ of the underlying space when we have access to the adjacency matrix of the graph.
We also show that, without any condition on the density, a consistent estimator of $d$ exists when $n r_n^d \to \infty$ and $r_n = o(1)$.
arXiv Detail & Related papers (2023-11-21T23:46:44Z)
- A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming access to a corrupted label-feature sample of size $n$ containing at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal, computationally tractable estimator based on the power method, assuming no knowledge of $(\Sigma,\Xi)$ nor of the operator norm of $\Xi$.
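The summary above names the power method as the algorithmic core; as background only, here is a generic power-iteration sketch for a symmetric matrix. It illustrates the primitive, not the paper's estimator, and all parameter choices are placeholders.

```python
import numpy as np

def power_method(A, iters=200, seed=0):
    """Power iteration: approximate top eigenpair of a symmetric matrix A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v @ A @ v, v  # (approximate top eigenvalue, its eigenvector)
```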
arXiv Detail & Related papers (2022-09-06T23:37:31Z)
- The Approximate Degree of DNF and CNF Formulas [95.94432031144716]
For every $\delta>0$, we construct CNF and DNF formulas of polynomial size with approximate degree $\Omega(n^{1-\delta})$, essentially matching the trivial upper bound of $n$.
We show that for every $\delta>0$, these models require $\Omega(n^{1-\delta})$, $\Omega((n/4^k k^2)^{1-\delta})$, and $\Omega((n/4^k k^2)^{1-\delta})$, respectively.
arXiv Detail & Related papers (2022-09-04T10:01:39Z)
- Near-optimal fitting of ellipsoids to random points [68.12685213894112]
A basic problem of fitting an ellipsoid to random points has connections to low-rank matrix decompositions, independent component analysis, and principal component analysis.
We resolve this conjecture up to logarithmic factors by constructing a fitting ellipsoid for some $n = \Omega(\, d^2/\mathrm{polylog}(d) \,)$.
Our proof demonstrates feasibility of the least squares construction of Saunderson et al. using a convenient decomposition of a certain non-standard random matrix.
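For intuition about the least-squares construction mentioned above: ellipsoid fitting asks for a symmetric matrix $A$ with $x_i^\top A x_i = 1$ for every sample point, which is linear in the entries of $A$. Below is a minimal numpy sketch of that least-squares step under this standard formulation; the paper's contribution is the analysis showing the construction succeeds, not code like this.

```python
import numpy as np

def fit_ellipsoid_lstsq(X):
    """Least-squares solve for a symmetric A with x^T A x ~= 1 for each row x.
    Each point gives one linear equation in the d(d+1)/2 free entries of A."""
    n, d = X.shape
    iu = np.triu_indices(d)
    coeff = np.where(iu[0] == iu[1], 1.0, 2.0)  # off-diagonal entries count twice
    M = np.array([coeff * np.outer(x, x)[iu] for x in X])
    a, *_ = np.linalg.lstsq(M, np.ones(n), rcond=None)
    A = np.zeros((d, d))
    A[iu] = a
    A = A + A.T - np.diag(np.diag(A))
    return A  # symmetric; a genuine ellipsoid additionally requires A to be PSD

X = np.random.default_rng(1).standard_normal((40, 10))  # n well below d^2
A = fit_ellipsoid_lstsq(X)
print(np.allclose(np.einsum('ni,ij,nj->n', X, A, X), 1.0))  # all constraints met
```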
arXiv Detail & Related papers (2022-08-19T18:00:34Z)
- Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $\widetilde{O}(k/\epsilon^{1/3})$ matrix-vector products.
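As background on the Krylov-subspace approach referenced above, here is a standard randomized block Krylov sketch for rank-$k$ approximation of a symmetric matrix, in the vein of earlier Krylov methods; it illustrates matrix-vector products as the cost unit and is not the paper's improved algorithm.

```python
import numpy as np

def block_krylov_lowrank(A, k, q=5, seed=0):
    """Rank-k approximation of a symmetric matrix A using O(q*k) matrix-vector
    products: build the block Krylov space [G, AG, ..., A^q G], orthonormalize,
    and solve the small projected eigenproblem."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((A.shape[0], k))
    blocks = [G]
    for _ in range(q):
        blocks.append(A @ blocks[-1])
    Q, _ = np.linalg.qr(np.hstack(blocks))  # assumes (q+1)*k <= dimension of A
    w, V = np.linalg.eigh(Q.T @ A @ Q)      # small projected problem
    top = np.argsort(np.abs(w))[-k:]        # top-k eigenvalues by magnitude
    U = Q @ V[:, top]
    return U, w[top]                        # A is approximated by U @ diag(w) @ U.T
```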
arXiv Detail & Related papers (2022-02-10T16:10:41Z)
- Coresets for Decision Trees of Signals [19.537354146654845]
We provide the first algorithm that outputs such a $(k,\varepsilon)$-coreset for every such matrix $D$.
This is done by forging a link between decision trees from machine learning and partition trees in computational geometry.
arXiv Detail & Related papers (2021-10-07T05:49:55Z)
- Self-training Converts Weak Learners to Strong Learners in Mixture Models [86.7137362125503]
We show that a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only $O(d)$ labeled examples.
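To illustrate the loop described above, here is a minimal self-training sketch: fit an initial linear classifier by gradient descent on the logistic loss over a small labeled set, pseudolabel the unlabeled pool, and refit on everything. It is a schematic of the general technique; step sizes and iteration counts are invented for illustration.

```python
import numpy as np

def logistic_gd(X, y, steps=500, lr=0.1):
    """Gradient descent on the logistic loss for a linear classifier; y in {-1, +1}."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ beta)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
        beta -= lr * grad
    return beta

def self_train(X_lab, y_lab, X_unlab):
    """Learn an initial classifier on the labeled data, pseudolabel the
    unlabeled pool, then rerun gradient descent on the combined sample."""
    beta0 = logistic_gd(X_lab, y_lab)        # weak learner from few labels
    y_pl = np.sign(X_unlab @ beta0)          # pseudolabels
    y_pl[y_pl == 0] = 1.0                    # break ties arbitrarily
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pl])
    return logistic_gd(X_all, y_all)         # the pseudolabeler beta_pl
```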
arXiv Detail & Related papers (2021-06-25T17:59:16Z)
- Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu [14.298220510927695]
We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968).
We show that for a specific tree $T$, with $\widetilde{O}(|\Sigma|^2 n \varepsilon^{-1})$ samples from a distribution $P$ over $\Sigma^n$, one can efficiently learn the closest $T$-structured distribution in KL divergence.
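For a fixed tree $T$, learning the closest $T$-structured distribution reduces to estimating the bivariate marginals on the edges of $T$ from counts, as in this sketch over a finite alphabet (the add-one smoothing is an illustrative choice, not from the paper):

```python
import numpy as np

def fit_tree_structured(X, tree_edges, sigma):
    """Fit a T-structured distribution over {0,...,sigma-1}^n from samples X
    (shape: num_samples x n) by estimating edge marginals with add-one smoothing."""
    n = X.shape[1]
    pair, deg = {}, np.zeros(n)
    for i, j in tree_edges:
        counts = np.ones((sigma, sigma))            # add-one smoothing
        np.add.at(counts, (X[:, i], X[:, j]), 1)
        pair[(i, j)] = counts / counts.sum()
        deg[i] += 1
        deg[j] += 1
    marg = [np.bincount(X[:, i], minlength=sigma) + 1.0 for i in range(n)]
    marg = [m / m.sum() for m in marg]
    def prob(x):
        """Tree factorization: prod_edges P(x_i, x_j) / prod_i P(x_i)^(deg(i)-1)."""
        p = 1.0
        for i, j in tree_edges:
            p *= pair[(i, j)][x[i], x[j]]
        for i in range(n):
            p /= marg[i][x[i]] ** (deg[i] - 1)
        return p
    return prob
```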
arXiv Detail & Related papers (2020-11-09T02:08:56Z)
- An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}}\,(1+\log n)^{\ell-1}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
- Optimal Combination of Linear and Spectral Estimators for Generalized Linear Models [59.015960528781115]
We show how to optimally combine $\hat{\boldsymbol x}^{\rm L}$ and $\hat{\boldsymbol x}^{\rm s}$.
In order to establish the limiting distribution of $(\boldsymbol x, \hat{\boldsymbol x}^{\rm L}, \hat{\boldsymbol x}^{\rm s})$, we design and analyze an Approximate Message Passing (AMP) algorithm.
arXiv Detail & Related papers (2020-08-07T18:20:05Z)
- Optimal Coreset for Gaussian Kernel Density Estimation [0.8376091455761259]
Given a point set $P\subset \mathbb{R}^d$, the kernel density estimate of $P$ is defined as \[ \overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P} e^{-\left\lVert x-p \right\rVert^2} \] for any $x\in\mathbb{R}^d$.
We study how to construct a small subset $Q$ of $P$ whose kernel density estimate approximates that of $P$.
arXiv Detail & Related papers (2020-07-15T22:58:50Z)
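The displayed formula above translates directly to code. The sketch below evaluates $\overline{\mathcal{G}}_P(x)$ and compares it with the same estimate computed from a subset $Q$; the uniform random subset here is purely illustrative, whereas the paper constructs a small $Q$ with a guaranteed approximation.

```python
import numpy as np

def gaussian_kde_estimate(P, x):
    """G_P(x) = (1/|P|) * sum over p in P of exp(-||x - p||^2)."""
    return np.exp(-np.sum((P - x) ** 2, axis=1)).mean()

rng = np.random.default_rng(0)
P = rng.standard_normal((10_000, 3))               # point set P in R^3
Q = P[rng.choice(len(P), 200, replace=False)]      # naive subset, for comparison only
x = np.zeros(3)
print(gaussian_kde_estimate(P, x), gaussian_kde_estimate(Q, x))
```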
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.