Model-free filtering in high dimensions via projection and score-based diffusions
- URL: http://arxiv.org/abs/2510.23197v1
- Date: Mon, 27 Oct 2025 10:34:46 GMT
- Title: Model-free filtering in high dimensions via projection and score-based diffusions
- Authors: Sören Christensen, Jan Kallsen, Claudia Strauch, Lukas Trottner,
- Abstract summary: We compute an estimator of the metric projection $pi_mathscrM(Y)$ of $Y$ onto the manifold $mathscrM$.<n>Our main theoretical results show that, in the limit of high dimension $d$, this posterior $mathbbPXmid Y$ is concentrated near the desired metric projection.
- Score: 1.3066182802188202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of recovering a latent signal $X$ from its noisy observation $Y$. The unknown law $\mathbb{P}^X$ of $X$, and in particular its support $\mathscr{M}$, are accessible only through a large sample of i.i.d.\ observations. We further assume $\mathscr{M}$ to be a low-dimensional submanifold of a high-dimensional Euclidean space $\mathbb{R}^d$. As a filter or denoiser $\widehat X$, we suggest an estimator of the metric projection $\pi_{\mathscr{M}}(Y)$ of $Y$ onto the manifold $\mathscr{M}$. To compute this estimator, we study an auxiliary semiparametric model in which $Y$ is obtained by adding isotropic Laplace noise to $X$. Using score matching within a corresponding diffusion model, we obtain an estimator of the Bayesian posterior $\mathbb{P}^{X \mid Y}$ in this setup. Our main theoretical results show that, in the limit of high dimension $d$, this posterior $\mathbb{P}^{X\mid Y}$ is concentrated near the desired metric projection $\pi_{\mathscr{M}}(Y)$.
Related papers
- Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination [65.37519531362157]
We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $tildeOmega(d1/2/alpha2)$.
arXiv Detail & Related papers (2025-10-12T15:42:44Z) - Spike-and-Slab Posterior Sampling in High Dimensions [11.458504242206862]
Posterior sampling with the spike-and-slab prior [MB88] is considered the theoretical gold standard method for Bayesian sparse linear regression.<n>We give the first provable algorithms for spike-and-slab posterior sampling that apply for any SNR, and use a measurement count sub in the problem dimension.<n>We extend our result to spike-and-slab posterior sampling with Laplace diffuse densities, achieving similar guarantees when $sigma = O(frac1k)$ is bounded.
arXiv Detail & Related papers (2025-03-04T17:16:07Z) - Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$<n>We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Provably learning a multi-head attention layer [55.2904547651831]
Multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks [8.74634652691576]
We consider a cloud of $n data points in $mathbbRd$, consider all projections onto $m$-dimensional subspaces of $mathbbRd$ and, for each such projection, the empirical distribution of the projected points.<n>What does this collection of probability distributions look like when $n,d$ grow large?<n>We prove sharp bounds in terms of Kullback-Leibler divergence and R'enyi information dimension.
arXiv Detail & Related papers (2022-06-14T00:07:33Z) - Beyond Independent Measurements: General Compressed Sensing with GNN
Application [4.924126492174801]
We consider the problem of recovering a structured signal $mathbfx in mathbbRn$ from noisy cone observations.
We show that the effective rank of $mathbfB$ may be used as a surrogate for the number of measurements.
arXiv Detail & Related papers (2021-10-30T20:35:56Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$.
Here, we show that $|mathbb E[R(z)] - tilde R(z)|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Non-Parametric Estimation of Manifolds from Noisy Data [1.0152838128195467]
We consider the problem of estimating a $d$ dimensional sub-manifold of $mathbbRD$ from a finite set of noisy samples.
We show that the estimation yields rates of convergence of $n-frack2k + d$ for the point estimation and $n-frack-12k + d$ for the estimation of tangent space.
arXiv Detail & Related papers (2021-05-11T02:29:33Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n,d,delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(epsilon,delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator $(epsilon,delta)$-unbiased for $(Atop A)-1$ with a sketch of size $m=O(d+sqrt d/
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $(bf x,y)$ are generated where $y = bf xtopOmega* + mathcalN(0,1)$ and $(bf x, y)$ is seen only if $y$ belongs to a truncation set $S subseteq mathbbRd$.
arXiv Detail & Related papers (2020-06-17T09:21:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.