A General Derivative Identity for the Conditional Mean Estimator in
Gaussian Noise and Some Applications
- URL: http://arxiv.org/abs/2104.01883v1
- Date: Mon, 5 Apr 2021 12:48:28 GMT
- Title: A General Derivative Identity for the Conditional Mean Estimator in
Gaussian Noise and Some Applications
- Authors: Alex Dytso, H. Vincent Poor, Shlomo Shamai (Shitz)
- Abstract summary: Several identities in the literature connect $E[{\bf X}|{\bf Y}={\bf y}]$ to other quantities such as the conditional variance, score functions, and higher-order conditional moments.
The objective of this paper is to provide a unifying view of these identities.
- Score: 128.4391178665731
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Consider a channel ${\bf Y}={\bf X}+ {\bf N}$ where ${\bf X}$ is an
$n$-dimensional random vector, and ${\bf N}$ is a Gaussian vector with a
covariance matrix ${\bf \mathsf{K}}_{\bf N}$. The object under consideration in
this paper is the conditional mean of ${\bf X}$ given ${\bf Y}={\bf y}$, that
is ${\bf y} \to E[{\bf X}|{\bf Y}={\bf y}]$. Several identities in the
literature connect $E[{\bf X}|{\bf Y}={\bf y}]$ to other quantities such as the
conditional variance, score functions, and higher-order conditional moments.
The objective of this paper is to provide a unifying view of these identities.
In the first part of the paper, a general derivative identity for the
conditional mean is derived. Specifically, for the Markov chain ${\bf U}
\leftrightarrow {\bf X} \leftrightarrow {\bf Y}$, it is shown that the Jacobian
of $E[{\bf U}|{\bf Y}={\bf y}]$ is given by ${\bf \mathsf{K}}_{{\bf N}}^{-1}
{\bf Cov} ( {\bf X}, {\bf U} | {\bf Y}={\bf y})$.
In the second part of the paper, via various choices of ${\bf U}$, the new
identity is used to generalize many of the known identities and derive some new
ones. First, a simple proof of the Hatsell and Nolte identity for the
conditional variance is given. Second, a simple proof of the recursive identity
due to Jaffer is provided. Third, a new connection between the conditional
cumulants and the conditional expectation is shown. In particular, it is shown
that the $k$-th derivative of $E[X|Y=y]$ is proportional to the $(k+1)$-th
conditional cumulant.
The third part of the paper considers some applications. In a first
application, the power series and the compositional inverse of $E[X|Y=y]$ are
derived. In a second application, the distribution of the estimator error
$(X-E[X|Y])$ is derived. In a third application, we construct consistent
estimators (empirical Bayes estimators) of the conditional cumulants from an
i.i.d. sequence $Y_1,...,Y_n$.
Related papers
- An advance in the arithmetic of the Lie groups as an alternative to the
forms of the Campbell-Baker-Hausdorff-Dynkin theorem [0.7373617024876725]
The exponential of an operator or matrix is widely used in quantum theory, but it sometimes can be a challenge to evaluate.
Here it is proven that $\mathrm{e}^{a{\bf X}+b{\bf Y}}$ is equivalent to $\mathrm{e}^{p{\bf Z}}\,\mathrm{e}^{q{\bf X}}\,\mathrm{e}^{-p{\bf Z}}$ for scalar $p$ and $q$.
arXiv Detail & Related papers (2024-01-28T19:20:02Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Misspecified Phase Retrieval with Generative Priors [15.134280834597865]
We estimate an $n$-dimensional signal $\mathbf{x}$ from $m$ i.i.d. realizations of the single index model for $y$.
We show that both steps enjoy a statistical rate of order $\sqrt{(k\log L)\cdot(\log m)/m}$ under suitable conditions.
arXiv Detail & Related papers (2022-10-11T16:04:11Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Private Convex Optimization via Exponential Mechanism [16.867534746193833]
We show that modifying the exponential mechanism by adding an $\ell_2^2$ regularizer to $F(x)$ recovers both the known optimal empirical risk and population loss under $(\epsilon,\delta)$-DP.
We also show how to implement this mechanism using $\widetilde{O}(n \min(d, n))$ queries to $f_i(x)$ for DP-SCO, where $n$ is the number of samples/users and $d$ is the ambient dimension.
arXiv Detail & Related papers (2022-03-01T06:51:03Z) - The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning [3.8098187557917464]
The paper concerns the $d$-dimensional stochastic approximation recursion $$\theta_{n+1} = \theta_n + \alpha_{n+1} f(\theta_n, \Phi_{n+1}),$$ where $\Phi_n$ is a process on a general state space.
The main results are established under additional conditions on the mean flow and a version of the Donsker-Varadhan Lyapunov drift condition known as (DV3): (i) An appropriate Lyapunov
arXiv Detail & Related papers (2021-10-27T13:38:25Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. entries of ${\bf w}$.
We propose a novel random-features technique called Ternary Random Features (TRF).
Computing the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n$, $d$, and $\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m=O(d+\sqrt{d}/\epsilon)$.
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $({\bf x}, y)$ are generated where $y = {\bf x}^\top \Omega^* + \mathcal{N}(0,1)$, and $({\bf x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}^d$.
arXiv Detail & Related papers (2020-06-17T09:21:00Z)