A Computationally Efficient Method for Learning Exponential Family
Distributions
- URL: http://arxiv.org/abs/2110.15397v1
- Date: Thu, 28 Oct 2021 18:42:04 GMT
- Title: A Computationally Efficient Method for Learning Exponential Family
Distributions
- Authors: Abhin Shah, Devavrat Shah, Gregory W. Wornell
- Abstract summary: We consider the question of learning the natural parameters of a $k$ parameter exponential family from i.i.d. samples in a computationally and statistically efficient manner.
We show that our method can be viewed as the maximum likelihood estimation of a re-parameterized distribution belonging to the same class of exponential family.
- Score: 29.289136623702056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the question of learning the natural parameters of a $k$
parameter minimal exponential family from i.i.d. samples in a computationally
and statistically efficient manner. We focus on the setting where the support
as well as the natural parameters are appropriately bounded. While the
traditional maximum likelihood estimator for this class of exponential family
is consistent, asymptotically normal, and asymptotically efficient, evaluating
it is computationally hard. In this work, we propose a computationally
efficient estimator that is consistent as well as asymptotically normal under
mild conditions. We provide finite sample guarantees to achieve an ($\ell_2$)
error of $\alpha$ in the parameter estimation with sample complexity
$O(\mathrm{poly}(k/\alpha))$ and computational complexity
${O}(\mathrm{poly}(k/\alpha))$. To establish these results, we show that, at
the population level, our method can be viewed as the maximum likelihood
estimation of a re-parameterized distribution belonging to the same class of
exponential family.
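The abstract does not spell out the estimator itself, but a standard example of a computationally efficient alternative to exact MLE for exponential families is Hyvärinen score matching, which several of the related papers below analyze: for a density $p_\theta(x) \propto \exp(\theta^\top T(x))$, minimizing the score matching loss reduces to solving a single linear system in $\theta$, with no partition function involved. The sketch below illustrates this general technique for a 1-D Gaussian written in natural parameters; it is an illustration under these assumptions, not the method proposed in this paper.

```python
import numpy as np

# Exponential family: p_theta(x) ∝ exp(theta1 * x + theta2 * x^2),
# i.e. a Gaussian with natural parameters theta1 = mu / sigma^2 and
# theta2 = -1 / (2 sigma^2). Sufficient statistics: T(x) = (x, x^2).
rng = np.random.default_rng(0)
mu, sigma = 2.0, 1.5
x = rng.normal(mu, sigma, size=200_000)

# Hyvarinen score matching minimizes E[ 0.5 * (d/dx log p)^2 + d^2/dx^2 log p ].
# With log p = theta . T(x) + const, the first-order condition is the
# linear system  E[T'(x) T'(x)^T] theta = -E[T''(x)].
Tp = np.stack([np.ones_like(x), 2 * x])   # T'(x) = (1, 2x), shape (2, n)
Tpp = np.array([0.0, 2.0])                # T''(x) = (0, 2), constant in x

A = Tp @ Tp.T / x.size                    # empirical E[T' T'^T]
b = -Tpp                                  # -E[T'']
theta_hat = np.linalg.solve(A, b)

theta_true = np.array([mu / sigma**2, -1 / (2 * sigma**2)])
print(theta_hat, theta_true)              # estimates ≈ (0.889, -0.222)
```

Because the objective is quadratic in $\theta$, the estimator never touches the normalizing constant, which is precisely the computational bottleneck of exact MLE for general exponential families.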
Related papers
- On Computationally Efficient Learning of Exponential Family
Distributions [33.229944519289795]
We focus on the setting where the support as well as the natural parameters are appropriately bounded.
Our method achieves the order-optimal sample complexity of $O(\log(k)/\alpha^2)$ when tailored for node-wise-sparse random fields.
arXiv Detail & Related papers (2023-09-12T17:25:32Z)
- Provable benefits of score matching [30.317535687908755]
We give the first example of a natural exponential family of distributions such that score matching loss is computationally efficient to optimize.
We show that designing a zeroth-order or first-order oracle for optimizing the likelihood loss is NP-hard.
Minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
arXiv Detail & Related papers (2023-06-03T03:42:30Z)
- Large Dimensional Independent Component Analysis: Statistical Optimality
and Computational Tractability [13.104413212606577]
We investigate the optimal statistical performance and the impact of computational constraints for independent component analysis (ICA).
We show that the optimal sample complexity is linear in dimensionality.
We develop computationally tractable estimates that attain both the optimal sample complexity and minimax optimal rates of convergence.
arXiv Detail & Related papers (2023-03-31T15:46:30Z)
- Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the population (infinite-sample) regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Asymptotic Bounds for Smoothness Parameter Estimates in Gaussian Process
Interpolation [3.8979646385036175]
The smoothness of a Matérn kernel determines many important properties of the model in the large data limit.
We prove that the maximum likelihood estimate of the smoothness parameter cannot asymptotically understate the true smoothness.
We show that maximum likelihood estimation recovers the true smoothness for a class of compactly supported self-similar functions.
arXiv Detail & Related papers (2022-03-10T14:45:57Z)
- On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used stochastic estimator based on moving average (SEMA) for non-convex optimization problems.
We also present state-of-the-art results for a range of these problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z)
- Reinforcement Learning with General Value Function Approximation:
Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis [102.29671176698373]
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$ error under a generative model.
We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms.
arXiv Detail & Related papers (2020-03-16T17:15:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.