Confidence Sets under Generalized Self-Concordance
- URL: http://arxiv.org/abs/2301.00260v1
- Date: Sat, 31 Dec 2022 17:45:11 GMT
- Title: Confidence Sets under Generalized Self-Concordance
- Authors: Lang Liu and Zaid Harchaoui
- Abstract summary: This paper revisits a fundamental problem in statistical inference from a non-asymptotic theoretical viewpoint.
We establish a finite-sample bound for the estimator, characterizing its behavior in a non-asymptotic fashion.
An important feature of the bound is that its dimension dependency is captured by the effective dimension.
- Score: 2.0305676256390934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper revisits a fundamental problem in statistical inference from a
non-asymptotic theoretical viewpoint – the construction of
confidence sets. We establish a finite-sample bound for the estimator,
characterizing its asymptotic behavior in a non-asymptotic fashion. An
important feature of our bound is that its dimension dependency is captured by
the effective dimension – the trace of the limiting sandwich
covariance – which can be much smaller than the parameter
dimension in some regimes. We then illustrate how the bound can be used to
obtain a confidence set whose shape is adapted to the optimization landscape
induced by the loss function. Unlike previous works that rely heavily on the
strong convexity of the loss function, we only assume the Hessian is lower
bounded at the optimum and allow it to gradually become degenerate. This property
is formalized by the notion of generalized self-concordance which originated
from convex optimization. Moreover, we demonstrate how the effective dimension
can be estimated from data and characterize its estimation accuracy. We apply
our results to maximum likelihood estimation with generalized linear models,
score matching with exponential families, and hypothesis testing with Rao's
score test.
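The abstract notes that the effective dimension is the trace of the limiting sandwich covariance and can be estimated from data. The sketch below is an illustrative plug-in computation for logistic regression (not the paper's exact estimator): it fits the model by Newton steps, forms the empirical Hessian H and gradient covariance G, and takes the trace of H⁻¹GH⁻¹. All names and the synthetic data are our own assumptions.

```python
import numpy as np

# Hypothetical plug-in sketch of the effective dimension
# d_eff = tr(H^{-1} G H^{-1}) for logistic regression.

rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.normal(size=(n, d))
theta_true = np.zeros(d)
theta_true[0] = 1.0
p = 1.0 / (1.0 + np.exp(-X @ theta_true))
y = rng.binomial(1, p)

# Fit by a few Newton steps on the average logistic log-loss.
theta = np.zeros(d)
for _ in range(20):
    mu = 1.0 / (1.0 + np.exp(-X @ theta))
    grad = X.T @ (mu - y) / n                    # average gradient
    H = (X * (mu * (1 - mu))[:, None]).T @ X / n  # average Hessian
    theta -= np.linalg.solve(H, grad)

# Per-sample gradients g_i = (mu_i - y_i) x_i; G = average outer product.
mu = 1.0 / (1.0 + np.exp(-X @ theta))
g = X * (mu - y)[:, None]
G = g.T @ g / n
H = (X * (mu * (1 - mu))[:, None]).T @ X / n

# Sandwich covariance and its trace, the (estimated) effective dimension.
Hinv = np.linalg.inv(H)
sandwich = Hinv @ G @ Hinv
d_eff = np.trace(sandwich)
print(d_eff)
```

In the well-specified regime G ≈ H, so the sandwich reduces to H⁻¹ and the effective dimension is governed by the Hessian's spectrum rather than the raw parameter dimension d.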
Related papers
- Optimal convex $M$-estimation via score matching [6.115859302936817]
We construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal variance in the downstream estimation of the regression coefficients.
Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution.
arXiv Detail & Related papers (2024-03-25T12:23:19Z) - Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference [9.940560505044122]
We propose a method to improve the efficiency and accuracy of amortized Bayesian inference.
We estimate the marginal likelihood based on approximate representations of the joint model.
arXiv Detail & Related papers (2023-10-06T17:41:41Z) - Adaptive Linear Estimating Equations [5.985204759362746]
In this paper, we propose a general method for constructing debiased estimator.
It makes use of the idea of adaptive linear estimating equations, and we establish theoretical guarantees of normality.
A salient feature of our estimator is that in the context of multi-armed bandits, our estimator retains the non-asymptotic performance.
arXiv Detail & Related papers (2023-07-14T12:55:47Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Online Bootstrap Inference with Nonconvex Stochastic Gradient Descent
Estimator [0.0]
In this paper, we investigate the theoretical properties of stochastic gradient descent (SGD) estimators for statistical inference in the context of nonconvex problems.
We propose two bootstrap-based inferential procedures for loss functions that may contain multiple local minima.
arXiv Detail & Related papers (2023-06-03T22:08:10Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z) - Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a
Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z) - Sharp Asymptotics and Optimal Performance for Inference in Binary Models [41.7567932118769]
We study convex empirical risk for high-dimensional inference in binary models.
For binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one.
arXiv Detail & Related papers (2020-02-17T22:32:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.