Beyond Coordinates: Meta-Equivariance in Statistical Inference
- URL: http://arxiv.org/abs/2504.10667v1
- Date: Mon, 14 Apr 2025 19:40:39 GMT
- Title: Beyond Coordinates: Meta-Equivariance in Statistical Inference
- Authors: William Cook
- Abstract summary: Optimal statistical decisions should transcend the language used to describe them. How do we guarantee that the choice of coordinates does not subtly dictate the solution? We first analyse the optimal combination of two asymptotically normal estimators under a strictly convex trace-AMSE risk.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimal statistical decisions should transcend the language used to describe them. Yet, how do we guarantee that the choice of coordinates - the parameterisation of an optimisation problem - does not subtly dictate the solution? This paper reveals a fundamental geometric invariance principle. We first analyse the optimal combination of two asymptotically normal estimators under a strictly convex trace-AMSE risk. While methods for finding optimal weights are known, we prove that the resulting optimal estimator is invariant under direct affine reparameterisations of the weighting scheme. This exemplifies a broader principle we term meta-equivariance: the unique minimiser of any strictly convex, differentiable scalar objective over a matrix space transforms covariantly under any invertible affine reparameterisation of that space. Distinct from classical statistical equivariance tied to data symmetries, meta-equivariance arises from the immutable geometry of convex optimisation itself. It guarantees that optimality, in these settings, is not an artefact of representation but an intrinsic, coordinate-free truth.
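As a concrete illustration of the claimed invariance, consider combining two independent $p$-dimensional estimators with asymptotic covariances $S_1, S_2$ via $W\hat\theta_1 + (I-W)\hat\theta_2$. The trace risk $R(W) = \mathrm{tr}(W S_1 W^\top + (I-W) S_2 (I-W)^\top)$ is strictly convex with unique minimiser $W^* = S_2(S_1+S_2)^{-1}$, and meta-equivariance predicts that minimising the pulled-back risk $R(AVB + C)$ over $V$, for any invertible affine reparameterisation of the weight space, recovers the same optimum: $AV^*B + C = W^*$. The sketch below checks this numerically; it is illustrative only, and the particular matrices, risk, and reparameterisation are assumptions, not taken from the paper.

```python
# Minimal numerical sketch of meta-equivariance (illustrative; the matrices,
# risk, and reparameterisation are assumptions, not taken from the paper).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
p = 3

def rand_spd(p):
    """Random symmetric positive-definite matrix."""
    M = rng.standard_normal((p, p))
    return M @ M.T + p * np.eye(p)

S1, S2 = rand_spd(p), rand_spd(p)  # asymptotic covariances of the two estimators

def risk(W):
    """Strictly convex trace risk of the combination W*th1 + (I - W)*th2."""
    I = np.eye(p)
    return np.trace(W @ S1 @ W.T + (I - W) @ S2 @ (I - W).T)

# Closed-form minimiser: setting dR/dW = 2(W S1 - (I - W) S2) = 0
# gives W* = S2 (S1 + S2)^{-1}.
W_star = S2 @ np.linalg.inv(S1 + S2)

# Invertible affine reparameterisation of the weight space: W = A V B + C.
A, B, C = rand_spd(p), rand_spd(p), rng.standard_normal((p, p))
to_W = lambda V: A @ V @ B + C

# Minimise the pulled-back risk over V (a strictly convex quadratic in V).
obj = lambda v: risk(to_W(v.reshape(p, p)))
V_star = minimize(obj, np.zeros(p * p), method="BFGS").x.reshape(p, p)

# The minimiser transforms covariantly: mapping V* back through the
# reparameterisation recovers W*, so the optimal estimator is unchanged.
print(np.allclose(to_W(V_star), W_star, atol=1e-5))  # expected: True
```

Nothing here depends on the specific $S_1, S_2, A, B, C$: strict convexity alone pins down a unique optimum, and the affine change of coordinates merely relabels it, which is the content of the meta-equivariance principle.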
Related papers
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate. We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Stochastic Hessian Fittings with Lie Groups [6.626539885456148]
The Hessian fitting criterion from the preconditioned stochastic gradient descent (PSGD) method is studied.
Hessian fitting, viewed as an optimization problem, is strongly convex under mild conditions on certain general Lie groups.
This discovery turns Hessian fitting into a well-behaved Lie group optimization problem.
arXiv Detail & Related papers (2024-02-19T06:00:35Z) - Convex Parameter Estimation of Perturbed Multivariate Generalized
Gaussian Distributions [18.95928707619676]
We propose a convex formulation with well-established properties for MGGD parameters.
The proposed framework is flexible as it combines a variety of regularizations for the precision matrix, the mean and perturbations.
Experiments show more accurate precision and covariance matrix estimation, with similar performance for the mean vector parameter.
arXiv Detail & Related papers (2023-12-12T18:08:04Z) - Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing [28.91482208876914]
We consider the problem of parameter estimation in a high-dimensional generalized linear model.
Despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured designs.
arXiv Detail & Related papers (2023-08-28T11:49:23Z) - Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, MGVBP (Manifold Gaussian Variational Bayes on the Precision matrix) stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z) - Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator [75.05106948314956]
We show that an increasingly large momentum parameter for the first-order moment is sufficient for adaptive scaling. We also give insights into increasing the momentum in a stagewise manner, in accordance with a stagewise decreasing step size.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - The computational asymptotics of Gaussian variational inference and the Laplace approximation [19.366538729532856]
We provide a theoretical analysis of the convexity properties of variational inference with a Gaussian family.
We show, with scaled real-data examples, that both CSVI and CSV improve the likelihood of obtaining the global optimum of their respective optimization problems.
arXiv Detail & Related papers (2021-04-13T01:23:34Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)