Statistical optimality and stability of tangent transform algorithms in
logit models
- URL: http://arxiv.org/abs/2010.13039v1
- Date: Sun, 25 Oct 2020 05:15:13 GMT
- Title: Statistical optimality and stability of tangent transform algorithms in
logit models
- Authors: Indrajit Ghosh, Anirban Bhattacharya and Debdeep Pati
- Abstract summary: We provide conditions on the data generating process to derive non-asymptotic upper bounds to the risk incurred by the logistical optima.
In particular, we establish local variation of the algorithm without any assumptions on the data-generating process.
We explore a special case involving a semi-orthogonal design under which a global convergence is obtained.
- Score: 6.9827388859232045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A systematic approach to finding variational approximation in an otherwise
intractable non-conjugate model is to exploit the general principle of convex
duality by minorizing the marginal likelihood that renders the problem
tractable. While such approaches are popular in the context of variational
inference in non-conjugate Bayesian models, theoretical guarantees on
statistical optimality and algorithmic convergence are lacking. Focusing on
logistic regression models, we provide mild conditions on the data generating
process to derive non-asymptotic upper bounds to the risk incurred by the
variational optima. We demonstrate that these assumptions can be completely
relaxed if one considers a slight variation of the algorithm by raising the
likelihood to a fractional power. Next, we utilize the theory of dynamical
systems to provide convergence guarantees for such algorithms in logistic and
multinomial logit regression. In particular, we establish local asymptotic
stability of the algorithm without any assumptions on the data-generating
process. We explore a special case involving a semi-orthogonal design under
which a global convergence is obtained. The theory is further illustrated using
several numerical studies.
Related papers
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the proximal point method (SPPM)
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - kNN Algorithm for Conditional Mean and Variance Estimation with
Automated Uncertainty Quantification and Variable Selection [8.429136647141487]
We introduce a kNN-based regression method that synergizes the scalability and adaptability of traditional non-parametric kNN models.
This method focuses on accurately estimating the conditional mean and variance of random response variables.
It is particularly notable in biomedical applications as demonstrated in two case studies.
arXiv Detail & Related papers (2024-02-02T18:54:18Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involve optimizing a data-driven risk criterion.
We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences.
For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
arXiv Detail & Related papers (2024-01-28T21:19:15Z) - Semi-Parametric Inference for Doubly Stochastic Spatial Point Processes: An Approximate Penalized Poisson Likelihood Approach [3.085995273374333]
Doubly-stochastic point processes model the occurrence of events over a spatial domain as an inhomogeneous process conditioned on the realization of a random intensity function.
Existing implementations of doubly-stochastic spatial models are computationally demanding, often have limited theoretical guarantee, and/or rely on restrictive assumptions.
arXiv Detail & Related papers (2023-06-11T19:48:39Z) - Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm is a distributed Bayesian filtering task for finite-state hidden Markov models.
It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded on the complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one hidden/layered neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Posterior-Aided Regularization for Likelihood-Free Inference [23.708122045184698]
Posterior-Aided Regularization (PAR) is applicable to learning the density estimator, regardless of the model structure.
We provide a unified estimation method of PAR to estimate both reverse KL term and mutual information term with a single neural network.
arXiv Detail & Related papers (2021-02-15T16:59:30Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
empirical optimization is central to modern machine learning, but its role in its success is still unclear.
We show that it commonly arises in parameters of discrete multiplicative noise due to variance.
A detailed analysis is conducted in which we describe on key factors, including recent step size, and data, all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on interplay between the deterministic convergence rate of the algorithm at the population level, and its degree of (instability) when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Distributed Stochastic Nonconvex Optimization and Learning based on
Successive Convex Approximation [26.11677569331688]
We introduce a novel framework for the distributed algorithmic minimization of the sum of the sum of the agents in a network.
We show that the proposed method can be applied to distributed neural networks.
arXiv Detail & Related papers (2020-04-30T15:36:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.