A Class of Two-Timescale Stochastic EM Algorithms for Nonconvex Latent
Variable Models
- URL: http://arxiv.org/abs/2203.10186v1
- Date: Fri, 18 Mar 2022 22:46:34 GMT
- Title: A Class of Two-Timescale Stochastic EM Algorithms for Nonconvex Latent
Variable Models
- Authors: Belhal Karimi and Ping Li
- Abstract summary: The Expectation-Maximization (EM) algorithm is a popular choice for learning variable models.
In this paper, we propose a general class of methods called Two-Time Methods.
- Score: 21.13011760066456
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Expectation-Maximization (EM) algorithm is a popular choice for learning
latent variable models. Variants of the EM have been initially introduced,
using incremental updates to scale to large datasets, and using Monte Carlo
(MC) approximations to bypass the intractable conditional expectation of the
latent data for most nonconvex models. In this paper, we propose a general
class of methods called Two-Timescale EM Methods based on a two-stage approach
of stochastic updates to tackle an essential nonconvex optimization task for
latent variable models. We motivate the choice of a double dynamic by invoking
the variance reduction virtue of each stage of the method on both sources of
noise: the index sampling for the incremental update and the MC approximation.
We establish finite-time and global convergence bounds for nonconvex objective
functions. Numerical applications on various models such as deformable template
for image analysis or nonlinear models for pharmacokinetics are also presented
to illustrate our findings.
Related papers
- Variational Neural Stochastic Differential Equations with Change Points [4.692174333076032]
We explore modeling change points in time-series data using neural differential equations (neural SDEs)
We propose a novel model formulation and training procedure based on the variational autoencoder (VAE) framework for modeling time-series as a neural SDE.
We present an empirical evaluation that demonstrates the expressive power of our proposed model, showing that it can effectively model both classical parametric SDEs and some real datasets with distribution shifts.
arXiv Detail & Related papers (2024-11-01T14:46:17Z) - Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z) - Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z) - Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z) - Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for feature extraction of two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z) - Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Estimation of Switched Markov Polynomial NARX models [75.91002178647165]
We identify a class of models for hybrid dynamical systems characterized by nonlinear autoregressive (NARX) components.
The proposed approach is demonstrated on a SMNARX problem composed by three nonlinear sub-models with specific regressors.
arXiv Detail & Related papers (2020-09-29T15:00:47Z) - SODEN: A Scalable Continuous-Time Survival Model through Ordinary
Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
arXiv Detail & Related papers (2020-08-19T19:11:25Z) - Analysis of Bayesian Inference Algorithms by the Dynamical Functional
Approach [2.8021833233819486]
We analyze an algorithm for approximate inference with large Gaussian latent variable models in a student-trivial scenario.
For the case of perfect data-model matching, the knowledge of static order parameters derived from the replica method allows us to obtain efficient algorithmic updates.
arXiv Detail & Related papers (2020-01-14T17:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.