A non-asymptotic model selection in block-diagonal mixture of polynomial
experts models
- URL: http://arxiv.org/abs/2104.08959v1
- Date: Sun, 18 Apr 2021 21:32:20 GMT
- Title: A non-asymptotic model selection in block-diagonal mixture of polynomial
experts models
- Authors: TrungTin Nguyen, Faicel Chamroukhi, Hien Duy Nguyen, Florence Forbes
- Abstract summary: We introduce a penalized maximum likelihood selection criterion to estimate the unknown conditional density of a regression model.
We provide a strong theoretical guarantee, including a finite-sample oracle inequality satisfied by the penalized maximum likelihood estimator with a Jensen-Kullback-Leibler type loss.
- Score: 1.491109220586182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model selection via penalized likelihood type criteria is a standard task in
many statistical inference and machine learning problems. It has led to
deriving criteria with asymptotic consistency results and an increasing
emphasis on introducing non-asymptotic criteria. We focus on the problem of
modeling non-linear relationships in regression data with potential hidden
graph-structured interactions between the high-dimensional predictors, within
the mixture of experts modeling framework. In order to deal with such a complex
situation, we investigate a block-diagonal localized mixture of polynomial
experts (BLoMPE) regression model, which is constructed upon an inverse
regression and block-diagonal structures of the Gaussian expert covariance
matrices. We introduce a penalized maximum likelihood selection criterion to
estimate the unknown conditional density of the regression model. This model
selection criterion allows us to handle the challenging problem of inferring
the number of mixture components, the degree of polynomial mean functions, and
the hidden block-diagonal structures of the covariance matrices, which reduces
the number of parameters to be estimated and leads to a trade-off between
complexity and sparsity in the model. In particular, we provide a strong
theoretical guarantee: a finite-sample oracle inequality satisfied by the
penalized maximum likelihood estimator with a Jensen-Kullback-Leibler type
loss, to support the introduced non-asymptotic model selection criterion. The
penalty shape of this criterion depends on the complexity of the considered
random subcollection of BLoMPE models, including the relevant graph structures,
the degree of polynomial mean functions, and the number of mixture components.
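The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's criterion: the abstract only states that the penalty depends on model complexity, so the linear-in-dimension penalty, the constant `kappa`, the function names, and all numbers below are assumptions made for illustration. The only fact used is that a symmetric block of size d has d(d+1)/2 free entries, which is why a block-diagonal covariance needs fewer parameters than a full one.

```python
from typing import Dict, List, Tuple


def block_diag_cov_params(block_sizes: List[int]) -> int:
    """Free parameters of a symmetric block-diagonal matrix.

    Each block of size d contributes d * (d + 1) / 2 free entries.
    """
    return sum(d * (d + 1) // 2 for d in block_sizes)


def select_model(
    neg_log_lik: Dict[str, float],
    dims: Dict[str, int],
    n: int,
    kappa: float = 1.0,
) -> Tuple[str, Dict[str, float]]:
    """Return the candidate minimizing neg_log_lik / n + kappa * dim / n.

    The penalty shape and kappa are illustrative assumptions, not the
    penalty derived in the paper.
    """
    crit = {m: neg_log_lik[m] / n + kappa * dims[m] / n for m in neg_log_lik}
    best = min(crit, key=crit.get)
    return best, crit


# Hypothetical comparison for 6 predictors: one full block vs. three 2x2 blocks.
dims = {
    "full": block_diag_cov_params([6]),
    "blocks": block_diag_cov_params([2, 2, 2]),
}
# Made-up negative log-likelihoods on n = 100 observations.
neg_log_lik = {"full": 1000.0, "blocks": 1005.0}
best, crit = select_model(neg_log_lik, dims, n=100, kappa=1.0)
```

With these made-up values the block-diagonal candidate fits slightly worse but is selected because it has fewer covariance parameters. Setting `kappa=0.0` removes the penalty and the full model wins.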
Related papers
- Statistical ranking with dynamic covariates [6.729750785106628]
We introduce an efficient alternating algorithm to compute the maximum likelihood estimator (MLE).
A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
arXiv Detail & Related papers (2024-06-24T10:26:05Z)
- A Unified Analysis of Multi-task Functional Linear Regression Models with Manifold Constraint and Composite Quadratic Penalty [0.0]
The power of multi-task learning is brought in by imposing additional structures over the slope functions.
We show the composite penalty induces a specific norm, which helps to quantify the manifold curvature.
A unified convergence upper bound is obtained and specifically applied to the reduced-rank model and the graph Laplacian regularized model.
arXiv Detail & Related papers (2022-11-09T13:32:23Z)
- Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- Sparse Bayesian Learning for Complex-Valued Rational Approximations [0.03392423750246091]
Surrogate models are used to alleviate the computational burden in engineering tasks.
These models show a strongly non-linear dependence on their input parameters.
We apply a sparse learning approach to the rational approximation.
arXiv Detail & Related papers (2022-06-06T12:06:13Z)
- A Variational Inference Approach to Inverse Problems with Gamma Hyperpriors [60.489902135153415]
This paper introduces a variational iterative alternating scheme for hierarchical inverse problems with gamma hyperpriors.
The proposed variational inference approach yields accurate reconstruction, provides meaningful uncertainty quantification, and is easy to implement.
arXiv Detail & Related papers (2021-11-26T06:33:29Z)
- A non-asymptotic penalization criterion for model selection in mixture of experts models [1.491109220586182]
We consider the Gaussian-gated localized MoE (GLoME) regression model for modeling heterogeneous data.
This model poses challenging questions with respect to the statistical estimation and model selection problems.
We study the problem of estimating the number of components of the GLoME model, in a penalized maximum likelihood estimation framework.
arXiv Detail & Related papers (2021-04-06T16:24:55Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)
- Estimation of Switched Markov Polynomial NARX models [75.91002178647165]
We identify a class of models for hybrid dynamical systems characterized by nonlinear autoregressive (NARX) components.
The proposed approach is demonstrated on a SMNARX problem composed of three nonlinear sub-models with specific regressors.
arXiv Detail & Related papers (2020-09-29T15:00:47Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Blocked Clusterwise Regression [0.0]
We generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple latent variables.
We contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
arXiv Detail & Related papers (2020-01-29T23:29:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.