Fused-Lasso Regularized Cholesky Factors of Large Nonstationary
Covariance Matrices of Longitudinal Data
- URL: http://arxiv.org/abs/2007.11168v1
- Date: Wed, 22 Jul 2020 02:38:16 GMT
- Title: Fused-Lasso Regularized Cholesky Factors of Large Nonstationary
Covariance Matrices of Longitudinal Data
- Authors: Aramayis Dallakyan and Mohsen Pourahmadi
- Abstract summary: Smoothness of the subdiagonals of the Cholesky factor of large covariance matrices is closely related to the degrees of nonstationarity of autoregressive models for time series and longitudinal data.
We propose a block coordinate descent algorithm, with each subdiagonal as a block, for sparse estimation of the Cholesky factor.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smoothness of the subdiagonals of the Cholesky factor of large covariance
matrices is closely related to the degrees of nonstationarity of autoregressive
models for time series and longitudinal data. Heuristically, one expects for a
nearly stationary covariance matrix the entries in each subdiagonal of the
Cholesky factor of its inverse to be nearly the same, in the sense that the sum of
absolute differences of successive terms is small. Statistically, such smoothness is
achieved by regularizing each subdiagonal using fused-type lasso penalties. We
rely on the standard Cholesky factor as the new parameters within a regularized
normal likelihood setup which guarantees: (1) joint convexity of the likelihood
function, (2) strict convexity of the likelihood function restricted to each
subdiagonal even when $n<p$, and (3) positive-definiteness of the estimated
covariance matrix. A block coordinate descent algorithm, where each block is a
subdiagonal, is proposed and its convergence is established under mild
conditions. Lack of decoupling of the penalized likelihood function into a sum
of functions involving individual subdiagonals gives rise to some computational
challenges and advantages relative to two recent algorithms for sparse
estimation of the Cholesky factor which decouple row-wise. Simulation results
and real data analysis show the scope and good performance of the proposed
methodology.
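The penalized objective described above can be illustrated with a short sketch. The Python below is a minimal illustration only, not the authors' implementation: the function name, the use of the sample covariance `S`, and the specific penalty weights `lam1` (sparsity) and `lam2` (fusion across successive subdiagonal entries) are assumptions. It parameterizes the precision matrix by its lower-triangular Cholesky factor and adds fused-type lasso penalties subdiagonal by subdiagonal.

```python
import numpy as np

def penalized_negloglik(L, S, lam1, lam2):
    """Gaussian negative log-likelihood (up to constants) in the lower-triangular
    Cholesky factor L of the precision matrix, Omega = L @ L.T, plus fused-type
    lasso penalties on each subdiagonal of L."""
    p = L.shape[0]
    Omega = L @ L.T
    # -log det(Omega) + tr(S @ Omega); log det computed from the Cholesky diagonal
    nll = -2.0 * np.sum(np.log(np.diag(L))) + np.trace(S @ Omega)
    penalty = 0.0
    for k in range(1, p):                              # k-th subdiagonal of L
        d = np.diag(L, -k)
        penalty += lam1 * np.abs(d).sum()              # sparsity (lasso) term
        penalty += lam2 * np.abs(np.diff(d)).sum()     # smoothness (fusion) term
    return nll + penalty

# Toy usage: evaluate the objective at the identity factor
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
S = np.cov(X, rowvar=False)
print(penalized_negloglik(np.eye(10), S, lam1=0.1, lam2=0.1))
```

A block coordinate descent solver in the spirit of the paper would then cycle over the subdiagonals, minimizing this objective in one subdiagonal at a time while holding the others fixed.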
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) methods have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Near optimal sample complexity for matrix and tensor normal models via
geodesic convexity [5.191641077435773]
We show nonasymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics.
In the same regimes as our sample complexity bounds, we show that an iterative procedure to compute the MLE known as the flip-flop algorithm converges linearly with high probability.
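For reference, the flip-flop iteration for a zero-mean matrix normal model alternates closed-form updates of the row and column covariances. The sketch below is the textbook version under assumed conventions (identity initialization, a fixed iteration count, and the 1/(nq) and 1/(np) scalings); the variant analyzed in the paper may differ in details.

```python
import numpy as np

def flip_flop(X, n_iter=50):
    """Alternating MLE updates for a zero-mean matrix normal model.
    X : (n, p, q) array holding n observed p-by-q matrices.
    Returns the row covariance Sigma (p, p) and column covariance Psi (q, q),
    which are only identified up to a positive scaling traded between them."""
    n, p, q = X.shape
    Sigma, Psi = np.eye(p), np.eye(q)
    for _ in range(n_iter):
        Psi_inv = np.linalg.inv(Psi)
        Sigma = sum(Xi @ Psi_inv @ Xi.T for Xi in X) / (n * q)   # row-covariance update
        Sigma_inv = np.linalg.inv(Sigma)
        Psi = sum(Xi.T @ Sigma_inv @ Xi for Xi in X) / (n * p)   # column-covariance update
    return Sigma, Psi
```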
arXiv Detail & Related papers (2021-10-14T17:47:00Z) - Sparse Quadratic Optimisation over the Stiefel Manifold with Application
to Permutation Synchronisation [71.27989298860481]
We address the non-convex optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
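For orientation only, the classical baseline for computing a dominant eigenspace while staying on the Stiefel manifold is orthogonal (subspace) iteration; the sketch below shows that baseline, not the paper's sparsity-promoting algorithm, and the random initialization and iteration count are assumptions.

```python
import numpy as np

def orthogonal_iteration(A, k, n_iter=100, seed=0):
    """Dominant k-dimensional eigenspace of a symmetric matrix A.
    The QR factorization after each multiplication retracts the iterate
    back onto the Stiefel manifold (orthonormal columns)."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A @ Q)
    return Q  # Q.T @ Q = I_k
```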
arXiv Detail & Related papers (2021-09-30T19:17:35Z) - EiGLasso for Scalable Sparse Kronecker-Sum Inverse Covariance Estimation [1.370633147306388]
We introduce EiGLasso, a highly scalable method for sparse Kronecker-sum inverse covariance estimation.
We show that EiGLasso achieves two to three orders-of-magnitude speed-up compared to the existing methods.
arXiv Detail & Related papers (2021-05-20T16:22:50Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
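A minimal sketch of that embedding step follows; it is the standard random-walk spectral embedding computed through the symmetric normalization, and the choice of K and the final k-means clustering step (not shown) are assumptions rather than details taken from the paper.

```python
import numpy as np

def random_walk_spectral_embedding(A, K):
    """Spectral embedding associated with the random walk Laplacian I - D^{-1} A.
    A : (n, n) symmetric adjacency matrix with positive degrees; K : embedding dimension.
    Works through D^{-1/2} A D^{-1/2} for numerical stability and rescales by D^{-1/2},
    which recovers eigenvectors of D^{-1} A."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_sym = (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(A_sym)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:K]]   # K leading eigenvectors
    return d_inv_sqrt[:, None] * top                  # rows are node embeddings

# Rows of the returned matrix would then be clustered (e.g. with k-means)
# to recover the K communities mentioned above.
```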
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are empirically central to preventing overfitting.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
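To make the comparison concrete, here is a small sketch contrasting constant-stepsize SGD with iterate averaging against the minimum-norm least squares solution in an overparameterized linear regression; the data model, stepsize, and averaging scheme are assumptions chosen for illustration, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                                   # overparameterized: d > n
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

# Constant-stepsize SGD on the squared loss, with a running average of iterates
w, avg, step, t = np.zeros(d), np.zeros(d), 0.005, 0
for _ in range(20):                              # epochs
    for i in rng.permutation(n):
        grad = (X[i] @ w - y[i]) * X[i]          # gradient of 0.5 * (x_i'w - y_i)^2
        w -= step * grad
        t += 1
        avg += (w - avg) / t

# (Unregularized) ordinary least squares: the minimum-norm solution when d > n
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.linalg.norm(avg - w_star), np.linalg.norm(w_ols - w_star))
```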
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers, with convergence guarantees.
Our results demonstrate substantial gains over solving the full problem, for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z) - Explicit Regularization of Stochastic Gradient Methods through Duality [9.131027490864938]
We propose randomized Dykstra-style algorithms based on randomized dual coordinate ascent.
For accelerated coordinate descent, we obtain a new algorithm that has better convergence properties than existing gradient methods in the interpolating regime.
arXiv Detail & Related papers (2020-03-30T20:44:56Z) - A Block Coordinate Descent-based Projected Gradient Algorithm for
Orthogonal Non-negative Matrix Factorization [0.0]
This article utilizes the projected gradient (PG) method for an orthogonal non-negative matrix factorization (ONMF) problem.
We penalise the orthonormality constraints and apply the PG method via a block coordinate descent approach.
arXiv Detail & Related papers (2020-03-23T13:24:43Z)
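A compact sketch of such a scheme is given below; the penalty weight rho, the fixed step sizes, and the random initialization are assumptions for illustration, and the article's actual update rules may differ.

```python
import numpy as np

def onmf_projected_gradient(X, r, rho=1.0, step=1e-3, n_iter=500, seed=0):
    """Orthogonal NMF by block coordinate descent with projected gradient steps.
    Minimizes ||X - W H||_F^2 + rho * ||W.T W - I||_F^2 subject to W, H >= 0,
    i.e. the orthonormality constraint on W is handled through a penalty."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W, H = rng.random((n, r)), rng.random((r, m))
    for _ in range(n_iter):
        grad_W = -2 * (X - W @ H) @ H.T + 4 * rho * W @ (W.T @ W - np.eye(r))
        W = np.maximum(W - step * grad_W, 0.0)        # projected gradient step on W-block
        grad_H = -2 * W.T @ (X - W @ H)
        H = np.maximum(H - step * grad_H, 0.0)        # projected gradient step on H-block
    return W, H
```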