Supervised Multivariate Learning with Simultaneous Feature Auto-grouping
and Dimension Reduction
- URL: http://arxiv.org/abs/2112.09746v1
- Date: Fri, 17 Dec 2021 20:11:20 GMT
- Title: Supervised Multivariate Learning with Simultaneous Feature Auto-grouping
and Dimension Reduction
- Authors: Yiyuan She, Jiahui Shen, Chao Zhang
- Abstract summary: This paper proposes a novel clustered reduced-rank learning framework.
It imposes two joint matrix regularizations to automatically group the features in constructing predictive factors.
It is more interpretable than low-rank modeling and relaxes the stringent sparsity assumption in variable selection.
- Score: 7.093830786026851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern high-dimensional methods often adopt the "bet on sparsity"
principle, while in supervised multivariate learning statisticians may face
"dense" problems with a large number of nonzero coefficients. This paper
proposes a novel clustered reduced-rank learning (CRL) framework that imposes
two joint matrix regularizations to automatically group the features in
constructing predictive factors. CRL is more interpretable than low-rank
modeling and relaxes the stringent sparsity assumption in variable selection.
In this paper, new information-theoretic limits are presented to reveal the
intrinsic cost of seeking clusters, as well as the blessing of
dimensionality in multivariate learning. Moreover, an efficient optimization
algorithm is developed, which performs subspace learning and clustering with
guaranteed convergence. The obtained fixed-point estimators, though not
necessarily globally optimal, enjoy the desired statistical accuracy beyond the
standard likelihood setup under some regularity conditions. Moreover, a new
kind of information criterion, as well as its scale-free form, is proposed for
cluster and rank selection, and has a rigorous theoretical support without
assuming an infinite sample size. Extensive simulations and real-data
experiments demonstrate the statistical accuracy and interpretability of the
proposed method.
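The alternation between subspace learning and feature clustering described in the abstract can be illustrated with a toy sketch. Everything below (the function name, the SVD truncation, the plain k-means grouping step) is an illustrative assumption; the paper's CRL algorithm is built on joint matrix regularizations, not this simple alternation.

```python
import numpy as np

def clustered_reduced_rank_fit(X, Y, rank, n_groups, n_iter=20, seed=0):
    """Toy alternation of reduced-rank fitting and feature grouping.
    Illustrative stand-in only, not the paper's CRL procedure."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Start from the ordinary least-squares coefficient matrix.
    B = np.linalg.lstsq(X, Y, rcond=None)[0]
    labels = rng.integers(0, n_groups, size=p)
    for _ in range(n_iter):
        # (a) Subspace step: project B onto its top-`rank` singular subspace.
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        B = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]
        # (b) Grouping step: cluster the rows of B (one row per feature)
        # with a plain k-means update, then tie each row to its group mean.
        centers = np.array([B[labels == g].mean(axis=0) if np.any(labels == g)
                            else B[rng.integers(p)] for g in range(n_groups)])
        dists = ((B[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        B = centers[labels]
    return B, labels
```

The returned coefficient matrix has at most `n_groups` distinct rows, which is what makes the fitted factors interpretable as shared loadings over feature groups.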
Related papers
- Learning Controlled Stochastic Differential Equations [61.82896036131116]
This work proposes a novel method for estimating both drift and diffusion coefficients of continuous, multidimensional, nonlinear controlled differential equations with non-uniform diffusion.
We provide strong theoretical guarantees, including finite-sample bounds for $L^2$, $L^\infty$, and risk metrics, with learning rates adaptive to the coefficients' regularity.
Our method is available as an open-source Python library.
arXiv Detail & Related papers (2024-11-04T11:09:58Z) - Adaptive Transfer Clustering: A Unified Framework [2.3144964550307496]
We propose an adaptive transfer clustering (ATC) algorithm that automatically leverages the commonality in the presence of unknown discrepancy.
It applies to a broad class of statistical models including Gaussian mixture models, block models, and latent class models.
arXiv Detail & Related papers (2024-10-28T17:57:06Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method that incorporates feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship between real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
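The worst-case objective of multi-distribution learning can be sketched as a small minimax game: gradient descent on a shared model against multiplicative-weights updates over the $k$ distributions. The function name, the squared loss, and both update rules below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def multi_distribution_erm(datasets, n_steps=200, lr=0.1):
    """Toy minimax training for worst-case risk over k datasets:
    the model descends on the q-weighted loss while q up-weights the
    currently hardest distribution. Illustrative sketch only."""
    d = datasets[0][0].shape[1]
    k = len(datasets)
    w = np.zeros(d)              # shared linear model
    q = np.full(k, 1.0 / k)      # adversarial weights over distributions
    for _ in range(n_steps):
        losses = np.array([np.mean((X @ w - y) ** 2) for X, y in datasets])
        grads = [2 * X.T @ (X @ w - y) / len(y) for X, y in datasets]
        w -= lr * sum(qi * g for qi, g in zip(q, grads))
        q *= np.exp(lr * losses)  # multiplicative-weights ascent step
        q /= q.sum()
    return w, q
```

At convergence the weights `q` concentrate on whichever distributions attain the worst risk, which is the saddle-point structure behind the $(d+k)/\varepsilon^2$-type guarantees.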
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Real Elliptically Skewed Distributions and Their Application to Robust
Cluster Analysis [5.137336092866906]
This article proposes a new class of real elliptically skewed (RESK) distributions and associated clustering algorithms.
Non-symmetrically distributed and heavy-tailed data clusters have been reported in a variety of real-world applications.
arXiv Detail & Related papers (2020-06-30T10:44:39Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Statistically Guided Divide-and-Conquer for Sparse Factorization of
Large Matrix [2.345015036605934]
We formulate the statistical problem as a sparse factor regression and tackle it with a divide-and-conquer approach.
In the first-stage division, we consider latent parallel approaches for simplifying the task into a set of co-sparse unit-rank estimation (CURE) problems.
In the second-stage division, we innovate a stagewise learning technique, consisting of a sequence of simple incremental steps, to efficiently trace out the whole solution paths of CURE.
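A single unit-rank extraction of the kind the CURE decomposition relies on can be sketched as alternating sparse power iterations on a surrogate coefficient matrix. The function names, the soft-thresholding penalties, and the least-squares surrogate are all illustrative assumptions, not the paper's actual solver or stagewise path algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cosparse_unit_rank(X, Y, lam_u=0.05, lam_v=0.05, n_iter=50):
    """Toy CURE-style step: alternating sparse power iterations that pull
    one sparse left/right factor pair out of a surrogate coefficient
    matrix. Illustrative sketch only."""
    # Surrogate coefficient matrix from ordinary least squares.
    C = np.linalg.lstsq(X, Y, rcond=None)[0]
    u = C[:, 0].copy()
    v = np.zeros(C.shape[1])
    for _ in range(n_iter):
        v = soft_threshold(C.T @ u, lam_v)
        nv = np.linalg.norm(v)
        if nv == 0:
            break
        v /= nv
        u = soft_threshold(C @ v, lam_u)
    return u, v  # C is approximated by the unit-rank product u v^T
```

Summing several such unit-rank pieces, each with its own sparsity pattern, is what lets a divide-and-conquer scheme assemble a sparse factorization of a large coefficient matrix.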
arXiv Detail & Related papers (2020-03-17T19:12:21Z) - Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel framework of classifier learning with flexibility in the choice of model and optimization algorithm.
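Progressive identification of true labels can be sketched as follows: a classifier is trained against per-instance soft label weights, and the weights are in turn renormalized over each candidate set from the model's own predictions. The function name, the linear softmax model, and the renormalization rule below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def progressive_pll(X, candidate_masks, n_classes, n_epochs=100, lr=0.5):
    """Toy progressive identification for partial-label learning:
    train a softmax linear model on soft targets, then re-trust the
    model's probabilities within each candidate set. Sketch only."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    # Start with uniform weights over each candidate set.
    weights = candidate_masks / candidate_masks.sum(axis=1, keepdims=True)
    for _ in range(n_epochs):
        logits = X @ W
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        # Cross-entropy gradient against the current soft targets.
        W -= lr * X.T @ (probs - weights) / n
        # Progressive identification: keep mass only on candidate labels.
        weights = probs * candidate_masks
        weights /= weights.sum(axis=1, keepdims=True)
    return W, weights
```

Because each update stays inside the candidate sets, instances whose candidate set happens to be a singleton anchor the model, and their signal gradually disambiguates the rest.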
arXiv Detail & Related papers (2020-02-19T08:35:15Z) - Selective machine learning of doubly robust functionals [6.880360838661036]
We propose a selective machine learning framework for making inferences about a finite-dimensional functional defined on a semiparametric model.
We introduce a new selection criterion aimed at bias reduction in estimating the functional of interest based on a novel definition of pseudo-risk.
arXiv Detail & Related papers (2019-11-05T19:00:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.