Sharp-SSL: Selective high-dimensional axis-aligned random projections
for semi-supervised learning
- URL: http://arxiv.org/abs/2304.09154v1
- Date: Tue, 18 Apr 2023 17:49:02 GMT
- Title: Sharp-SSL: Selective high-dimensional axis-aligned random projections
for semi-supervised learning
- Authors: Tengyao Wang, Edgar Dobriban, Milana Gataric and Richard J. Samworth
- Abstract summary: We propose a new method for high-dimensional semi-supervised learning problems.
It is based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.
- Score: 16.673022545571566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new method for high-dimensional semi-supervised learning
problems based on the careful aggregation of the results of a low-dimensional
procedure applied to many axis-aligned random projections of the data. Our
primary goal is to identify important variables for distinguishing between the
classes; existing low-dimensional methods can then be applied for final class
assignment. Motivated by a generalized Rayleigh quotient, we score projections
according to the traces of the estimated whitened between-class covariance
matrices on the projected data. This enables us to assign an importance weight
to each variable for a given projection, and to select our signal variables by
aggregating these weights over high-scoring projections. Our theory shows that
the resulting Sharp-SSL algorithm is able to recover the signal coordinates
with high probability when we aggregate over sufficiently many random
projections and when the base procedure estimates the whitened between-class
covariance matrix sufficiently well. The Gaussian EM algorithm is a natural
choice as a base procedure, and we provide a new analysis of its performance in
semi-supervised settings that controls the parameter estimation error in terms
of the proportion of labeled data in the sample. Numerical results on both
simulated data and a real colon tumor dataset support the excellent empirical
performance of the method.
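The score-and-aggregate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Sharp-SSL implementation and rests on several assumptions: it uses fully labeled two-class data and plain sample estimates in place of the semi-supervised Gaussian EM base procedure, it takes the variable weights to be the normalized squared entries of the estimated discriminant direction, and it keeps only the best-scoring projection within each group of random projections. The function names and the parameters `d`, `n_groups`, `group_size`, `n_select` and `ridge` are hypothetical.

```python
import numpy as np


def score_projection(X, y, coords, ridge=1e-3):
    """Score one axis-aligned projection (a subset of coordinates).

    Assumption: labels are fully observed and take values 0/1, so simple
    sample estimates stand in for an EM-based base procedure.
    """
    Z = X[:, coords]
    Z0, Z1 = Z[y == 0], Z[y == 1]
    pi1 = len(Z1) / len(Z)
    diff = Z1.mean(axis=0) - Z0.mean(axis=0)
    # Pooled within-class covariance, with a small ridge for stability.
    resid = np.vstack([Z0 - Z0.mean(axis=0), Z1 - Z1.mean(axis=0)])
    sigma_w = resid.T @ resid / len(Z) + ridge * np.eye(len(coords))
    direction = np.linalg.solve(sigma_w, diff)
    # For two classes, the trace of the whitened between-class covariance
    # equals pi0 * pi1 * diff' Sigma_w^{-1} diff.
    score = pi1 * (1.0 - pi1) * float(diff @ direction)
    # Assumed weighting rule: normalized squared discriminant coefficients.
    weights = direction**2 / (np.sum(direction**2) + 1e-12)
    return score, weights


def select_variables(X, y, d=3, n_groups=50, group_size=20, n_select=3, seed=0):
    """Aggregate variable weights over the best projection in each group."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    total = np.zeros(p)
    for _ in range(n_groups):
        best_score, best_coords, best_weights = -np.inf, None, None
        for _ in range(group_size):
            coords = rng.choice(p, size=d, replace=False)
            score, weights = score_projection(X, y, coords)
            if score > best_score:
                best_score, best_coords, best_weights = score, coords, weights
        total[best_coords] += best_weights
    selected = np.sort(np.argsort(total)[-n_select:])
    return selected, total


# Toy data: 30 variables, of which only the first three separate the classes.
rng = np.random.default_rng(1)
n, p = 400, 30
y = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, p))
X[:, :3] += 2.0 * y[:, None]

selected, total = select_variables(X, y)
print("selected variables:", selected)
```

In this sketch, taking the best projection per group is what keeps noise-only projections from diluting the aggregated weights; other selection rules would fit the same skeleton.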
Related papers
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
- Optimal Discriminant Analysis in High-Dimensional Latent Factor Models [1.4213973379473654]
In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower dimensional space.
We formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure.
We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections.
arXiv Detail & Related papers (2022-10-23T21:45:53Z)
- Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data [16.153709556346417]
Clustering is a widely deployed learning tool.
iLA-SDP is less sensitive than EM to initialization and more stable on high-dimensional data.
arXiv Detail & Related papers (2022-09-29T21:03:13Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Multilevel orthogonal Bochner function subspaces with applications to robust machine learning [1.533771872970755]
We consider the data as instances of a random field within a relevant Bochner space.
Our key observation is that the classes can predominantly reside in two distinct subspaces.
arXiv Detail & Related papers (2021-10-04T22:01:01Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification.
A computational scheme known as expectation propagation (EP) is used to train a continuous-weights perceptron learning a classification rule.
EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.