Sharp-SSL: Selective high-dimensional axis-aligned random projections
for semi-supervised learning
- URL: http://arxiv.org/abs/2304.09154v1
- Date: Tue, 18 Apr 2023 17:49:02 GMT
- Title: Sharp-SSL: Selective high-dimensional axis-aligned random projections
for semi-supervised learning
- Authors: Tengyao Wang, Edgar Dobriban, Milana Gataric and Richard J. Samworth
- Abstract summary: We propose a new method for high-dimensional semi-supervised learning problems.
It is based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.
- Score: 16.673022545571566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new method for high-dimensional semi-supervised learning
problems based on the careful aggregation of the results of a low-dimensional
procedure applied to many axis-aligned random projections of the data. Our
primary goal is to identify important variables for distinguishing between the
classes; existing low-dimensional methods can then be applied for final class
assignment. Motivated by a generalized Rayleigh quotient, we score projections
according to the traces of the estimated whitened between-class covariance
matrices on the projected data. This enables us to assign an importance weight
to each variable for a given projection, and to select our signal variables by
aggregating these weights over high-scoring projections. Our theory shows that
the resulting Sharp-SSL algorithm is able to recover the signal coordinates
with high probability when we aggregate over sufficiently many random
projections and when the base procedure estimates the whitened between-class
covariance matrix sufficiently well. The Gaussian EM algorithm is a natural
choice as a base procedure, and we provide a new analysis of its performance in
semi-supervised settings that controls the parameter estimation error in terms
of the proportion of labeled data in the sample. Numerical results on both
simulated data and a real colon tumor dataset support the excellent empirical
performance of the method.
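For intuition, the following is a minimal NumPy sketch of the projection-scoring idea above. It is an illustration under stated assumptions, not the paper's implementation: the base procedure uses only the labelled observations (whereas the paper recommends a semi-supervised Gaussian EM algorithm that also exploits unlabelled data), the use of the diagonal of the scored matrix as per-variable weights is an illustrative guess, and all function names are hypothetical.

```python
import numpy as np

def whitened_between_class_cov(X, y):
    """Illustrative base procedure: estimate the whitened between-class
    covariance W^{-1/2} B W^{-1/2} from labelled data only. (The paper's
    recommended base procedure is a semi-supervised Gaussian EM algorithm.)
    Assumes at least two labelled points per class and positive-definite W."""
    p = X.shape[1]
    mu = X.mean(axis=0)
    W = np.zeros((p, p))  # pooled within-class covariance
    B = np.zeros((p, p))  # between-class covariance
    for c in np.unique(y):
        Xc = X[y == c]
        pi_c = len(Xc) / len(X)
        W += pi_c * np.cov(Xc, rowvar=False)
        B += pi_c * np.outer(Xc.mean(axis=0) - mu, Xc.mean(axis=0) - mu)
    evals, evecs = np.linalg.eigh(W)
    W_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return W_inv_sqrt @ B @ W_inv_sqrt

def sharp_ssl_variable_weights(X, y, d=5, n_proj=500, top_frac=0.1, seed=0):
    """Score many axis-aligned random projections by the trace of the
    estimated whitened between-class covariance, then aggregate
    per-variable weights over the highest-scoring projections."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    scores, coords, weights = [], [], []
    for _ in range(n_proj):
        S = rng.choice(p, size=d, replace=False)  # axis-aligned projection
        M = whitened_between_class_cov(X[:, S], y)
        scores.append(np.trace(M))                # generalized-Rayleigh-type score
        coords.append(S)
        weights.append(np.abs(np.diag(M)))        # illustrative per-variable weight
    agg = np.zeros(p)
    for i in np.argsort(scores)[-max(1, int(top_frac * n_proj)):]:
        agg[coords[i]] += weights[i]              # aggregate over high scorers
    return agg  # large entries flag candidate signal variables
```

Selecting the coordinates with the largest aggregated weights, then running a low-dimensional classifier on those coordinates alone, mirrors the two-stage workflow described in the abstract.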
Related papers
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries from trained neural networks that approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- Shuffled Linear Regression via Spectral Matching [6.24954299842136]
Shuffled linear regression seeks to recover a linear model when the correspondence between responses and covariates is unknown.
This problem extends traditional least-squares (LS) and Least Absolute Shrinkage and Selection Operator (LASSO) approaches.
We propose a spectral matching method that efficiently resolves the unknown permutation.
arXiv Detail & Related papers (2024-09-30T16:26:40Z)
- Optimal Projections for Discriminative Dictionary Learning using the JL-lemma [0.5461938536945723]
Dimensionality reduction-based dictionary learning methods have often used iterative random projections.
This paper proposes a constructive approach to derandomizing the projection matrix using the Johnson-Lindenstrauss lemma (its standard statement is given for reference after this list).
arXiv Detail & Related papers (2023-08-27T02:59:59Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature space of each layer.
Using an attentive set encoder, we propose to meta-learn either diagonal or diagonal-plus-low-rank factors to efficiently construct task-specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Multilevel orthogonal Bochner function subspaces with applications to robust machine learning [1.533771872970755]
We consider the data as instances of a random field within a relevant Bochner space.
Our key observation is that the classes can predominantly reside in two distinct subspaces.
arXiv Detail & Related papers (2021-10-04T22:01:01Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization (a typical form of this objective is sketched after this list).
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD in which we assign an individual importance weight to each sample in the mini-batch (see the sketch after this list).
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical-mechanics-inspired strategy that addresses the problem of sparse feature selection in the context of binary classification.
A computational scheme known as expectation propagation (EP) is used to train a continuous-weight perceptron to learn a classification rule.
EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data (a generic sketch of such an ensemble appears after this list).
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
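For reference, the dictionary-learning entry above invokes the Johnson-Lindenstrauss lemma; its standard statement (a well-known fact, not specific to that paper) is:

```latex
% Johnson-Lindenstrauss lemma (standard statement)
\text{For any } 0 < \varepsilon < 1 \text{ and any points } x_1, \dots, x_n \in \mathbb{R}^p,
\text{ there exists a linear map } f : \mathbb{R}^p \to \mathbb{R}^k
\text{ with } k = O(\varepsilon^{-2} \log n) \text{ such that, for all } i, j,
\quad (1 - \varepsilon)\,\lVert x_i - x_j \rVert^2
  \;\le\; \lVert f(x_i) - f(x_j) \rVert^2
  \;\le\; (1 + \varepsilon)\,\lVert x_i - x_j \rVert^2 .
```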
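The sparse PCA entry above combines reconstruction error with $l_{2,p}$-norm regularization. A typical form of such an objective, written here as an assumption since the paper's exact formulation may differ, is:

```latex
\min_{W \in \mathbb{R}^{p \times k}}
  \lVert X - X W W^{\top} \rVert_F^2 \;+\; \lambda \lVert W \rVert_{2,p}^{p},
\qquad
\lVert W \rVert_{2,p} = \Bigl( \sum_{i=1}^{p} \lVert w_i \rVert_2^{\,p} \Bigr)^{1/p},
```

where $w_i$ denotes the $i$-th row of $W$; for $0 < p \le 1$ the penalty drives entire rows of $W$ to zero, deselecting the corresponding features.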
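The ABSGD entry above describes a per-sample re-weighting of momentum SGD. A minimal PyTorch sketch of that idea follows; the softmax-of-loss weighting and the temperature `lam` are assumptions for illustration rather than the paper's exact rule, and `loss_fn` is assumed to return per-sample losses (e.g. built with `reduction='none'`).

```python
import torch

def absgd_step(model, loss_fn, optimizer, inputs, targets, lam=1.0):
    """One attentional-biased SGD step: re-weight each sample in the
    mini-batch before the gradient step (sketch; weighting rule assumed)."""
    per_sample = loss_fn(model(inputs), targets)    # shape: (batch_size,)
    with torch.no_grad():                           # weights treated as constants
        w = torch.softmax(per_sample / lam, dim=0)  # harder samples weigh more
    optimizer.zero_grad()
    (w * per_sample).sum().backward()               # weighted surrogate loss
    optimizer.step()
```

Pairing this step with `torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)` recovers plain momentum SGD whenever the weights are uniform.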
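Lastly, the entry on randomly projected linear discriminants describes classification by an ensemble of discriminants fitted on random projections. The sketch below follows common practice for such ensembles (Gaussian projections, majority vote) and is not taken from [1].

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def rp_lda_ensemble_predict(X_train, y_train, X_test, d=10, n_models=100, seed=0):
    """Majority vote over LDA classifiers, each fitted on an independent
    Gaussian random projection of the data (generic sketch)."""
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)))
    for _ in range(n_models):
        R = rng.standard_normal((p, d)) / np.sqrt(d)  # random projection matrix
        clf = LinearDiscriminantAnalysis().fit(X_train @ R, y_train)
        pred = clf.predict(X_test @ R)
        for j, c in enumerate(classes):
            votes[:, j] += (pred == c)
    return classes[votes.argmax(axis=1)]              # majority vote
```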
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.