Related papers: Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning

URL: http://arxiv.org/abs/2304.09154v1
Date: Tue, 18 Apr 2023 17:49:02 GMT
Title: Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning
Authors: Tengyao Wang, Edgar Dobriban, Milana Gataric and Richard J. Samworth
Abstract summary: We propose a new method for high-dimensional semi-supervised learning problems. It is based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.
Score: 16.673022545571566
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data. Our primary goal is to identify important variables for distinguishing between the classes; existing low-dimensional methods can then be applied for final class assignment. Motivated by a generalized Rayleigh quotient, we score projections according to the traces of the estimated whitened between-class covariance matrices on the projected data. This enables us to assign an importance weight to each variable for a given projection, and to select our signal variables by aggregating these weights over high-scoring projections. Our theory shows that the resulting Sharp-SSL algorithm is able to recover the signal coordinates with high probability when we aggregate over sufficiently many random projections and when the base procedure estimates the whitened between-class covariance matrix sufficiently well. The Gaussian EM algorithm is a natural choice as a base procedure, and we provide a new analysis of its performance in semi-supervised settings that controls the parameter estimation error in terms of the proportion of labeled data in the sample. Numerical results on both simulated data and a real colon tumor dataset support the excellent empirical performance of the method.
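The pipeline described in the abstract (draw axis-aligned projections, score each by the trace of the whitened between-class covariance, weight variables, aggregate over high-scoring projections) can be illustrated with a small sketch. This is not the authors' implementation. It assumes two classes with all labels observed, so plain sample estimates replace the Gaussian EM base procedure; it uses projections onto 2 coordinates; and the per-variable weight (normalized squared entries of the discriminant direction) is a hypothetical choice made for illustration. All function names and parameters are invented here.

```python
import random

def make_toy_data(n=200, p=10, signal=(0, 1), shift=2.0, seed=1):
    """Two Gaussian classes that differ only in the `signal` coordinates."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in signal:
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def class_stats(X, y, cols):
    """Class means and pooled within-class covariance on the chosen columns."""
    d = len(cols)
    groups = {0: [], 1: []}
    for row, label in zip(X, y):
        groups[label].append([row[j] for j in cols])
    means = {k: [sum(r[i] for r in g) / len(g) for i in range(d)]
             for k, g in groups.items()}
    cov = [[0.0] * d for _ in range(d)]
    for k, g in groups.items():
        for r in g:
            c = [r[i] - means[k][i] for i in range(d)]
            for a in range(d):
                for b in range(d):
                    cov[a][b] += c[a] * c[b]
    cov = [[v / (len(X) - 2) for v in row] for row in cov]
    return means, cov

def solve_2x2(A, b):
    """Solve A w = b for a 2x2 matrix A by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def select_variables(X, y, n_proj=500, keep_frac=0.1, n_select=2, seed=0):
    """Aggregate variable weights over the highest-scoring 2-d projections."""
    rng = random.Random(seed)
    p = len(X[0])
    pi1 = sum(y) / len(y)
    results = []
    for _ in range(n_proj):
        cols = rng.sample(range(p), 2)          # axis-aligned projection
        means, cov = class_stats(X, y, cols)
        delta = [means[1][i] - means[0][i] for i in range(2)]
        w = solve_2x2(cov, delta)               # cov^{-1} delta
        # With two classes the between-class covariance is pi0*pi1*delta*delta^T,
        # so the trace of its whitened version is pi0*pi1*delta^T cov^{-1} delta.
        score = pi1 * (1 - pi1) * (delta[0] * w[0] + delta[1] * w[1])
        total = w[0] ** 2 + w[1] ** 2
        weights = [wi ** 2 / total for wi in w]  # hypothetical weighting rule
        results.append((score, cols, weights))
    results.sort(key=lambda t: t[0], reverse=True)
    top = results[:max(1, int(keep_frac * n_proj))]
    agg = [0.0] * p
    for _, cols, weights in top:
        for j, wt in zip(cols, weights):
            agg[j] += wt
    return sorted(range(p), key=lambda j: agg[j], reverse=True)[:n_select]

X, y = make_toy_data()
selected = select_variables(X, y)
print(sorted(selected))
```

On this toy data the two shifted coordinates dominate the aggregated weights, because the highest-scoring projections are the ones that contain them. In the semi-supervised setting of the paper, the sample means and covariance used here would instead come from a base procedure such as Gaussian EM.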

Related papers

Recovering Imbalanced Clusters via Gradient-Based Projection Pursuit [7.141484637056533]
We propose a method for recovering projections containing either Imbalanced Clusters or a Bernoulli-Rademacher distribution. We analyze our algorithm's sample complexity within a Planted Vector setting where we can observe that Imbalanced Clusters can be recovered more easily than balanced ones. We experimentally evaluate our method's applicability to real-world data using FashionMNIST and the Human Activity Recognition dataset.
arXiv Detail & Related papers (2025-02-04T19:18:17Z)
Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance. We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks. We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
Shuffled Linear Regression via Spectral Matching [6.24954299842136]
Shuffled linear regression seeks to estimate latent features through a linear transformation. This problem extends traditional least-squares (LS) and Least Absolute Shrinkage and Selection Operator (LASSO) approaches. We propose a spectral matching method that efficiently resolves permutations.
arXiv Detail & Related papers (2024-09-30T16:26:40Z)
Optimal Projections for Discriminative Dictionary Learning using the JL-lemma [0.5461938536945723]
Dimensionality reduction-based dictionary learning methods have often used iterative random projections. This paper proposes a constructive approach to derandomize the projection matrix using the Johnson-Lindenstrauss lemma.
arXiv Detail & Related papers (2023-08-27T02:59:59Z)
Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture. It can model the feature space more comprehensively and reduce the dominance of head classes. The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer. With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices. We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
Multilevel orthogonal Bochner function subspaces with applications to robust machine learning [1.533771872970755]
We consider the data as instances of a random field within a relevant Bochner space. Our key observation is that the classes can predominantly reside in two distinct subspaces.
arXiv Detail & Related papers (2021-10-04T22:01:01Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning. Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch. ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification. A computational scheme known as expectation propagation (EP) is used to train a continuous-weights perceptron learning a classification rule. EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z)
Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets. We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator. We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
