Stable Sparse Subspace Embedding for Dimensionality Reduction
- URL: http://arxiv.org/abs/2002.02844v1
- Date: Fri, 7 Feb 2020 15:30:22 GMT
- Title: Stable Sparse Subspace Embedding for Dimensionality Reduction
- Authors: Li Chen, Shuizheng Zhou, Jiajun Ma
- Abstract summary: This paper builds a stable sparse subspace embedding matrix based on random sampling without replacement, a technique from statistics.
It is proved that the S-SSE is more stable than existing matrices and that it preserves Euclidean distances between points well after dimensionality reduction.
- Score: 9.033485533901658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse random projection (RP) is a popular tool for dimensionality reduction
that offers promising performance at low computational cost. In existing sparse
RP matrices, however, the positions of the non-zero entries are chosen at random,
typically by uniform sampling with replacement. Owing to the large sampling
variance, the number of non-zeros is then uneven across the rows of a projection
matrix generated in a single trial, and more data information may be lost after
dimensionality reduction. To break this bottleneck, this paper builds a stable
sparse subspace embedding matrix (S-SSE) based on random sampling without
replacement from statistics, in which the non-zeros are uniformly distributed.
It is proved that the S-SSE is more stable than existing matrices and that it
preserves Euclidean distances between points well after dimensionality
reduction. Our empirical studies corroborate the theoretical findings and
demonstrate that the approach indeed achieves satisfactory performance.
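To make the construction concrete, here is a minimal NumPy sketch (not the paper's code; the function name is mine). It assumes an S-SSE-style matrix that places exactly $s$ random signs per row at positions drawn without replacement; the per-row placement and the scaling $\sqrt{n/(ks)}$ are my assumptions, and the paper's exact scheme may differ.

```python
import numpy as np

def s_sse_matrix(k, n, s, seed=None):
    """Sketch of a stable sparse subspace embedding (S-SSE) style matrix.

    Each of the k rows receives exactly s non-zero entries whose column
    positions are drawn *without* replacement, so the non-zero count is
    identical across rows (the stability property argued for above).
    Entries are random signs scaled so that E[||R x||^2] = ||x||^2.
    """
    rng = np.random.default_rng(seed)
    R = np.zeros((k, n))
    scale = np.sqrt(n / (k * s))  # keeps squared norms unbiased in expectation
    for i in range(k):
        cols = rng.choice(n, size=s, replace=False)  # sampling w/o replacement
        R[i, cols] = rng.choice([-scale, scale], size=s)
    return R

# Quick check that pairwise Euclidean distances are roughly preserved.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 1000))            # 5 points in R^1000
R = s_sse_matrix(k=200, n=1000, s=10, seed=1)
Y = X @ R.T                                   # 5 points in R^200
print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(Y[0] - Y[1]))
```

Drawing positions without replacement removes the row-to-row variance in non-zero counts that sampling with replacement introduces, which is exactly the instability the abstract describes.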
Related papers
- Entry-Specific Bounds for Low-Rank Matrix Completion under Highly
Non-Uniform Sampling [10.824999179337558]
We show that it is often better and sometimes optimal to run estimation algorithms on a smaller submatrix rather than the entire matrix.
Our bounds characterize the hardness of estimating each entry as a function of the localized sampling probabilities.
arXiv Detail & Related papers (2024-02-29T23:24:43Z)
- Optimal Projections for Discriminative Dictionary Learning using the JL-lemma [0.5461938536945723]
Dimensionality reduction-based dictionary learning methods have often used iterative random projections.
This paper proposes a constructive approach to derandomize the projection matrix using the Johnson-Lindenstrauss lemma.
arXiv Detail & Related papers (2023-08-27T02:59:59Z)
- Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares [0.8702432681310401]
We propose a new algorithm for recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations.
We show that the IRLS method is favorable in identifying such simultaneously structured data from few measurements.
arXiv Detail & Related papers (2023-06-08T06:35:47Z)
- On the Size and Approximation Error of Distilled Sets [57.61696480305911]
We take a theoretical view on kernel ridge regression (KRR) based methods of dataset distillation, such as Kernel Inducing Points.
We prove that a small set of instances exists in the original input space such that its solution in the random Fourier features (RFF) space coincides with the solution of the original data.
A KRR solution generated from this distilled set of instances approximates the KRR solution optimized on the full input data.
arXiv Detail & Related papers (2023-05-23T14:37:43Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Tensor Random Projection for Low Memory Dimension Reduction [22.715952036307648]
Random projections reduce the dimension of a set of vectors while preserving structural information.
This paper proposes a novel use of row-product random matrices in random projection.
It requires substantially less memory than existing dimension reduction maps (a row-product sketch follows this list).
arXiv Detail & Related papers (2021-04-30T22:08:04Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization (a generic form of such an objective is sketched after this list).
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation [72.40827239394565]
We propose to compute features only at sparsely sampled locations.
We then densely reconstruct the feature map with an efficient procedure.
The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
arXiv Detail & Related papers (2020-03-19T15:36:31Z)
- Improved Subsampled Randomized Hadamard Transform for Linear SVM [18.52747917850984]
We propose importance sampling and deterministic top-$r$ sampling, instead of the uniform sampling used in SRHT, to produce effective low-dimensional embeddings (a top-$r$ sketch follows this list).
Our experimental results demonstrate that the proposed methods achieve higher classification accuracies than SRHT and other random projection methods on six real-life datasets.
arXiv Detail & Related papers (2020-02-05T04:09:23Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
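For the tensor random projection entry above, the following is a hedged sketch of a row-product projection, assuming each row of the implicit $k \times n$ matrix is the Kronecker product of two short Gaussian vectors, so only $k(n_1 + n_2)$ numbers are stored instead of $k \cdot n_1 n_2$; the paper's exact construction and scaling may differ, and `trp_project` is a hypothetical helper name.

```python
import numpy as np

def trp_project(X, k, n1, n2, seed=None):
    """Hypothetical row-product random projection for x in R^(n1*n2).

    Row i of the implicit k x (n1*n2) matrix is kron(a_i, b_i)/sqrt(k)
    with a_i ~ N(0, I_n1) and b_i ~ N(0, I_n2), so only the factors
    A (k x n1) and B (k x n2) are ever stored.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    assert n == n1 * n2, "input dimension must factor as n1 * n2"
    A = rng.standard_normal((k, n1))
    B = rng.standard_normal((k, n2))
    Xt = X.reshape(m, n1, n2)
    # y[j, i] = a_i^T X_j b_i  ==  kron(a_i, b_i) . x_j, without forming kron
    return np.einsum('ki,mij,kj->mk', A, Xt, B) / np.sqrt(k)

Y = trp_project(np.random.default_rng(0).standard_normal((4, 64 * 64)),
                k=256, n1=64, n2=64)
print(Y.shape)  # (4, 256)
```

Because $E[(a^\top X b)^2] = \|X\|_F^2$ for independent Gaussian $a$, $b$, squared norms are preserved in expectation, though with higher variance than a dense Gaussian projection.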
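For the sparse PCA entry, the following is a generic objective of the reconstruction-error-plus-$l_{2,p}$ family; the exact formulation in the paper may differ, and $W$ and $\lambda$ are notation introduced here for illustration:

$$\min_{W}\ \|X - X W W^{\top}\|_F^2 + \lambda \|W\|_{2,p}^{p},\qquad \|W\|_{2,p} = \Big(\sum_{i=1}^{d} \|w^{i}\|_2^{p}\Big)^{1/p},\ 0 < p \le 1,$$

where $w^{i}$ denotes the $i$-th row of $W$. Because the $l_{2,p}$ term pushes whole rows of $W$ to zero, the rows that survive index the selected features.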
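For the improved-SRHT entry, here is a minimal sketch of the deterministic top-$r$ variant, assuming it keeps the $r$ transformed coordinates with the largest energy instead of sampling coordinates uniformly; the $\sqrt{n/r}$ rescaling is the standard SRHT convention, and the exact scoring rule in the paper may differ.

```python
import numpy as np
from scipy.linalg import hadamard

def srht_top_r(X, r, seed=None):
    """Hedged sketch of SRHT with deterministic top-r coordinate sampling.

    Classic SRHT maps x -> sqrt(n/r) * P H D x with D a random sign
    diagonal, H the orthonormal Walsh-Hadamard matrix, and P a uniform
    row sampler; here P instead keeps the r transformed coordinates
    carrying the most energy across the dataset.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape                       # n must be a power of two
    D = rng.choice([-1.0, 1.0], size=n)  # random sign flips
    H = hadamard(n) / np.sqrt(n)         # orthonormal Hadamard matrix
    Z = (X * D) @ H                      # rows become (H D x)^T (H symmetric)
    idx = np.argsort(-np.linalg.norm(Z, axis=0))[:r]  # top-r energy columns
    return np.sqrt(n / r) * Z[:, idx]

X = np.random.default_rng(0).standard_normal((10, 512))
print(srht_top_r(X, r=64).shape)  # (10, 64)
```

Uniform SRHT sampling would replace the `argsort` line with `rng.choice(n, size=r, replace=False)`.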
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.