Joint Dimensionality Reduction for Separable Embedding Estimation
- URL: http://arxiv.org/abs/2101.05500v1
- Date: Thu, 14 Jan 2021 08:48:37 GMT
- Title: Joint Dimensionality Reduction for Separable Embedding Estimation
- Authors: Yanjun Li, Bihan Wen, Hao Cheng and Yoram Bresler
- Abstract summary: Low-dimensional embeddings for data from disparate sources play critical roles in machine learning, multimedia information retrieval, and bioinformatics.
We propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.
- Score: 43.22422640265388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-dimensional embeddings for data from disparate sources play critical
roles in multi-modal machine learning, multimedia information retrieval, and
bioinformatics. In this paper, we propose a supervised dimensionality reduction
method that learns linear embeddings jointly for two feature vectors
representing data of different modalities or data from distinct types of
entities. We also propose an efficient feature selection method that
complements, and can be applied prior to, our joint dimensionality reduction
method. Assuming that there exist true linear embeddings for these features,
our analysis of the error in the learned linear embeddings provides theoretical
guarantees that the dimensionality reduction method accurately estimates the
true embeddings when certain technical conditions are satisfied and the number
of samples is sufficiently large. The derived sample complexity results are
echoed by numerical experiments. We apply the proposed dimensionality reduction
method to gene-disease association, and predict unknown associations using
kernel regression on the dimension-reduced feature vectors. Our approach
compares favorably against other dimensionality reduction methods, and against
a state-of-the-art method of bilinear regression for predicting gene-disease
associations.
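As a rough illustration of the pipeline described in the abstract, the sketch below fits a bilinear model y ~ x^T A^T B z by ridge-regularized least squares, recovers the two linear embeddings from a truncated SVD, and predicts with Nadaraya-Watson kernel regression on the dimension-reduced features. This is a simplified stand-in under those modeling assumptions, not the paper's estimator or its feature selection step; all names here (fit_joint_embeddings, kernel_regression, the embedding dimension d, the bandwidth) are illustrative.

```python
import numpy as np

def fit_joint_embeddings(X, Z, y, d, lam=1e-3):
    """Estimate linear embeddings A (d x p) and B (d x q) from labelled pairs.

    Assumes the simplified bilinear model y_i ~ x_i^T M z_i with M = A^T B
    of rank d. vec(M) is fit by ridge-regularized least squares, then M is
    factored by a truncated SVD, so A and B are identified only up to an
    invertible linear transform.
    """
    n, p = X.shape
    q = Z.shape[1]
    # Each sample contributes one row vec(x_i z_i^T) to the design matrix.
    Phi = np.einsum('ip,iq->ipq', X, Z).reshape(n, p * q)
    m = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p * q), Phi.T @ y)
    U, s, Vt = np.linalg.svd(m.reshape(p, q), full_matrices=False)
    A = np.sqrt(s[:d])[:, None] * U[:, :d].T   # d x p
    B = np.sqrt(s[:d])[:, None] * Vt[:d, :]    # d x q
    return A, B

def kernel_regression(U_train, y_train, U_test, bandwidth=1.0):
    """Nadaraya-Watson regression with a Gaussian kernel on reduced features."""
    d2 = ((U_test[:, None, :] - U_train[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (W @ y_train) / W.sum(axis=1)

# Toy usage: synthetic feature pairs (e.g. gene/disease) with planted embeddings.
rng = np.random.default_rng(0)
n, p, q, d = 2000, 30, 20, 3
X, Z = rng.normal(size=(n, p)), rng.normal(size=(n, q))
A0, B0 = rng.normal(size=(d, p)), rng.normal(size=(d, q))
y = np.sum((X @ A0.T) * (Z @ B0.T), axis=1) + 0.1 * rng.normal(size=n)
A, B = fit_joint_embeddings(X, Z, y, d)
U = np.hstack([X @ A.T, Z @ B.T])          # dimension-reduced feature vectors
U = (U - U.mean(axis=0)) / U.std(axis=0)   # standardize so bandwidth=1.0 is sensible
print(kernel_regression(U[:1500], y[:1500], U[1500:])[:5])
```

Factoring a single least-squares estimate of M keeps the fit convex, at the cost of the identifiability caveat noted in the docstring: any pair (T A, T^{-T} B) with invertible T yields the same M, so the embeddings can only be recovered up to such a transform, which is also the natural ambiguity in the paper's separable-embedding setting.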
Related papers
- Dimension reduction via score ratio matching [0.9012198585960441]
We propose a framework, derived from score-matching, to extend gradient-based dimension reduction to problems where gradients are unavailable.
We show that our approach outperforms standard score-matching for problems with low-dimensional structure.
arXiv Detail & Related papers (2024-10-25T22:21:03Z)
- NeurAM: nonlinear dimensionality reduction for uncertainty quantification through neural active manifolds [0.6990493129893112]
We leverage autoencoders to discover a one-dimensional neural active manifold (NeurAM) capturing the model output variability.
We show how NeurAM can be used to obtain multifidelity sampling estimators with reduced variance.
arXiv Detail & Related papers (2024-08-07T04:27:58Z)
- Nonlinear Feature Aggregation: Two Algorithms driven by Theory [45.3190496371625]
Real-world machine learning applications are characterized by a huge number of features, leading to computational and memory issues.
We propose a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function.
We also test the algorithms on synthetic and real-world datasets for regression and classification tasks, showing competitive performance.
arXiv Detail & Related papers (2023-06-19T19:57:33Z)
- Coordinated Double Machine Learning [8.808993671472349]
This paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce the estimation bias.
The improved empirical performance of the proposed method is demonstrated through numerical experiments on both simulated and real data.
arXiv Detail & Related papers (2022-06-02T05:56:21Z)
- Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently via a tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build a neighborhood similarity graph of the data matrix and define contexts of data points on it.
Experiments on data classification and clustering over eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis testing.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method built on a novel, incremental tangent space estimator that incorporates global structure into the coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)
- Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It avoids the drawback of existing methods, which destroy the spatial structure of the data by converting 2D data to vectors in a preprocessing step (a minimal sketch of the closely related classical semi-NMF follows this list).
arXiv Detail & Related papers (2020-05-19T05:54:14Z)
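As a companion to the TS-NMF entry above, here is a minimal sketch of classical semi-nonnegative matrix factorization on vectorized data, using the multiplicative updates of Ding, Li and Jordan (2010). TS-NMF itself factorizes 2D data directly to preserve spatial structure; that algorithm is not reproduced here, and the toy data and parameter names below are assumptions.

```python
import numpy as np

def pos(A): return (np.abs(A) + A) / 2.0
def neg(A): return (np.abs(A) - A) / 2.0

def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Semi-NMF: X (p x n) ~ F @ G.T with F unconstrained and G >= 0.

    Multiplicative updates following Ding, Li & Jordan (2010); this is the
    classical vector-data algorithm, not the 2D TS-NMF variant.
    """
    rng = np.random.default_rng(seed)
    p, n = X.shape
    G = np.abs(rng.normal(size=(n, k)))  # nonnegative initialization
    for _ in range(n_iter):
        # F update: unconstrained least squares given G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        # G update: multiplicative rule that preserves nonnegativity.
        XtF = X.T @ F
        FtF = F.T @ F
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF) + eps
        G *= np.sqrt(num / den)
    return F, G

# Toy usage: 200 mixed-sign samples from 4 planted clusters, read off via argmax over G.
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 200)) + np.repeat(rng.normal(scale=3, size=(10, 4)), 50, axis=1)
F, G = semi_nmf(X, k=4)
print(np.bincount(G.argmax(axis=1), minlength=4))
```

Because F is unconstrained, semi-NMF handles mixed-sign data that plain NMF cannot, while the nonnegative G retains its interpretation as soft cluster memberships, which is what makes the factorization useful for clustering.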
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.