Effects of Parametric and Non-Parametric Methods on High Dimensional Sparse Matrix Representations
- URL: http://arxiv.org/abs/2202.02894v1
- Date: Mon, 7 Feb 2022 00:16:42 GMT
- Title: Effects of Parametric and Non-Parametric Methods on High Dimensional Sparse Matrix Representations
- Authors: Sayali Tambe, Raunak Joshi, Abhishek Gupta, Nandan Kanvinde, Vidya Chitre
- Abstract summary: Semantics derived from textual data provide representations for Machine Learning algorithms.
Since learning methods are broadly classified as parametric and non-parametric, this paper examines the effects of both types of algorithms on high dimensional sparse matrix representations.
- Score: 2.719418335747252
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantics derived from textual data provide representations for Machine Learning algorithms. These representations are an interpretable form of high dimensional sparse matrices that are given as input to the machine learning algorithms. Since learning methods are broadly classified as parametric and non-parametric, this paper examines the effects of both types of algorithms on high dimensional sparse matrix representations. To derive the representations from the text data, we use TF-IDF, a choice the paper motivates in detail. We form representations of 50, 100, 500, 1000 and 5000 dimensions, over which we perform classification using Linear Discriminant Analysis and Naive Bayes as parametric learning methods, and Decision Tree and Support Vector Machines as non-parametric learning methods. We then report metrics for each dimensionality of the representation and detail the effect of each algorithm.
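As a reading aid (not the authors' released code), here is a minimal sketch of the experimental setup described above, assuming scikit-learn; the 20 Newsgroups corpus and the train/test split are stand-ins, since the abstract does not name the dataset used.

```python
# Hedged sketch: TF-IDF representations of five widths, classified with two
# parametric and two non-parametric learners. The corpus is a stand-in.
from sklearn.datasets import fetch_20newsgroups
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

corpus = fetch_20newsgroups(subset="train")
docs_tr, docs_te, y_tr, y_te = train_test_split(
    corpus.data, corpus.target, test_size=0.2, random_state=0)

models = {
    "LDA (parametric)": LinearDiscriminantAnalysis(),
    "Naive Bayes (parametric)": MultinomialNB(),
    "Decision Tree (non-parametric)": DecisionTreeClassifier(random_state=0),
    "SVM (non-parametric)": LinearSVC(),
}

for dim in (50, 100, 500, 1000, 5000):           # the five dimensionalities studied
    vec = TfidfVectorizer(max_features=dim)      # cap vocabulary -> dim-wide matrix
    X_tr, X_te = vec.fit_transform(docs_tr), vec.transform(docs_te)
    for name, clf in models.items():
        dense = name.startswith("LDA")           # LDA rejects sparse input
        clf.fit(X_tr.toarray() if dense else X_tr, y_tr)
        pred = clf.predict(X_te.toarray() if dense else X_te)
        print(f"{dim:5d}  {name:32s}  acc={accuracy_score(y_te, pred):.3f}")
```

Capping max_features is one simple way to obtain the five fixed widths; the paper may select terms differently.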
Related papers
- Manifold learning: what, how, and why [2.681437069928767]
Manifold learning (ML) is a set of methods to find the low dimensional structure of data.
The new representations and descriptors obtained by ML reveal the geometric shape of high dimensional point clouds.
This survey presents the principles underlying ML, the representative methods, as well as their statistical foundations from a practicing statistician's perspective.
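For illustration only (generic, not a method introduced by the survey), a minimal manifold-learning run: scikit-learn's Isomap recovering the 2-D structure of a synthetic 3-D Swiss-roll point cloud.

```python
# Generic illustration: unroll a 3-D Swiss roll into its 2-D parametrization.
import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, color = make_swiss_roll(n_samples=1500, random_state=0)   # 3-D point cloud
emb = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

plt.scatter(emb[:, 0], emb[:, 1], c=color, s=5)
plt.title("Isomap embedding of a Swiss roll")
plt.show()
```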
arXiv Detail & Related papers (2023-11-07T06:44:20Z)
- Exploring ordered patterns in the adjacency matrix for improving machine learning on complex networks [0.0]
The proposed methodology employs a sorting algorithm to rearrange the elements of the adjacency matrix of a complex graph in a specific order.
The resulting sorted adjacency matrix is then used as input for feature extraction and machine learning algorithms to classify the networks.
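A hedged sketch of that pipeline, assuming (the summary does not specify the criterion) that nodes are sorted by descending degree before the matrix is flattened into a feature vector.

```python
# Hypothetical sketch: reorder the adjacency matrix, then flatten it into a
# fixed-length feature vector for a downstream classifier. The paper's actual
# sorting criterion may differ from the degree ordering used here.
import networkx as nx
import numpy as np

def sorted_adjacency_features(G: nx.Graph) -> np.ndarray:
    A = nx.to_numpy_array(G)
    order = np.argsort(-A.sum(axis=1))        # descending degree
    A = A[np.ix_(order, order)]               # permute rows and columns together
    return A.flatten()

feats = sorted_adjacency_features(nx.erdos_renyi_graph(20, 0.2, seed=0))
print(feats.shape)                            # (400,)
```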
arXiv Detail & Related papers (2023-01-20T00:01:23Z)
- Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections [0.0]
Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP).
These algorithms can be used to understand several ranges of mathematical functions, including their impact on datasets.
We illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework.
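A minimal t-SNE illustration with scikit-learn (LSP has no common library implementation, so only t-SNE is shown; the digits dataset and the perplexity sweep are stand-ins for the paper's experiments).

```python
# Generic illustration: sweep one t-SNE parameter (perplexity) and compare
# the resulting 2-D projections visually, as the paper's workflow suggests.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)           # 64-D points, 10 classes
for perplexity in (5, 30, 50):
    emb = TSNE(n_components=2, perplexity=perplexity,
               random_state=0).fit_transform(X)
    plt.scatter(emb[:, 0], emb[:, 1], c=y, s=4)
    plt.title(f"t-SNE, perplexity={perplexity}")
    plt.show()
```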
arXiv Detail & Related papers (2022-07-12T20:30:33Z)
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
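A minimal sketch of the core mapping step, using the standard exponential map at the origin of the Poincaré ball; the paper's exact formulation and curvature handling are not given in the summary.

```python
# Hedged sketch: exponential map at the origin of the Poincare ball with
# curvature -c, a standard way to send Euclidean encoder outputs into
# hyperbolic space before computing hyperbolic distances.
import numpy as np

def expmap0(v: np.ndarray, c: float = 1.0, eps: float = 1e-7) -> np.ndarray:
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

emb = np.random.default_rng(0).standard_normal((8, 128))  # stand-in ViT outputs
hyp = expmap0(emb)
print(np.linalg.norm(hyp, axis=-1).max())     # < 1: inside the unit ball
```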
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
- Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
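For reference, the classical (PSD) Nyström approximation that the paper generalizes; the indefinite extension itself is not reproduced here.

```python
# Classical Nystrom sketch: sample s landmark columns C and the intersection
# block W, then approximate K ~= C @ pinv(W) @ C.T using only those columns.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
sq_dists = np.square(X[:, None] - X[None, :]).sum(-1)
K = np.exp(-0.1 * sq_dists)                   # RBF similarity matrix (PSD)

s = 50
idx = rng.choice(K.shape[0], size=s, replace=False)
C = K[:, idx]                                 # n x s landmark columns
W = C[idx, :]                                 # s x s intersection block
K_hat = C @ np.linalg.pinv(W) @ C.T

print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))   # relative error
```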
arXiv Detail & Related papers (2021-12-17T17:04:34Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Learning Linearized Assignment Flows for Image Labeling [70.540936204654]
We introduce a novel algorithm for estimating optimal parameters of linearized assignment flows for image labeling.
We show how to efficiently evaluate this formula using a Krylov subspace and a low-rank approximation.
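A generic sketch of such a Krylov evaluation: approximating exp(A) @ v from an Arnoldi basis, so that only a small Hessenberg matrix is ever exponentiated. This is the standard technique, not the paper's exact formula.

```python
# Generic Krylov sketch: exp(A) @ v ~= beta * V_m @ expm(H_m) @ e1, where
# V_m, H_m come from m steps of the Arnoldi iteration on (A, v).
import numpy as np
from scipy.linalg import expm

def krylov_expm_v(A, v, m=30):
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):                        # Arnoldi iteration
        w = A @ V[:, j]
        for i in range(j + 1):                # orthogonalize against the basis
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:               # happy breakdown: exact subspace
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * V[:, :m] @ (expm(H[:m, :m]) @ e1)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200)) * 0.1
v = rng.standard_normal(200)
print(np.linalg.norm(krylov_expm_v(A, v) - expm(A) @ v))  # small residual
```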
arXiv Detail & Related papers (2021-08-02T13:38:09Z)
- Learning Log-Determinant Divergences for Positive Definite Matrices [47.61701711840848]
In this paper, we propose to learn similarity measures in a data-driven manner.
We capitalize on the alpha-beta log-det divergence, which is a meta-divergence parametrized by scalars alpha and beta.
Our key idea is to cast these parameters in a continuum and learn them from data.
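A hedged sketch of the divergence family, using the commonly quoted eigenvalue form of the alpha-beta log-det divergence; it assumes alpha, beta, and alpha + beta are all nonzero, and the exact parametrization should be checked against the original paper.

```python
# Hedged sketch: alpha-beta log-det divergence between SPD matrices P and Q,
# computed from the eigenvalues of inv(Q) @ P. Formula assumed from the
# Cichocki-style family; limit cases (alpha or beta -> 0) are not handled.
import numpy as np
from scipy.linalg import eigvals, solve

def ab_logdet_div(P, Q, alpha=0.5, beta=0.5):
    lam = np.real(eigvals(solve(Q, P)))       # spectrum of inv(Q) @ P, all > 0
    terms = (alpha * lam**beta + beta * lam**(-alpha)) / (alpha + beta)
    return np.sum(np.log(terms)) / (alpha * beta)

A = np.random.default_rng(0).standard_normal((5, 5))
P, Q = A @ A.T + 5 * np.eye(5), np.eye(5)     # two SPD matrices
print(ab_logdet_div(P, Q), ab_logdet_div(P, P))  # second value is ~0
```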
arXiv Detail & Related papers (2021-04-13T19:09:43Z)
- Probabilistic Learning Vector Quantization on Manifold of Symmetric Positive Definite Matrices [3.727361969017079]
We develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization.
In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices.
Empirical investigations on synthetic data, image data, and motor imagery EEG data demonstrate the superior performance of the proposed method.
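For context, the affine-invariant Riemannian distance, the standard geometry on the SPD manifold for methods like this; whether the paper uses exactly this metric is not stated in the summary.

```python
# Standard affine-invariant distance between SPD matrices:
# d(P, Q) = ||log(P^{-1/2} Q P^{-1/2})||_F, computed here via the
# generalized eigenvalues of (Q, P).
import numpy as np
from scipy.linalg import eigvalsh

def spd_distance(P, Q):
    lam = eigvalsh(Q, P)                      # solves Q v = lam * P v
    return np.sqrt(np.sum(np.log(lam) ** 2))

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 4, 4))
P, Q = A @ A.T + np.eye(4), B @ B.T + np.eye(4)
print(spd_distance(P, Q), spd_distance(P, P))  # second is 0.0
```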
arXiv Detail & Related papers (2021-02-01T06:58:39Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)