Effects of Parametric and Non-Parametric Methods on High Dimensional Sparse Matrix Representations
- URL: http://arxiv.org/abs/2202.02894v1
- Date: Mon, 7 Feb 2022 00:16:42 GMT
- Title: Effects of Parametric and Non-Parametric Methods on High Dimensional Sparse Matrix Representations
- Authors: Sayali Tambe, Raunak Joshi, Abhishek Gupta, Nandan Kanvinde, Vidya Chitre
- Abstract summary: Semantics derived from textual data provide representations for Machine Learning algorithms.
Since learning methods are broadly classified as parametric and non-parametric, this paper examines the effects of both types of algorithms on high dimensional sparse matrix representations.
- Score: 2.719418335747252
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantics derived from textual data provide representations for Machine Learning algorithms. These representations are an interpretable form of high dimensional sparse matrices that are given as input to the machine learning algorithms. Since learning methods are broadly classified as parametric and non-parametric, this paper examines the effects of both types of algorithms on high dimensional sparse matrix representations. To derive the representations from the text data, we use TF-IDF, a choice the paper motivates in detail. We form representations of 50, 100, 500, 1000 and 5000 dimensions, over which we perform classification using Linear Discriminant Analysis and Naive Bayes as parametric learning methods, and Decision Tree and Support Vector Machines as non-parametric learning methods. We then report metrics for each dimensionality of the representation and detail the effect of each algorithm.
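As a reading aid (not the authors' released code), here is a minimal sketch of the experimental setup described above, assuming scikit-learn; the 20 Newsgroups corpus and the train/test split are stand-ins, since the abstract does not name the dataset used.

```python
# Hedged sketch: TF-IDF representations of five widths, classified with two
# parametric and two non-parametric learners. The corpus is a stand-in.
from sklearn.datasets import fetch_20newsgroups
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

corpus = fetch_20newsgroups(subset="train")
docs_tr, docs_te, y_tr, y_te = train_test_split(
    corpus.data, corpus.target, test_size=0.2, random_state=0)

models = {
    "LDA (parametric)": LinearDiscriminantAnalysis(),
    "Naive Bayes (parametric)": MultinomialNB(),
    "Decision Tree (non-parametric)": DecisionTreeClassifier(random_state=0),
    "SVM (non-parametric)": LinearSVC(),
}

for dim in (50, 100, 500, 1000, 5000):           # the five dimensionalities studied
    vec = TfidfVectorizer(max_features=dim)      # cap vocabulary -> dim-wide matrix
    X_tr, X_te = vec.fit_transform(docs_tr), vec.transform(docs_te)
    for name, clf in models.items():
        dense = name.startswith("LDA")           # LDA rejects sparse input
        clf.fit(X_tr.toarray() if dense else X_tr, y_tr)
        pred = clf.predict(X_te.toarray() if dense else X_te)
        print(f"{dim:5d}  {name:32s}  acc={accuracy_score(y_te, pred):.3f}")
```

Capping max_features is one simple way to obtain the five fixed widths; the paper may select terms differently.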
Related papers
- Manifold learning: what, how, and why [2.681437069928767]
Manifold learning (ML) is a set of methods to find the low dimensional structure of data.
The new representations and descriptors obtained by ML reveal the geometric shape of high dimensional point clouds.
This survey presents the principles underlying ML, the representative methods, as well as their statistical foundations from a practicing statistician's perspective.
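For illustration only (generic, not a method introduced by the survey), a minimal manifold-learning run: scikit-learn's Isomap recovering the 2-D structure of a synthetic 3-D Swiss-roll point cloud.

```python
# Generic illustration: unroll a 3-D Swiss roll into its 2-D parametrization.
import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, color = make_swiss_roll(n_samples=1500, random_state=0)   # 3-D point cloud
emb = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

plt.scatter(emb[:, 0], emb[:, 1], c=color, s=5)
plt.title("Isomap embedding of a Swiss roll")
plt.show()
```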
arXiv Detail & Related papers (2023-11-07T06:44:20Z)
- Exploring ordered patterns in the adjacency matrix for improving machine learning on complex networks [0.0]
The proposed methodology employs a sorting algorithm to rearrange the elements of the adjacency matrix of a complex graph in a specific order.
The resulting sorted adjacency matrix is then used as input for feature extraction and machine learning algorithms to classify the networks.
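A hedged sketch of that pipeline, assuming (the summary does not specify the criterion) that nodes are sorted by descending degree before the matrix is flattened into a feature vector.

```python
# Hypothetical sketch: reorder the adjacency matrix, then flatten it into a
# fixed-length feature vector for a downstream classifier. The paper's actual
# sorting criterion may differ from the degree ordering used here.
import networkx as nx
import numpy as np

def sorted_adjacency_features(G: nx.Graph) -> np.ndarray:
    A = nx.to_numpy_array(G)
    order = np.argsort(-A.sum(axis=1))        # descending degree
    A = A[np.ix_(order, order)]               # permute rows and columns together
    return A.flatten()

feats = sorted_adjacency_features(nx.erdos_renyi_graph(20, 0.2, seed=0))
print(feats.shape)                            # (400,)
```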
arXiv Detail & Related papers (2023-01-20T00:01:23Z)
- Understanding High Dimensional Spaces through Visual Means Employing Multidimensional Projections [0.0]
Two of the relevant algorithms in the data visualisation field are t-distributed stochastic neighbour embedding (t-SNE) and Least-Square Projection (LSP).
These algorithms can be used to understand several ranges of mathematical functions, including their impact on datasets.
We illustrate ways of employing the visual results of multidimensional projection algorithms to understand and fine-tune the parameters of their mathematical framework.
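A minimal t-SNE illustration with scikit-learn (LSP has no common library implementation, so only t-SNE is shown; the digits dataset and the perplexity sweep are stand-ins for the paper's experiments).

```python
# Generic illustration: sweep one t-SNE parameter (perplexity) and compare
# the resulting 2-D projections visually, as the paper's workflow suggests.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)           # 64-D points, 10 classes
for perplexity in (5, 30, 50):
    emb = TSNE(n_components=2, perplexity=perplexity,
               random_state=0).fit_transform(X)
    plt.scatter(emb[:, 0], emb[:, 1], c=y, s=4)
    plt.title(f"t-SNE, perplexity={perplexity}")
    plt.show()
```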
arXiv Detail & Related papers (2022-07-12T20:30:33Z)
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
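A minimal sketch of the core mapping step, using the standard exponential map at the origin of the Poincaré ball; the paper's exact formulation and curvature handling are not given in the summary.

```python
# Hedged sketch: exponential map at the origin of the Poincare ball with
# curvature -c, a standard way to send Euclidean encoder outputs into
# hyperbolic space before computing hyperbolic distances.
import numpy as np

def expmap0(v: np.ndarray, c: float = 1.0, eps: float = 1e-7) -> np.ndarray:
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

emb = np.random.default_rng(0).standard_normal((8, 128))  # stand-in ViT outputs
hyp = expmap0(emb)
print(np.linalg.norm(hyp, axis=-1).max())     # < 1: inside the unit ball
```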
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
- Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
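For reference, the classical (PSD) Nyström approximation that the paper generalizes; the indefinite extension itself is not reproduced here.

```python
# Classical Nystrom sketch: sample s landmark columns C and the intersection
# block W, then approximate K ~= C @ pinv(W) @ C.T using only those columns.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
sq_dists = np.square(X[:, None] - X[None, :]).sum(-1)
K = np.exp(-0.1 * sq_dists)                   # RBF similarity matrix (PSD)

s = 50
idx = rng.choice(K.shape[0], size=s, replace=False)
C = K[:, idx]                                 # n x s landmark columns
W = C[idx, :]                                 # s x s intersection block
K_hat = C @ np.linalg.pinv(W) @ C.T

print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))   # relative error
```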
arXiv Detail & Related papers (2021-12-17T17:04:34Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Learning Linearized Assignment Flows for Image Labeling [70.540936204654]
We introduce a novel algorithm for estimating optimal parameters of linearized assignment flows for image labeling.
We show how to efficiently evaluate this formula using a Krylov subspace and a low-rank approximation.
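A generic sketch of such a Krylov evaluation: approximating exp(A) @ v from an Arnoldi basis, so that only a small Hessenberg matrix is ever exponentiated. This is the standard technique, not the paper's exact formula.

```python
# Generic Krylov sketch: exp(A) @ v ~= beta * V_m @ expm(H_m) @ e1, where
# V_m, H_m come from m steps of the Arnoldi iteration on (A, v).
import numpy as np
from scipy.linalg import expm

def krylov_expm_v(A, v, m=30):
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):                        # Arnoldi iteration
        w = A @ V[:, j]
        for i in range(j + 1):                # orthogonalize against the basis
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:               # happy breakdown: exact subspace
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * V[:, :m] @ (expm(H[:m, :m]) @ e1)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200)) * 0.1
v = rng.standard_normal(200)
print(np.linalg.norm(krylov_expm_v(A, v) - expm(A) @ v))  # small residual
```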
arXiv Detail & Related papers (2021-08-02T13:38:09Z)
- Learning Log-Determinant Divergences for Positive Definite Matrices [47.61701711840848]
In this paper, we propose to learn similarity measures in a data-driven manner.
We capitalize on the alpha-beta log-det divergence, which is a meta-divergence parametrized by scalars alpha and beta.
Our key idea is to cast these parameters in a continuum and learn them from data.
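A hedged sketch of the divergence family, using the commonly quoted eigenvalue form of the alpha-beta log-det divergence; it assumes alpha, beta, and alpha + beta are all nonzero, and the exact parametrization should be checked against the original paper.

```python
# Hedged sketch: alpha-beta log-det divergence between SPD matrices P and Q,
# computed from the eigenvalues of inv(Q) @ P. Formula assumed from the
# Cichocki-style family; limit cases (alpha or beta -> 0) are not handled.
import numpy as np
from scipy.linalg import eigvals, solve

def ab_logdet_div(P, Q, alpha=0.5, beta=0.5):
    lam = np.real(eigvals(solve(Q, P)))       # spectrum of inv(Q) @ P, all > 0
    terms = (alpha * lam**beta + beta * lam**(-alpha)) / (alpha + beta)
    return np.sum(np.log(terms)) / (alpha * beta)

A = np.random.default_rng(0).standard_normal((5, 5))
P, Q = A @ A.T + 5 * np.eye(5), np.eye(5)     # two SPD matrices
print(ab_logdet_div(P, Q), ab_logdet_div(P, P))  # second value is ~0
```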
arXiv Detail & Related papers (2021-04-13T19:09:43Z)
- Probabilistic Learning Vector Quantization on Manifold of Symmetric Positive Definite Matrices [3.727361969017079]
We develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization.
In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices.
Empirical investigations on synthetic data, image data, and motor imagery EEG data demonstrate the superior performance of the proposed method.
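For context, the affine-invariant Riemannian distance, the standard geometry on the SPD manifold for methods like this; whether the paper uses exactly this metric is not stated in the summary.

```python
# Standard affine-invariant distance between SPD matrices:
# d(P, Q) = ||log(P^{-1/2} Q P^{-1/2})||_F, computed here via the
# generalized eigenvalues of (Q, P).
import numpy as np
from scipy.linalg import eigvalsh

def spd_distance(P, Q):
    lam = eigvalsh(Q, P)                      # solves Q v = lam * P v
    return np.sqrt(np.sum(np.log(lam) ** 2))

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 4, 4))
P, Q = A @ A.T + np.eye(4), B @ B.T + np.eye(4)
print(spd_distance(P, Q), spd_distance(P, P))  # second is 0.0
```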
arXiv Detail & Related papers (2021-02-01T06:58:39Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)