An Experimental Study of Dimension Reduction Methods on Machine Learning
Algorithms with Applications to Psychometrics
- URL: http://arxiv.org/abs/2210.13230v3
- Date: Wed, 22 Mar 2023 03:33:32 GMT
- Title: An Experimental Study of Dimension Reduction Methods on Machine Learning
Algorithms with Applications to Psychometrics
- Authors: Sean H. Merritt and Alexander P. Christensen
- Abstract summary: We show that dimension reduction can decrease, increase, or provide the same accuracy as no reduction of variables.
Our tentative results find that dimension reduction tends to lead to better performance when used for classification tasks.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing interpretable machine learning models has become an increasingly
important issue. One way in which data scientists have been able to develop
interpretable models has been to use dimension reduction techniques. In this
paper, we examine several dimension reduction techniques including two recent
approaches developed in the network psychometrics literature called exploratory
graph analysis (EGA) and unique variable analysis (UVA). We compared EGA and
UVA with two other dimension reduction techniques common in the machine
learning literature (principal component analysis and independent component
analysis), as well as no reduction of the variables, on real data. We show that
EGA and UVA perform as well as the other reduction techniques or no reduction.
Consistent with previous literature, we show that dimension reduction can
decrease, increase, or provide the same accuracy as no reduction of variables.
Our tentative results find that dimension reduction tends to lead to better
performance when used for classification tasks.
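As a rough illustration of the comparison described in the abstract, the sketch below contrasts no reduction, PCA, and ICA in front of the same classifier using cross-validated accuracy. It is a minimal sketch, not the authors' pipeline: a stock scikit-learn dataset and a single random forest stand in for the paper's psychometric data and broader set of learners, and EGA/UVA are omitted since they come from the network psychometrics toolchain (R) rather than scikit-learn.

```python
# Minimal sketch: compare classification accuracy with no reduction, PCA, and ICA.
# Not the authors' pipeline; dataset, learner, and component counts are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA, FastICA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pipelines = {
    "no reduction": make_pipeline(
        StandardScaler(), RandomForestClassifier(random_state=0)),
    "PCA (10 components)": make_pipeline(
        StandardScaler(), PCA(n_components=10),
        RandomForestClassifier(random_state=0)),
    "ICA (10 components)": make_pipeline(
        StandardScaler(), FastICA(n_components=10, max_iter=1000, random_state=0),
        RandomForestClassifier(random_state=0)),
}

for name, pipe in pipelines.items():
    scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Depending on the data, any of the three settings can come out ahead, which is exactly the pattern the abstract reports.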
Related papers
- Exploring the Influence of Dimensionality Reduction on Anomaly Detection
Performance in Multivariate Time Series [0.9790236766474201]
The study involves a comprehensive evaluation across three different datasets: MSL, SMAP, and SWaT.
The dimensionality reduction techniques examined include PCA, UMAP, Random Projection, and t-SNE.
A remarkable reduction in training times was observed, by approximately 300% and 650%, when dimensionality was halved.
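A minimal sketch of that kind of pipeline follows, assuming synthetic data rather than MSL, SMAP, or SWaT and an off-the-shelf IsolationForest as the detector: PCA halves the number of channels before anomaly scores are computed.

```python
# Minimal sketch (synthetic data, not the paper's MSL/SMAP/SWaT pipeline):
# halve the dimensionality with PCA before fitting an off-the-shelf detector.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 50))        # 5000 time steps, 50 channels
X[4800:] += 4.0                        # inject an anomalous segment at the end

X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=25).fit_transform(X_scaled)   # 50 -> 25 dimensions

detector = IsolationForest(random_state=0).fit(X_reduced)
scores = detector.score_samples(X_reduced)                  # lower = more anomalous
print("mean score, normal segment:   ", scores[:4800].mean())
print("mean score, anomalous segment:", scores[4800:].mean())
```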
arXiv Detail & Related papers (2024-03-07T11:59:00Z) - Understanding Augmentation-based Self-Supervised Representation Learning
via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z) - An evaluation framework for dimensionality reduction through sectional
curvature [59.40521061783166]
In this work, we aim to introduce the first highly non-supervised dimensionality reduction performance metric.
To test its feasibility, this metric has been used to evaluate the performance of the most commonly used dimension reduction algorithms.
A new parameterized problem instance generator has been constructed in the form of a function generator.
arXiv Detail & Related papers (2023-03-17T11:59:33Z) - EmbedDistill: A Geometric Knowledge Distillation for Information
Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - Understanding Incremental Learning of Gradient Descent: A Fine-grained
Analysis of Matrix Sensing [74.2952487120137]
It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in machine learning models.
This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem.
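For readers unfamiliar with the setup, the sketch below illustrates the matrix sensing problem that the analysis concerns (an illustration of the problem only, not the paper's theory): gradient descent on a factored objective recovers a low-rank matrix from random linear measurements.

```python
# Illustrative numpy sketch of matrix sensing: recover M* = U* U*^T from
# measurements y_i = <A_i, M*> by gradient descent on
# f(U) = (1/m) * sum_i (<A_i, U U^T> - y_i)^2.
import numpy as np

rng = np.random.default_rng(0)
d, r, m = 20, 2, 400                       # dimension, rank, number of measurements

U_star = rng.normal(size=(d, r))
M_star = U_star @ U_star.T                 # ground-truth low-rank matrix
A = rng.normal(size=(m, d, d))
A = (A + A.transpose(0, 2, 1)) / 2         # symmetric sensing matrices
y = np.einsum("kij,ij->k", A, M_star)      # noiseless measurements

U = 1e-3 * rng.normal(size=(d, r))         # small random initialization
lr = 1e-3
for _ in range(5000):
    residual = np.einsum("kij,ij->k", A, U @ U.T) - y
    grad = (4.0 / m) * np.einsum("k,kij->ij", residual, A) @ U
    U -= lr * grad

rel_err = np.linalg.norm(U @ U.T - M_star) / np.linalg.norm(M_star)
print(f"relative recovery error: {rel_err:.4f}")
```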
arXiv Detail & Related papers (2023-01-27T02:30:51Z) - A Brief Survey on Representation Learning based Graph Dimensionality
Reduction Techniques [0.0]
Dimensionality reduction techniques map data represented on higher dimensions onto lower dimensions with varying degrees of information loss.
There exist several techniques that are efficient at generating embeddings from graph data and projecting them onto low dimensional latent spaces.
We present this survey to outline the benefits as well as problems associated with the existing graph dimensionality reduction techniques.
arXiv Detail & Related papers (2022-10-13T04:29:24Z) - Exploring Dimensionality Reduction Techniques in Multilingual
Transformers [64.78260098263489]
This paper gives a comprehensive account of the impact of dimensionality reduction techniques on the performance of state-of-the-art multilingual Siamese Transformers.
It shows that it is possible to achieve average reductions in the number of dimensions of $91.58\% \pm 2.59\%$ and $54.65\% \pm 32.20\%$, respectively.
arXiv Detail & Related papers (2022-04-18T17:20:55Z) - Supervised Linear Dimension-Reduction Methods: Review, Extensions, and
Comparisons [6.71092092685492]
Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling.
This paper reviews selected techniques, extends some of them, and compares their performance through simulations.
Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.
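A toy sketch of why a supervised method such as PLS can beat an unsupervised PCA-based regression (this is not the paper's simulation design, only a common illustration): when the signal lies in a low-variance direction, principal component regression misses it while PLS does not.

```python
# Toy comparison of principal component regression (unsupervised reduction)
# and partial least squares (supervised reduction). Synthetic data; the
# response depends on the lowest-variance feature.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n, p = 300, 10
X = rng.normal(size=(n, p)) * np.linspace(10, 0.5, p)   # decreasing variance per feature
y = X[:, -1] + 0.1 * rng.normal(size=n)                  # signal in the smallest direction

pcr = make_pipeline(PCA(n_components=2), LinearRegression())
pls = PLSRegression(n_components=2)

print("PCR R^2:", cross_val_score(pcr, X, y, cv=5, scoring="r2").mean())
print("PLS R^2:", cross_val_score(pls, X, y, cv=5, scoring="r2").mean())
```

Because PCA picks the two highest-variance directions regardless of y, PCR scores near zero here, while PLS steers its components toward covariance with y and recovers the signal.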
arXiv Detail & Related papers (2021-09-09T17:57:25Z) - Joint Dimensionality Reduction for Separable Embedding Estimation [43.22422640265388]
Low-dimensional embeddings for data from disparate sources play critical roles in machine learning, multimedia information retrieval, and bioinformatics.
We propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.
arXiv Detail & Related papers (2021-01-14T08:48:37Z) - Longitudinal Variational Autoencoder [1.4680035572775534]
A common approach to analyse high-dimensional data that contains missing values is to learn a low-dimensional representation using variational autoencoders (VAEs).
Standard VAEs assume that the learnt representations are i.i.d., and fail to capture the correlations between the data samples.
We propose the Longitudinal VAE (L-VAE), that uses a multi-output additive Gaussian process (GP) prior to extend the VAE's capability to learn structured low-dimensional representations.
Our approach can simultaneously accommodate both time-varying shared and random effects and produce structured low-dimensional representations.
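A bare-bones numpy sketch of the additive GP prior idea (an illustration only, not the L-VAE implementation): the prior covariance over one latent dimension is the sum of a shared temporal kernel and a subject-specific random-effect kernel, and latent trajectories are drawn from the resulting Gaussian.

```python
# Illustration of an additive GP prior over one latent dimension:
# shared smooth effect of time plus a subject-level random effect.
import numpy as np

def rbf(x, y, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(x, y) for 1-D inputs."""
    d = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
subjects = np.repeat(np.arange(5), 10)          # 5 subjects, 10 visits each
times = np.tile(np.linspace(0, 9, 10), 5)

K_time = rbf(times, times, lengthscale=2.0)                  # shared temporal effect
K_subject = 0.5 * (subjects[:, None] == subjects[None, :])   # subject random effect
K = K_time + K_subject + 1e-6 * np.eye(len(times))           # additive prior covariance

# One draw from the prior over a single latent dimension across all observations.
z = rng.multivariate_normal(np.zeros(len(times)), K)
print(z.shape)   # (50,)
```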
arXiv Detail & Related papers (2020-06-17T10:30:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.