A Subspace-based Approach for Dimensionality Reduction and Important
Variable Selection
- URL: http://arxiv.org/abs/2106.01584v1
- Date: Thu, 3 Jun 2021 04:10:34 GMT
- Title: A Subspace-based Approach for Dimensionality Reduction and Important
Variable Selection
- Authors: Di Bo, Hoon Hwangbo, Vinit Sharma, Corey Arndt, Stephanie C. TerMaath
- Abstract summary: This research proposes a new method that produces subspaces, reduced-dimensional physical spaces, based on a randomized search.
When applied to high-dimensional data collected from a composite metal development process, the proposed method shows its superiority in prediction and important variable selection.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An analysis of high dimensional data can offer a detailed description of a
system but is often challenged by the curse of dimensionality. General
dimensionality reduction techniques can alleviate such difficulty by extracting
a few important features, but they are limited due to the lack of
interpretability and connectivity to actual decision making associated with
each physical variable. Important variable selection techniques, as an
alternative, can maintain the interpretability, but they often involve a greedy
search that is susceptible to failure in capturing important interactions. This
research proposes a new method that produces subspaces, reduced-dimensional
physical spaces, based on a randomized search and forms an ensemble of models
for critical subspaces. When applied to high-dimensional data collected from a
composite metal development process, the proposed method shows its superiority
in prediction and important variable selection.
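The abstract describes the procedure only at a high level (randomly sample reduced-dimensional subspaces of physical variables, identify critical subspaces, and ensemble models built on them). The following is a minimal illustrative sketch of one way such a randomized subspace search could be set up, assuming scikit-learn-style regressors; the function name, parameters, scoring choice, and importance measure are assumptions for illustration, not the authors' actual algorithm.

```python
# Hypothetical sketch: randomized subspace search + ensemble over critical subspaces.
# Not the paper's implementation; names and choices are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def random_subspace_ensemble(X, y, n_subspaces=200, subspace_dim=5,
                             n_keep=20, rng_seed=0):
    """Sample random variable subsets, score each by cross-validated R^2,
    and keep the best-scoring ("critical") subspaces for an ensemble."""
    rng = np.random.default_rng(rng_seed)
    p = X.shape[1]

    # Randomized search over variable subsets (subspaces).
    scored = []
    for _ in range(n_subspaces):
        idx = rng.choice(p, size=subspace_dim, replace=False)
        score = cross_val_score(LinearRegression(), X[:, idx], y,
                                cv=5, scoring="r2").mean()
        scored.append((score, idx))

    # Keep the top-scoring subspaces as the "critical" ones.
    scored.sort(key=lambda t: t[0], reverse=True)
    critical = scored[:n_keep]

    # Fit one model per critical subspace; the ensemble prediction is their mean.
    models = [(idx, LinearRegression().fit(X[:, idx], y)) for _, idx in critical]

    def predict(X_new):
        return np.mean([m.predict(X_new[:, idx]) for idx, m in models], axis=0)

    # One possible importance measure: how often each variable
    # appears across the critical subspaces.
    importance = np.bincount(np.concatenate([idx for _, idx in critical]),
                             minlength=p) / n_keep
    return predict, importance
```

Because each subspace keeps a small set of original physical variables rather than linear combinations of them, the resulting importance scores remain directly interpretable, which is the property the abstract contrasts against general dimensionality reduction techniques.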
Related papers
- Simultaneous Dimensionality Reduction for Extracting Useful Representations of Large Empirical Multimodal Datasets [0.0]
We focus on dimensionality reduction as a means to obtain low-dimensional descriptions of high-dimensional data.
We address the challenges posed by real-world data that defy conventional assumptions, such as complex interactions within systems or high-dimensional dynamical systems.
arXiv Detail & Related papers (2024-10-23T21:27:40Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - Selecting Features by their Resilience to the Curse of Dimensionality [0.0]
Real-world datasets are often of high dimension and affected by the curse of dimensionality.
Here we step in with a novel method that identifies the features that allow discriminating data subsets of different sizes.
Our experiments show that our method is competitive and commonly outperforms established feature selection methods.
arXiv Detail & Related papers (2023-04-05T14:26:23Z) - Interpretable Linear Dimensionality Reduction based on Bias-Variance
Analysis [45.3190496371625]
We propose a principled dimensionality reduction approach that maintains the interpretability of the resulting features.
In this way, all features are considered, the dimensionality is reduced and the interpretability is preserved.
arXiv Detail & Related papers (2023-03-26T14:30:38Z) - DimenFix: A novel meta-dimensionality reduction method for feature
preservation [64.0476282000118]
We propose a novel meta-method, DimenFix, which can be operated upon any base dimensionality reduction method that involves a gradient-descent-like process.
By allowing users to define the importance of different features, which is then taken into account during dimensionality reduction, DimenFix creates new possibilities to visualize and understand a given dataset.
arXiv Detail & Related papers (2022-11-30T05:35:22Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z) - A survey of unsupervised learning methods for high-dimensional
uncertainty quantification in black-box-type problems [0.0]
We construct surrogate models for uncertainty quantification (UQ) on complex partial differential equations (PDEs).
The curse of dimensionality can be alleviated by constructing a reduced-dimensional subspace with suitable unsupervised learning techniques.
We demonstrate both the advantages and limitations of the m-PCE model and conclude that a suitable m-PCE model provides a cost-effective approach to high-dimensional UQ.
arXiv Detail & Related papers (2022-02-09T16:33:40Z) - Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z) - Linear Tensor Projection Revealing Nonlinearity [0.294944680995069]
Dimensionality reduction is an effective method for learning from high-dimensional data.
We propose a method that searches for a subspace that maximizes the prediction accuracy while retaining as much of the original data information as possible.
arXiv Detail & Related papers (2020-07-08T06:10:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.