Related papers: DPDR: A novel machine learning method for the Decision Process for Dimensionality Reduction

DPDR: A novel machine learning method for the Decision Process for Dimensionality Reduction

URL: http://arxiv.org/abs/2206.08974v1
Date: Fri, 17 Jun 2022 19:14:39 GMT
Title: DPDR: A novel machine learning method for the Decision Process for Dimensionality Reduction
Authors: Jean-S\'ebastien Dessureault and Daniel Massicotte
Abstract summary: It is often confusing to find a suitable method to reduce dimensionality in a supervised learning context. This paper proposes a new method to choose the best dimensionality reduction method in a supervised learning context. The main algorithms used are the Random Forest algorithms (RF), the Principal Component Analysis (PCA) algorithm, and the multilayer perceptron (MLP) neural network algorithm.
Score: 1.827510863075184
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper discusses the critical decision process of extracting or selecting the features in a supervised learning context. It is often confusing to find a suitable method to reduce dimensionality. There are pros and cons to deciding between a feature selection and feature extraction according to the data's nature and the user's preferences. Indeed, the user may want to emphasize the results toward integrity or interpretability and a specific data resolution. This paper proposes a new method to choose the best dimensionality reduction method in a supervised learning context. It also helps to drop or reconstruct the features until a target resolution is reached. This target resolution can be user-defined, or it can be automatically defined by the method. The method applies a regression or a classification, evaluates the results, and gives a diagnosis about the best dimensionality reduction process in this specific supervised learning context. The main algorithms used are the Random Forest algorithms (RF), the Principal Component Analysis (PCA) algorithm, and the multilayer perceptron (MLP) neural network algorithm. Six use cases are presented, and every one is based on some well-known technique to generate synthetic data. This research discusses each choice that can be made in the process, aiming to clarify the issues about the entire decision process of selecting or extracting the features.

Related papers

Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph- kernel multi-arms bandit algorithm to learn online the optimal source placement in large scale networks. We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations. We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
Ideal Abstractions for Decision-Focused Learning [108.15241246054515]
We propose a method that configures the output space automatically in order to minimize the loss of decision-relevant information. We demonstrate the method in two domains: data acquisition for deep neural network training and a closed-loop wildfire management task.
arXiv Detail & Related papers (2023-03-29T23:31:32Z)
R(Det)^2: Randomized Decision Routing for Object Detection [64.48369663018376]
We propose a novel approach to combine decision trees and deep neural networks in an end-to-end learning manner for object detection. To facilitate effective learning, we propose randomized decision routing with node selective and associative losses. We name this approach as the randomized decision routing for object detection, abbreviated as R(Det)$2$.
arXiv Detail & Related papers (2022-04-02T07:54:58Z)
Feature selection or extraction decision process for clustering using PCA and FRSD [2.6803492658436032]
This paper proposes a new method to choose the best dimensionality reduction method (selection or extraction) according to the data scientist's parameters. It uses Feature Ranking Process Based on Silhouette Decomposition (FRSD) algorithm, a Principal Component Analysis (PCA) algorithm, and a K-Means algorithm along with its metric, the Silhouette Index (SI)
arXiv Detail & Related papers (2021-11-20T01:40:54Z)
Regret Analysis in Deterministic Reinforcement Learning [78.31410227443102]
We study the problem of regret, which is central to the analysis and design of optimal learning algorithms. We present logarithmic problem-specific regret lower bounds that explicitly depend on the system parameter.
arXiv Detail & Related papers (2021-06-27T23:41:57Z)
A concise method for feature selection via normalized frequencies [0.0]
In this paper, a concise method is proposed for universal feature selection. The proposed method uses a fusion of the filter method and the wrapper method, rather than a combination of them. The evaluation results show that the proposed method outperformed several state-of-the-art related works in terms of accuracy, precision, recall, F-score and AUC.
arXiv Detail & Related papers (2021-06-10T15:29:54Z)
Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning. Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Identifying the most characterizing features that minimizes the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation [33.424663018395684]
We propose a simple and effective feature selection algorithm to enhance sample similarity preservation. The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data.
arXiv Detail & Related papers (2020-04-02T23:05:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.