PPFS: Predictive Permutation Feature Selection
- URL: http://arxiv.org/abs/2110.10713v1
- Date: Wed, 20 Oct 2021 18:18:18 GMT
- Title: PPFS: Predictive Permutation Feature Selection
- Authors: Atif Hassan, Jiaul H. Paik, Swanand Khare and Syed Asif Hassan
- Abstract summary: We propose a novel wrapper-based feature selection method based on the concept of Markov Blanket (MB)
Unlike previous MB methods, PPFS is a universal feature selection technique as it can work for both classification and regression tasks.
We propose Predictive Permutation Independence (PPI), a new Conditional Independence (CI) test, which enables PPFS to be categorised as a wrapper feature selection method.
- Score: 2.502407331311937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Predictive Permutation Feature Selection (PPFS), a novel
wrapper-based feature selection method based on the concept of Markov Blanket
(MB). Unlike previous MB methods, PPFS is a universal feature selection
technique as it can work for both classification as well as regression tasks on
datasets containing categorical and/or continuous features. We propose
Predictive Permutation Independence (PPI), a new Conditional Independence (CI)
test, which enables PPFS to be categorised as a wrapper feature selection
method. This is in contrast to current filter-based MB feature selection
techniques that are unable to harness the advancements in supervised algorithms
such as Gradient Boosting Machines (GBM). The PPI test is based on the knockoff
framework and utilizes supervised algorithms to measure the association between
an individual or a set of features and the target variable. We also propose a
novel MB aggregation step that addresses the issue of sample inefficiency.
Empirical evaluations and comparisons on a large number of datasets demonstrate
that PPFS outperforms state-of-the-art Markov blanket discovery algorithms as
well as well-known wrapper methods. We also provide a sketch of the proof of
correctness of our method. Implementation of this work is available at
\url{https://github.com/atif-hassan/PyImpetus}
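The core PPI idea, measuring a feature's association with the target by comparing a model's held-out performance with the feature intact against its performance after the feature's values are permuted, can be sketched as follows. This is a minimal illustrative version, not the PyImpetus implementation: the linear model, the `ppi_score` helper, and the toy data are assumptions for demonstration, whereas the actual PPI test is built on the knockoff framework and supervised learners such as GBMs.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept; returns the weight vector."""
    Xb = np.c_[X, np.ones(len(X))]
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def mse(w, X, y):
    """Mean squared error of the linear model on (X, y)."""
    Xb = np.c_[X, np.ones(len(X))]
    return float(np.mean((Xb @ w - y) ** 2))

def ppi_score(X_tr, y_tr, X_va, y_va, j, n_perm=30, seed=0):
    """PPI-style check for feature j: fit once, then compare held-out error
    with column j intact against errors obtained after permuting column j,
    which breaks its association with y while preserving its marginal
    distribution. Returns the fraction of permutations whose error is at
    least as small as the unpermuted error; a small value suggests j is
    predictive of y given the model."""
    rng = np.random.default_rng(seed)
    w = fit_linear(X_tr, y_tr)
    base = mse(w, X_va, y_va)
    hits = 0
    for _ in range(n_perm):
        X_p = X_va.copy()
        X_p[:, j] = rng.permutation(X_p[:, j])
        hits += mse(w, X_p, y_va) <= base
    return hits / n_perm

# Toy check: feature 0 drives y, feature 1 is pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=400)
p_informative = ppi_score(X[:200], y[:200], X[200:], y[200:], j=0)
p_noise = ppi_score(X[:200], y[:200], X[200:], y[200:], j=1)
```

In this sketch the informative feature yields a score near zero (every permutation hurts the model), while the noise feature's score hovers near 0.5 (permutation changes nothing systematically), which is the asymmetry a CI test can threshold on.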
Related papers
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model trained jointly with sequential reconstruction, variational, and performance-evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- Class Probability Matching Using Kernel Methods for Label Shift Adaptation [10.926835355554553]
We propose a new framework called class probability matching (CPM) for label shift adaptation.
By incorporating kernel logistic regression into the CPM framework to estimate the conditional probability, we propose an algorithm called CPMKM for label shift adaptation.
arXiv Detail & Related papers (2023-12-12T13:59:37Z)
- Graph-Based Automatic Feature Selection for Multi-Class Classification via Mean Simplified Silhouette [4.786337974720721]
This paper introduces a novel graph-based filter method for automatic feature selection (abbreviated as GB-AFS)
The method determines the minimum combination of features required to sustain prediction performance.
It does not require any user-defined parameters such as the number of features to select.
arXiv Detail & Related papers (2023-09-05T14:37:31Z)
- AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation [64.9230895853942]
Domain generalization can be arbitrarily hard without exploiting target domain information.
Test-time adaptive (TTA) methods are proposed to address this issue.
In this work, we adopt a Non-Parametric Classifier to perform test-time Adaptation (AdaNPC)
arXiv Detail & Related papers (2023-04-25T04:23:13Z)
- ManiFeSt: Manifold-based Feature Selection for Small Data Sets [9.649457851261909]
We present a new method for few-sample supervised feature selection (FS)
Our method first learns the manifold of the feature space of each class using kernels capturing multi-feature associations.
We show that our FS leads to improved classification and better generalization when applied to test data.
arXiv Detail & Related papers (2022-07-18T12:58:01Z)
- Probabilistic Permutation Graph Search: Black-Box Optimization for Fairness in Ranking [53.94413894017409]
We present a novel way of representing permutation distributions, based on the notion of permutation graphs.
Similar to the Plackett-Luce (PL) distribution, our distribution representation, called PPG, can be used for black-box optimization of fairness.
arXiv Detail & Related papers (2022-04-28T20:38:34Z)
- Parallel feature selection based on the trace ratio criterion [4.30274561163157]
This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST)
Our method uses trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness.
The experiments show that our method can produce a small set of features in a fraction of the time taken by the other methods under comparison.
arXiv Detail & Related papers (2022-03-03T10:50:33Z)
- Markov Decision Process modeled with Bandits for Sequential Decision Making in Linear-flow [73.1896399783641]
In membership/subscriber acquisition and retention, we sometimes need to recommend marketing content for multiple pages in sequence.
We propose to formulate the problem as an MDP with Bandits where Bandits are employed to model the transition probability matrix.
We observe the proposed MDP with Bandits algorithm outperforms Q-learning with $\epsilon$-greedy and decreasing $\epsilon$, independent Bandits, and interaction Bandits.
arXiv Detail & Related papers (2021-07-01T03:54:36Z)
- Feature Selection Methods for Cost-Constrained Classification in Random Forests [3.4806267677524896]
Cost-sensitive feature selection describes a feature selection problem, where features raise individual costs for inclusion in a model.
Random Forests define a particularly challenging problem for feature selection, as features are generally entangled in an ensemble of multiple trees.
We propose Shallow Tree Selection, a novel fast and multivariate feature selection method that selects features from small tree structures.
arXiv Detail & Related papers (2020-08-14T11:39:52Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Infinite Feature Selection: A Graph-based Feature Filtering Approach [78.63188057505012]
We propose a filtering feature selection framework that considers subsets of features as paths in a graph.
Allowing paths of infinite length makes it possible to bound the computational complexity of the selection process.
We show that Inf-FS behaves better in almost any situation, that is, when the number of features to keep is fixed a priori.
arXiv Detail & Related papers (2020-06-15T07:20:40Z)
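As a concrete illustration of one criterion from the list above, the trace-ratio class-separability measure that PFST borrows from Fisher's Discriminant Analysis can be sketched as follows. The `trace_ratio` helper and the toy data are illustrative assumptions, not the paper's parallel implementation.

```python
import numpy as np

def trace_ratio(X, y, feat_idx):
    """Fisher-style separability of a feature subset:
    trace(between-class scatter) / trace(within-class scatter).
    Larger values mean class means are far apart relative to the
    spread of samples inside each class."""
    Xs = X[:, feat_idx]
    mu = Xs.mean(axis=0)                  # grand mean over all samples
    s_between = 0.0
    s_within = 0.0
    for c in np.unique(y):
        Xc = Xs[y == c]
        mc = Xc.mean(axis=0)              # per-class mean
        s_between += len(Xc) * float(np.sum((mc - mu) ** 2))
        s_within += float(np.sum((Xc - mc) ** 2))
    return s_between / s_within

# Two features: column 0 separates the classes, column 1 is noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 2))
X[y == 1, 0] += 4.0
score_good = trace_ratio(X, y, [0])
score_bad = trace_ratio(X, y, [1])
```

A greedy selector can then keep adding the feature that most increases this ratio, which is the kind of usefulness evaluation the PFST summary describes.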
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.