Related papers: Ensemble- and Distance-Based Feature Ranking for Unsupervised Learning

Ensemble- and Distance-Based Feature Ranking for Unsupervised Learning

URL: http://arxiv.org/abs/2011.11679v1
Date: Mon, 23 Nov 2020 19:17:24 GMT
Title: Ensemble- and Distance-Based Feature Ranking for Unsupervised Learning
Authors: Matej Petkovi\'c, Dragi Kocev, Bla\v{z} \v{S}krlj, Sa\v{s}o D\v{z}eroski
Abstract summary: We propose two novel (groups of) methods for unsupervised feature ranking and selection. The first group includes feature ranking scores (Genie3 score, RandomForest score) that are computed from ensembles of predictive clustering trees. The second method is URelief, the unsupervised extension of the Relief family of feature ranking algorithms.
Score: 2.7921429800866533
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In this work, we propose two novel (groups of) methods for unsupervised feature ranking and selection. The first group includes feature ranking scores (Genie3 score, RandomForest score) that are computed from ensembles of predictive clustering trees. The second method is URelief, the unsupervised extension of the Relief family of feature ranking algorithms. Using 26 benchmark data sets and 5 baselines, we show that both the Genie3 score (computed from the ensemble of extra trees) and the URelief method outperform the existing methods and that Genie3 performs best overall, in terms of predictive power of the top-ranked features. Additionally, we analyze the influence of the hyper-parameters of the proposed methods on their performance, and show that for the Genie3 score the highest quality is achieved by the most efficient parameter configuration. Finally, we propose a way of discovering the location of the features in the ranking, which are the most relevant in reality.

Related papers

Foundations of the Theory of Performance-Based Ranking [10.89980029564174]
We establish the foundations of a universal theory for performance-based ranking. A universal parametric family of scores, called ranking scores, can be used to establish rankings satisfying our axioms. We show, in the case of two-class classification, that the family of ranking scores encompasses well-known performance scores.
arXiv Detail & Related papers (2024-12-05T15:05:25Z)
On the Evaluation Consistency of Attribution-based Explanations [42.1421504321572]
We introduce Meta-Rank, an open platform for benchmarking attribution methods in the image domain. Our benchmark reveals three insights in attribution evaluation endeavors: 1) evaluating attribution methods under disparate settings can yield divergent performance rankings; 2) although inconsistent across numerous cases, the performance rankings exhibit remarkable consistency across distinct checkpoints along the same training trajectory; and 3) prior attempts at consistent evaluation fare no better than baselines when extended to more heterogeneous models and datasets.
arXiv Detail & Related papers (2024-07-28T11:49:06Z)
RankSHAP: Shapley Value Based Feature Attributions for Learning to Rank [28.438428292619577]
We adopt an axiomatic game-theoretic approach, popular in the feature attribution community, to identify a set of fundamental axioms that every ranking-based feature attribution method should satisfy. We then introduce Rank-SHAP, extending classical Shapley values to ranking. We also perform an axiomatic analysis of existing rank attribution algorithms to determine their compliance with our proposed axioms.
arXiv Detail & Related papers (2024-05-03T04:43:24Z)
RankingSHAP -- Listwise Feature Attribution Explanations for Ranking Models [48.895510739010355]
We present three key contributions to address this gap. First, we rigorously define listwise feature attribution for ranking models. Second, we introduce RankingSHAP, extending the popular SHAP framework to accommodate listwise ranking attribution. Third, we propose two novel evaluation paradigms for assessing the faithfulness of attributions in learning-to-rank models.
arXiv Detail & Related papers (2024-03-24T10:45:55Z)
Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses. Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking. xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics. We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z)
An Evolutionary Correlation-aware Feature Selection Method for Classification Problems [3.2550305883611244]
In this paper, an estimation of distribution algorithm is proposed to meet three goals. Firstly, as an extension of EDA, the proposed method generates only two individuals in each iteration that compete based on a fitness function. Secondly, we provide a guiding technique for determining the number of features for individuals in each iteration. As the main contribution of the paper, in addition to considering the importance of each feature alone, the proposed method can consider the interaction between features.
arXiv Detail & Related papers (2021-10-16T20:20:43Z)
Hierarchical Ranking for Answer Selection [19.379777219863964]
We propose a novel strategy for answer selection, called hierarchical ranking. We introduce three levels of ranking: point-level ranking, pair-level ranking, and list-level ranking. Experimental results on two public datasets, WikiQA and TREC-QA, demonstrate that the proposed hierarchical ranking is effective.
arXiv Detail & Related papers (2021-02-01T07:35:52Z)
Feature Importance Ranking for Deep Learning [7.287652818214449]
We propose a novel dual-net architecture consisting of operator and selector for discovery of an optimal feature subset of a fixed size. During learning, the operator is trained for a supervised learning task via optimal feature subset candidates generated by the selector. In deployment, the selector generates an optimal feature subset and ranks feature importance, while the operator makes predictions based on the optimal subset for test data.
arXiv Detail & Related papers (2020-10-18T12:20:27Z)
Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability. A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance. We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)
Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data. There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups. We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback [50.13745601531148]
We propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to accommodate the characteristics of implicit feedback in recommender system. Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons. We also present the theoretical analysis of SetRank to show that the bound of excess risk can be proportional to $sqrtM/N$.
arXiv Detail & Related papers (2020-02-23T06:40:48Z)
PointHop++: A Lightweight Learning Model on Point Sets for 3D Classification [55.887502438160304]
The PointHop method was recently proposed by Zhang et al. for 3D point cloud classification with unsupervised feature extraction. We improve the PointHop method furthermore in two aspects: 1) reducing its model complexity in terms of the model parameter number and 2) ordering discriminant features automatically based on the cross-entropy criterion. With experiments conducted on the ModelNet40 benchmark dataset, we show that the PointHop++ method performs on par with deep neural network (DNN) solutions and surpasses other unsupervised feature extraction methods.
arXiv Detail & Related papers (2020-02-09T04:49:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.