Conformal Recursive Feature Elimination
- URL: http://arxiv.org/abs/2405.19429v1
- Date: Wed, 29 May 2024 18:10:36 GMT
- Title: Conformal Recursive Feature Elimination
- Authors: Marcos López-De-Castro, Alberto García-Galindo, Rubén Armañanzas
- Abstract summary: Conformal Prediction (CP) allows for the determination of valid and accurate confidence levels associated with individual predictions.
We introduce a new feature selection method that takes advantage of the CP framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unlike traditional statistical methods, Conformal Prediction (CP) allows for the determination of valid and accurate confidence levels associated with individual predictions based only on exchangeability of the data. We here introduce a new feature selection method that takes advantage of the CP framework. Our proposal, named Conformal Recursive Feature Elimination (CRFE), identifies and recursively removes features that increase the non-conformity of a dataset. We also present an automatic stopping criterion for CRFE, as well as a new index to measure consistency between subsets of features. CRFE selections are compared to the classical Recursive Feature Elimination (RFE) method on several multiclass datasets by using multiple partitions of the data. The results show that CRFE clearly outperforms RFE in half of the datasets, while achieving similar performance in the rest. The automatic stopping criterion provides subsets of effective and non-redundant features without computing any classification performance.
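Since the abstract only states that CRFE recursively removes features that increase the non-conformity of the dataset, the snippet below is a minimal sketch of that recursive idea rather than the authors' algorithm: it assumes a margin-style nonconformity score from a logistic-regression model, a greedy one-feature-at-a-time removal rule, and a fixed target subset size in place of the paper's automatic stopping criterion; all function names are hypothetical.

```python
# Illustrative sketch of recursive feature elimination driven by nonconformity.
# The margin-style score and the greedy removal rule are assumptions; the
# paper's actual CRFE criterion and stopping rule may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nonconformity(model, X, y):
    """Margin-style nonconformity: highest rival-class probability minus
    the true-class probability (larger values = stranger examples)."""
    proba = model.predict_proba(X)
    idx = np.searchsorted(model.classes_, y)      # column of the true class
    true_p = proba[np.arange(len(y)), idx]
    rival = proba.copy()
    rival[np.arange(len(y)), idx] = -np.inf
    return rival.max(axis=1) - true_p

def crfe_sketch(X, y, n_keep=5):
    """Greedily drop the feature whose removal most reduces the total
    nonconformity of the dataset, until n_keep features remain."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        totals = []
        for j in active:
            cols = [k for k in active if k != j]
            model = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
            totals.append(nonconformity(model, X[:, cols], y).sum())
        active.pop(int(np.argmin(totals)))        # remove the "worst" feature
    return active
```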
Related papers
- Uncertainty-driven Embedding Convolution [12.284127272660982]
We propose Uncertainty-driven Embedding Convolution (UEC). UEC transforms deterministic embeddings into probabilistic ones in a post-hoc manner. It then computes adaptive ensemble weights based on embedding uncertainty, grounded in a Bayes-optimal solution under a surrogate loss.
arXiv Detail & Related papers (2025-07-28T11:15:25Z) - HCVR: A Hybrid Approach with Correlation-aware Voting Rules for Feature Selection [0.0]
HCVR (Hybrid approach with Correlation-aware Voting Rules) is a lightweight rule-based feature selection method. It combines correlation-based voting rules to eliminate redundant features and retain relevant ones. Results show improvement over traditional non-iterative (CFS, mRMR and MI) and iterative (RFE, SFS and Genetic) techniques.
arXiv Detail & Related papers (2025-07-02T18:20:56Z) - Label-shift robust federated feature screening for high-dimensional classification [14.252760098879186]
This paper introduces a general framework that unifies existing screening methods and proposes a novel utility, label-shift robust federated feature screening (LR-FFS). Building upon this framework, LR-FFS leverages conditional distribution functions and expectations to address label shift without adding computational burdens. Experimental results and theoretical analyses demonstrate LR-FFS's superior performance across diverse client environments.
arXiv Detail & Related papers (2025-05-31T04:14:49Z) - UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization [19.673388630963807]
We propose UniCBE, a unified uniformity-driven CBE framework.
On the AlpacaEval benchmark, UniCBE saves over 17% of evaluation budgets while achieving a Pearson correlation with ground truth exceeding 0.995.
In scenarios where new models are continuously introduced, UniCBE can even save over 50% of evaluation costs.
arXiv Detail & Related papers (2025-02-17T05:28:12Z) - Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for estimating value functions.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Understanding and Scaling Collaborative Filtering Optimization from the Perspective of Matrix Rank [48.02330727538905]
Collaborative Filtering (CF) methods dominate real-world recommender systems.
We study the properties of the embedding tables under different learning strategies.
We propose an efficient warm-start strategy that regularizes the stable rank of the user and item embeddings.
arXiv Detail & Related papers (2024-10-15T21:54:13Z) - Mitigating Catastrophic Forgetting in Task-Incremental Continual
Learning with Adaptive Classification Criterion [50.03041373044267]
We propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning.
Experiments show that CFL achieves state-of-the-art performance and has a stronger ability to overcome catastrophic forgetting than the classification baselines.
arXiv Detail & Related papers (2023-05-20T19:22:40Z) - Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into a finite number of clusters under the orchestration of a server.
We propose a novel FedC algorithm based on differential privacy, referred to as DP-Fed, in which partial participation across multiple clients is also considered.
Various properties of the proposed DP-Fed are established through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z) - Feature Selection via the Intervened Interpolative Decomposition and its
Application in Diversifying Quantitative Strategies [4.913248451323163]
We propose a probabilistic model for computing an interpolative decomposition (ID) in which each column of the observed matrix has its own priority or importance.
We evaluate the proposed models on real-world datasets, including ten Chinese A-share stocks.
arXiv Detail & Related papers (2022-09-29T03:36:56Z) - Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z) - Parallel feature selection based on the trace ratio criterion [4.30274561163157]
This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST).
Our method uses trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness.
The experiments show that our method can produce a small set of features in a fraction of the time required by the other methods under comparison.
arXiv Detail & Related papers (2022-03-03T10:50:33Z) - A Supervised Feature Selection Method For Mixed-Type Data using
Density-based Feature Clustering [1.3048920509133808]
This paper proposes a supervised feature selection method using density-based feature clustering (SFSDFC).
SFSDFC decomposes the feature space into a set of disjoint feature clusters using a novel density-based clustering method.
Then, an effective feature selection strategy is employed to obtain a subset of important features with minimal redundancy from those feature clusters.
arXiv Detail & Related papers (2021-11-10T15:05:15Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - New advances in enumerative biclustering algorithms with online
partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm, called RIn-Close_CVC3, keeps the attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z) - Outlier Detection Ensemble with Embedded Feature Selection [42.8338013000469]
We propose an outlier detection ensemble framework with embedded feature selection (ODEFS).
For each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation.
We adopt the thresholded self-paced learning to simultaneously optimize feature selection and example selection.
arXiv Detail & Related papers (2020-01-15T13:14:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all content) and is not responsible for any consequences arising from its use.