Loss-guided Stability Selection
- URL: http://arxiv.org/abs/2202.04956v1
- Date: Thu, 10 Feb 2022 11:20:25 GMT
- Title: Loss-guided Stability Selection
- Authors: Tino Werner
- Abstract summary: It is well-known that model selection procedures like the Lasso or Boosting tend to overfit on real data.
Standard Stability Selection is based on a global criterion, namely the per-family error rate.
We propose a Stability Selection variant which respects the chosen loss function via an additional validation step.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern data analysis, sparse model selection becomes inevitable once the
number of predictor variables is very high. It is well-known that model
selection procedures like the Lasso or Boosting tend to overfit on real data.
The celebrated Stability Selection overcomes these weaknesses by aggregating
models, based on subsamples of the training data, followed by choosing a stable
predictor set which is usually much sparser than the predictor sets from the
raw models. The standard Stability Selection is based on a global criterion,
namely the per-family error rate, while additionally requiring expert knowledge
to suitably configure the hyperparameters. Since model selection depends on the
loss function, i.e., predictor sets selected w.r.t. some particular loss
function differ from those selected w.r.t. some other loss function, we propose
a Stability Selection variant which respects the chosen loss function via an
additional validation step based on out-of-sample validation data, optionally
enhanced with an exhaustive search strategy. Our Stability Selection variants
are widely applicable and user-friendly. Moreover, they
can avoid the issue of severe underfitting which affects the original
Stability Selection for noisy high-dimensional data, so our priority is not to
avoid false positives at all costs but to deliver a sparse, stable model with
which one can make predictions. Experiments covering both regression
and binary classification, with Boosting as the model selection
algorithm, reveal a significant precision improvement compared to raw Boosting
models while not suffering from any of the mentioned issues of the original
Stability Selection.
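To make the procedure concrete, here is a minimal sketch of the loss-guided idea: standard subsample-based selection frequencies, followed by an out-of-sample validation step that picks the stable set according to the chosen loss. This is our own illustration under assumptions, not the paper's implementation: the Lasso stands in for the Boosting selector, squared loss for the chosen loss function, and the names selection_frequencies and loss_guided_stable_set are hypothetical.

```python
# Hypothetical sketch of loss-guided Stability Selection; the Lasso is a
# stand-in for the paper's Boosting selector, squared loss for the chosen loss.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error

def selection_frequencies(X, y, n_subsamples=100, alpha=0.1, seed=0):
    """Fraction of subsamples on which each predictor enters the model."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)  # half-sample
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += (coef != 0)
    return counts / n_subsamples

def loss_guided_stable_set(X_tr, y_tr, X_val, y_val,
                           thresholds=(0.6, 0.7, 0.8, 0.9)):
    """Among candidate stability thresholds, keep the stable predictor set
    whose refitted model minimizes the chosen loss on validation data."""
    freq = selection_frequencies(X_tr, y_tr)
    best_loss, best_set = np.inf, None
    for pi in thresholds:
        stable = np.flatnonzero(freq >= pi)
        if stable.size == 0:
            continue  # an empty set would underfit severely; skip it
        model = LinearRegression().fit(X_tr[:, stable], y_tr)
        loss = mean_squared_error(y_val, model.predict(X_val[:, stable]))
        if loss < best_loss:
            best_loss, best_set = loss, stable
    return best_set, best_loss
```

The validation step is what ties the selection to the loss function: instead of configuring the threshold purely via the per-family error rate, each candidate stable set is refitted and scored out of sample with the same loss one cares about at prediction time.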
Related papers
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs for many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
- Uncertainty-aware Language Modeling for Selective Question Answering [107.47864420630923]
We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs.
Our approach is model- and data-agnostic, is computationally-efficient, and does not rely on external models or systems.
arXiv Detail & Related papers (2023-11-26T22:47:54Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts [119.22672589020394]
We propose COnfidence-baSed MOdel Selection (CosMoS), where model confidence can effectively guide model selection.
We evaluate CosMoS on four datasets with spurious correlations, each with multiple test sets with varying levels of data distribution shift.
arXiv Detail & Related papers (2023-06-19T18:48:15Z)
- Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor Problem [8.281391209717105]
We study the feature-based newsvendor problem, in which a decision-maker has access to historical data.
In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance.
We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers.
arXiv Detail & Related papers (2022-09-12T08:52:26Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Cluster Stability Selection [2.3986080077861787]
Stability selection makes any feature selection method more stable by returning only those features that are consistently selected across many subsamples.
We introduce cluster stability selection, which exploits the practitioner's knowledge that highly correlated clusters exist in the data.
In summary, cluster stability selection enjoys the best of both worlds, yielding a sparse selected set that is both stable and has good predictive performance.
arXiv Detail & Related papers (2022-01-03T06:28:17Z)
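The subsample-frequency aggregation described in the Cluster Stability Selection summary above can be sketched as follows. This is our own hypothetical illustration of the idea, not code from the cited paper; the Lasso is an arbitrary stand-in for the underlying selection method.

```python
# Our own hypothetical illustration of cluster-level selection frequencies,
# based on the summary above; not code from the cited paper.
import numpy as np
from sklearn.linear_model import Lasso

def cluster_selection_frequencies(X, y, clusters, n_subsamples=100,
                                  alpha=0.1, seed=0):
    """clusters: list of index arrays of highly correlated predictors.
    A cluster counts as selected on a subsample as soon as at least
    one of its members receives a nonzero coefficient."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = np.zeros(len(clusters))
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)  # half-sample
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += [np.any(coef[m] != 0) for m in clusters]
    return counts / n_subsamples  # threshold these to get stable clusters
```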
- Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features [0.1127980896956825]
We show that our approach achieves the same or better predictive performance compared to the two established approaches.
Our approach succeeds at selecting the relevant features while avoiding irrelevant or redundant features.
For data sets with many similar features, the feature selection stability must be evaluated with an adjusted stability measure.
arXiv Detail & Related papers (2021-06-15T12:48:07Z)
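For intuition, a common unadjusted stability measure is the mean pairwise Jaccard similarity of the selected feature sets across resampling repetitions; the adjusted measures referred to above additionally credit the exchange of similar features instead of demanding exact matches. The sketch below shows only the unadjusted baseline and is our own illustration:

```python
# Our own illustration of a plain (unadjusted) stability measure: the mean
# pairwise Jaccard similarity of the feature sets selected across
# resampling repetitions.
from itertools import combinations

def jaccard_stability(selected_sets):
    """selected_sets: one set of selected feature indices per repetition."""
    sims = [len(a & b) / len(a | b)
            for a, b in combinations(selected_sets, 2) if a | b]
    return sum(sims) / len(sims) if sims else 1.0
```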
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most informative features that minimize the variance without increasing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Leveraging Model Inherent Variable Importance for Stable Online Feature Selection [16.396739487911056]
We introduce FIRES, a novel framework for online feature selection.
Our framework is generic in that it leaves the choice of the underlying model to the user.
Experiments show that the proposed framework is clearly superior in terms of feature selection stability.
arXiv Detail & Related papers (2020-06-18T10:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.