Cost-sensitive Feature Selection for Support Vector Machines
- URL: http://arxiv.org/abs/2401.07627v1
- Date: Mon, 15 Jan 2024 12:07:52 GMT
- Title: Cost-sensitive Feature Selection for Support Vector Machines
- Authors: Sandra Benítez-Peña, Rafael Blanquero, Emilio Carrizosa and
Pepa Ramírez-Cobo
- Abstract summary: We propose a mathematical-optimization-based Feature Selection procedure embedded in one of the most popular classification procedures, Support Vector Machines.
We show that a substantial decrease of the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved.
- Score: 1.743685428161914
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Feature Selection is a crucial procedure in Data Science tasks such as
Classification, since it identifies the relevant variables, thus making the
classification procedures more interpretable, cheaper in terms of measurement
and more effective by reducing noise and data overfitting. The relevance of
features in a classification procedure is linked to the fact that
misclassification costs are frequently asymmetric, since false positive and
false negative cases may have very different consequences. However,
off-the-shelf Feature Selection procedures seldom take such cost-sensitivity
of errors into account.
In this paper we propose a mathematical-optimization-based Feature Selection
procedure embedded in one of the most popular classification procedures,
namely, Support Vector Machines, accommodating asymmetric misclassification
costs. The key idea is to replace the traditional margin maximization with
the minimization of the number of selected features, subject to upper bounds
on the false positive and false negative rates. The problem is written as an
integer linear program plus a convex quadratic problem, for Support Vector
Machines with both linear and radial kernels.
The reported numerical experience demonstrates the usefulness of the proposed
Feature Selection procedure. Indeed, our results on benchmark data sets show
that a substantial decrease of the number of features is obtained, whilst the
desired trade-off between false positive and false negative rates is achieved.
Related papers
- Ask for More Than Bayes Optimal: A Theory of Indecisions for Classification [1.8434042562191815]
Selective classification is a powerful tool for automated decision-making in high-risk scenarios.
Our goal is to minimize the number of indecisions, which are observations that we do not automate.
By using indecisions, we are able to control the misclassification rate to any user-specified level, even below the Bayes optimal error rate.
arXiv Detail & Related papers (2024-12-17T11:25:51Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.
We present around 55 distinct features extracted from industrial images, which are then analyzed using statistical methods.
By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Implicit Regularization for Multi-label Feature Selection [1.5771347525430772]
We address the problem of feature selection in the context of multi-label learning by using a new estimator based on implicit regularization and label embedding.
Experimental results on some known benchmark datasets suggest that the proposed estimator suffers much less from extra bias, and may lead to benign overfitting.
arXiv Detail & Related papers (2024-11-18T10:08:05Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Nonparametric active learning for cost-sensitive classification [2.1756081703276]
We design a generic nonparametric active learning algorithm for cost-sensitive classification.
We prove the near-optimality of the obtained upper bounds by providing matching (up to a logarithmic factor) lower bounds.
arXiv Detail & Related papers (2023-09-30T22:19:21Z) - Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor
Problem [8.281391209717105]
We study the feature-based newsvendor problem, in which a decision-maker has access to historical data.
In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance.
We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers.
arXiv Detail & Related papers (2022-09-12T08:52:26Z) - Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-k ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z) - Determination of class-specific variables in nonparametric
multiple-class classification [0.0]
We propose a probability-based nonparametric multiple-class classification method, and integrate it with the ability to identify high-impact variables for individual classes.
We report the properties of the proposed method, and use both synthesized and real data sets to illustrate its properties under different classification situations.
arXiv Detail & Related papers (2022-05-07T10:08:58Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep
Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - Gradient Descent in RKHS with Importance Labeling [58.79085525115987]
We study the importance labeling problem, in which we are given a large amount of unlabeled data.
We propose a new importance labeling scheme that can effectively select an informative subset of unlabeled data.
arXiv Detail & Related papers (2020-06-19T01:55:00Z) - A novel embedded min-max approach for feature selection in nonlinear
support vector machine classification [0.0]
We propose an embedded feature selection method based on a min-max optimization problem.
By leveraging duality theory, we equivalently reformulate the min-max problem and solve it without further ado.
The efficiency and usefulness of our approach are tested on several benchmark data sets.
arXiv Detail & Related papers (2020-04-21T09:40:38Z) - Implicit differentiation of Lasso-type models for hyperparameter
optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
arXiv Detail & Related papers (2020-02-20T18:43:42Z) - Supervised Quantile Normalization for Low-rank Matrix Approximation [50.445371939523305]
We learn the parameters of quantile normalization operators that can operate row-wise on the values of $X$ and/or of its factorization $UV$ to improve the quality of the low-rank representation of $X$ itself.
We demonstrate the applicability of these techniques on synthetic and genomics datasets.
arXiv Detail & Related papers (2020-02-08T21:06:02Z) - Naive Feature Selection: a Nearly Tight Convex Relaxation for Sparse Naive Bayes [51.55826927508311]
We propose a sparse version of naive Bayes, which can be used for feature selection.
We prove that our convex relaxation bound becomes tight as the marginal contribution of additional features decreases.
Both binary and multinomial sparse models are solvable in time almost linear in problem size.
arXiv Detail & Related papers (2019-05-23T19:30:51Z)