On support vector machines under a multiple-cost scenario
- URL: http://arxiv.org/abs/2312.14795v1
- Date: Fri, 22 Dec 2023 16:12:25 GMT
- Title: On support vector machines under a multiple-cost scenario
- Authors: Sandra Benítez-Peña, Rafael Blanquero, Emilio Carrizosa and
Pepa Ramírez-Cobo
- Abstract summary: Support Vector Machine (SVM) is a powerful tool in binary classification.
We propose a novel SVM model in which misclassification costs are considered by incorporating performance constraints.
- Score: 1.743685428161914
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Support Vector Machine (SVM) is a powerful tool in binary classification,
known to attain excellent misclassification rates. On the other hand, many
real-world classification problems, such as those found in medical diagnosis,
churn or fraud prediction, involve misclassification costs which may be
different in the different classes. However, it may be hard for the user to
provide precise values for such misclassification costs, whereas it may be much
easier to identify acceptable misclassification rates. In this paper we
propose a novel SVM model in which misclassification costs are considered by
incorporating performance constraints in the problem formulation. Specifically,
our aim is to seek the hyperplane with maximal margin yielding
misclassification rates below given threshold values. Such a maximal-margin
hyperplane is obtained by solving a convex quadratic problem with linear
constraints and integer variables. The reported numerical experience shows that
our model gives the user control on the misclassification rates in one class
(possibly at the expense of an increase in misclassification rates for the
other class) and is feasible in terms of running times.
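The abstract's idea can be illustrated with a heuristic sketch: rather than the paper's exact mixed-integer quadratic formulation, the snippet below trains a linear soft-margin SVM by subgradient descent with a per-class hinge-loss weight, and exposes the per-class misclassification rates so the weight can be tuned until the rate of the class of interest drops below a chosen threshold. All names (`fit_linear_svm`, `class_error_rates`) and the optimization scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_linear_svm(X, y, cost_pos=1.0, cost_neg=1.0, lam=0.01,
                   lr=0.1, epochs=200):
    """Minimize lam*||w||^2 + mean(c_i * hinge_i) by subgradient descent.

    y must be in {-1, +1}; cost_pos / cost_neg weight the hinge loss of
    the positive / negative class, mimicking asymmetric misclassification
    costs.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    costs = np.where(y > 0, cost_pos, cost_neg)
    for t in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0  # points violating the margin contribute
        grad_w = 2 * lam * w - (costs[active, None] * y[active, None]
                                * X[active]).sum(axis=0) / n
        grad_b = -(costs[active] * y[active]).sum() / n
        step = lr / (1 + 0.01 * t)  # decaying step size
        w -= step * grad_w
        b -= step * grad_b
    return w, b

def class_error_rates(X, y, w, b):
    """Per-class misclassification rates, to compare against thresholds."""
    pred = np.sign(X @ w + b)
    err_pos = float(np.mean(pred[y > 0] != 1))
    err_neg = float(np.mean(pred[y < 0] != -1))
    return err_pos, err_neg
```

In this heuristic, one would increase `cost_pos` (or `cost_neg`) and refit until the corresponding class error falls below the desired threshold; the paper instead encodes such rate constraints directly, via integer variables, in a single optimization problem.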
Related papers
- Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models [0.0]
Confidence calibration is an emerging challenge in real-world decision systems based on foundation models.
We propose a penalty incorporated into loss objective that penalizes incorrect classifications whenever one is made during finetuning.
We refer to it as the *confidence misalignment penalty* (CMP).
arXiv Detail & Related papers (2025-01-29T11:54:37Z) - Classification Error Bound for Low Bayes Error Conditions in Machine Learning [50.25063912757367]
We study the relationship between the error mismatch and the Kullback-Leibler divergence in machine learning.
Motivated by recent observations of low model-based classification errors in many machine learning tasks, we propose a linear approximation of the classification error bound for low Bayes error conditions.
arXiv Detail & Related papers (2025-01-27T11:57:21Z) - Wasserstein Distributionally Robust Multiclass Support Vector Machine [1.8570591025615457]
We study the problem of multiclass classification for settings where data features $\mathbf{x}$ and their labels $\mathbf{y}$ are uncertain.
We use Wasserstein distributionally robust optimization to develop a robust version of the multiclass support vector machine (SVM) characterized by the Crammer-Singer (CS) loss.
Our numerical experiments demonstrate that our model outperforms state-of-the-art OVA models in settings where the training data is highly imbalanced.
arXiv Detail & Related papers (2024-09-12T21:40:04Z) - Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Achieving optimal average accuracy can come at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z) - Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate, numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z) - Online Selective Classification with Limited Feedback [82.68009460301585]
We study selective classification in the online learning model, wherein a predictor may abstain from classifying an instance.
Two salient aspects of the setting we consider are that the data may be non-realisable, due to which abstention may be a valid long-term action.
We construct simple versioning-based schemes for any $\mu \in (0,1]$ that make at most $T^\mu$ mistakes while incurring $\tilde{O}(T^{1-\mu})$ excess abstention against adaptive adversaries.
arXiv Detail & Related papers (2021-10-27T08:00:53Z) - Risk Bounds for Over-parameterized Maximum Margin Classification on
Sub-Gaussian Mixtures [100.55816326422773]
We study the phenomenon of the maximum margin classifier for linear classification problems.
Our results precisely characterize the condition under which benign overfitting can occur.
arXiv Detail & Related papers (2021-04-28T08:25:16Z) - A Minimax Probability Machine for Non-Decomposable Performance Measures [15.288802707471792]
Imbalanced classification tasks are widespread in many real-world applications.
The minimax probability machine is a popular method for binary classification problems.
This paper develops a new minimax probability machine for the $F_\beta$ measure, called MPMF, which can be used to deal with imbalanced classification tasks.
arXiv Detail & Related papers (2021-02-28T04:58:46Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modern precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - High-Dimensional Quadratic Discriminant Analysis under Spiked Covariance
Model [101.74172837046382]
We propose a novel quadratic classification technique, the parameters of which are chosen such that the Fisher discriminant ratio is maximized.
Numerical simulations show that the proposed classifier not only outperforms the classical R-QDA for both synthetic and real data but also requires lower computational complexity.
arXiv Detail & Related papers (2020-06-25T12:00:26Z) - Angle-Based Cost-Sensitive Multicategory Classification [34.174072286426885]
We propose a novel angle-based cost-sensitive classification framework for multicategory classification without the sum-to-zero constraint.
To show the usefulness of the framework, two cost-sensitive multicategory boosting algorithms are derived as concrete instances.
arXiv Detail & Related papers (2020-03-08T00:42:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.