Kernel Support Vector Machine Classifiers with the $\ell_0$-Norm Hinge
Loss
- URL: http://arxiv.org/abs/2306.13991v1
- Date: Sat, 24 Jun 2023 14:52:44 GMT
- Title: Kernel Support Vector Machine Classifiers with the $\ell_0$-Norm Hinge
Loss
- Authors: Rongrong Lin, Yingjia Yao, Yulan Liu
- Abstract summary: Support Vector Machine (SVM) has been one of the most successful machine learning techniques for binary classification problems.
This paper concentrates on the kernel SVM with the $\ell_0$-norm hinge loss (referred to as $\ell_0$-KSVM), which is a composite function of the hinge loss and the $\ell_0$-norm.
Experiments on synthetic and real datasets show that $\ell_0$-KSVM achieves accuracy comparable to the standard KSVM.
- Score: 3.007949058551534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Support Vector Machine (SVM) has been one of the most successful machine
learning techniques for binary classification problems. The key idea is to
maximize the margin from the data to the hyperplane subject to correct
classification on training samples. The commonly used hinge loss and its
variations are sensitive to label noise and unstable under resampling because of
their unboundedness. This paper concentrates on the kernel SVM with the
$\ell_0$-norm hinge loss (referred to as $\ell_0$-KSVM), a composite of the
hinge loss and the $\ell_0$-norm that overcomes the difficulties mentioned
above. In view of the nonconvexity and
nonsmoothness of $\ell_0$-norm hinge loss, we first characterize the limiting
subdifferential of the $\ell_0$-norm hinge loss and then establish the
equivalence among the proximal stationary point, the Karush-Kuhn-Tucker point,
and the locally optimal solution of $\ell_0$-KSVM. Secondly, we develop an ADMM
algorithm for $\ell_0$-KSVM and show that any limit point of the sequence
generated by the proposed algorithm is a locally optimal solution. Lastly,
experiments on synthetic and real datasets show that $\ell_0$-KSVM achieves
accuracy comparable to the standard KSVM while generally requiring fewer
support vectors.
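For readers who want the composition made explicit, the following is a minimal sketch of the model and of the key proximal computation, assuming the standard kernel expansion $f(x)=\sum_{j}\alpha_j K(x_j,x)+b$ on training samples $(x_i,y_i)$ with $y_i\in\{-1,+1\}$; the penalty parameter $C$, the proximal parameter $\gamma$, and the exact form of the regularizer are illustrative assumptions, not notation taken from the paper.
$$
\min_{\alpha,\,b}\ \frac{1}{2}\,\alpha^{\top} K \alpha \;+\; C\,\bigl\|\bigl(\mathbf{1}-D(K\alpha+b\mathbf{1})\bigr)_{+}\bigr\|_0,
\qquad D=\mathrm{diag}(y_1,\dots,y_n),
$$
where $(u)_{+}=\max\{0,u\}$ componentwise, so $\|(u)_{+}\|_0$ simply counts the samples with margin violations $1-y_i f(x_i)>0$ and is therefore bounded, unlike the plain hinge loss. A direct scalar computation gives the proximal operator of the loss $t\mapsto\gamma\,\mathbf{1}[t>0]$ (the $\ell_0$-norm of the positive part):
$$
\mathrm{prox}_{\gamma}(a)=
\begin{cases}
a, & a\le 0 \ \text{or}\ a>\sqrt{2\gamma},\\
0, & 0<a\le\sqrt{2\gamma},
\end{cases}
$$
a hard-thresholding step that zeroes most violation variables. This is one way to see both why an ADMM-style splitting is natural for $\ell_0$-KSVM and why the resulting classifiers tend to use few support vectors; whether the paper uses exactly this thresholding form is an assumption here.

To make the experimental comparison concrete, a minimal baseline sketch in scikit-learn is given below: it trains the standard kernel SVM that the abstract compares against and reports its accuracy and number of support vectors. The synthetic dataset, RBF kernel, and hyperparameters are illustrative choices, and the $\ell_0$-KSVM side would require the authors' ADMM solver, which is not reproduced here.
```python
# Baseline sketch: a standard kernel SVM (the comparison point in the abstract),
# reporting test accuracy and the number of support vectors it retains.
# The data, kernel, and hyperparameters are illustrative, not the paper's setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           flip_y=0.1,  # label noise, the setting l0-KSVM targets
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ksvm = SVC(kernel="rbf", C=1.0, gamma="scale")  # standard KSVM baseline
ksvm.fit(X_tr, y_tr)

print("test accuracy:", round(ksvm.score(X_te, y_te), 3))
print("support vectors:", ksvm.n_support_.sum())  # l0-KSVM aims to need fewer
```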
Related papers
- $p$SVM: Soft-margin SVMs with $p$-norm Hinge Loss [0.0]
Support Vector Machines (SVMs) based on hinge loss have been extensively discussed and applied to various binary classification tasks.
In this paper, we explore the properties, performance, and training algorithms of $p$SVMs.
arXiv Detail & Related papers (2024-08-19T11:30:00Z) - A Novel Loss Function-based Support Vector Machine for Binary Classification [3.773980481058198]
We propose a novel Slide loss function ($\ell_s$) to construct the support vector machine classifier ($\ell_s$-SVM).
By introducing the concept of the proximal stationary point and utilizing the property of Lipschitz continuity, we derive the first-order optimality conditions for $\ell_s$-SVM.
To handle $\ell_s$-SVM efficiently, we devise a fast alternating direction method of multipliers with a working set.
arXiv Detail & Related papers (2024-03-25T11:42:01Z) - Transformers as Support Vector Machines [54.642793677472724]
We establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem.
We characterize the implicit bias of 1-layer transformers optimized with gradient descent.
We believe these findings inspire the interpretation of transformers as a hierarchy of SVMs that separates and selects optimal tokens.
arXiv Detail & Related papers (2023-08-31T17:57:50Z) - High-Dimensional Penalized Bernstein Support Vector Machines [0.0]
Non-differentiability of the SVM hinge loss function can lead to computational difficulties in high dimensional settings.
We propose two efficient algorithms for computing the solution of the penalized BernSVM.
Our bound holds with high probability and achieves a rate of order $\sqrt{s\log(p)/n}$, where $s$ is the number of active features.
arXiv Detail & Related papers (2023-03-16T03:48:29Z) - Handling Imbalanced Classification Problems With Support Vector Machines
via Evolutionary Bilevel Optimization [73.17488635491262]
Support vector machines (SVMs) are popular learning algorithms to deal with binary classification problems.
This article introduces EBCS-SVM: evolutionary bilevel cost-sensitive SVMs.
arXiv Detail & Related papers (2022-04-21T16:08:44Z) - Nonlinear Kernel Support Vector Machine with 0-1 Soft Margin Loss [13.803988813225025]
We propose a nonlinear model for the support vector machine with the 0-1 soft margin loss, called $L_{0/1}$-KSVM, which incorporates the kernel technique.
$L_{0/1}$-KSVM has far fewer support vectors, together with decent prediction accuracy, compared with its linear peer.
arXiv Detail & Related papers (2022-03-01T12:53:52Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We study methods based on reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Consistent Structured Prediction with Max-Min Margin Markov Networks [84.60515484036239]
Max-margin methods for binary classification have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$).
We overcome the consistency limitations of $M^3N$ by defining the learning problem in terms of a "max-min" margin formulation, naming the resulting method max-min margin Markov networks ($M^4N$).
Experiments on multi-class classification, ordinal regression, sequence prediction and ranking demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-07-02T10:48:42Z) - Unified SVM Algorithm Based on LS-DC Loss [0.0]
We propose a unified algorithm, UniSVM, that can train different SVM models.
UniSVM has a dominant advantage over all existing algorithms because it has a closed-form solution.
Experiments show that UniSVM can achieve comparable performance in less training time.
arXiv Detail & Related papers (2020-06-16T12:40:06Z) - Non-Convex SGD Learns Halfspaces with Adversarial Label Noise [50.659479930171585]
We study the problem of learning homogeneous halfspaces with adversarial label noise in the distribution-specific model.
For a broad family of structured distributions, we show that non-convex SGD efficiently converges to a halfspace whose misclassification error is competitive with that of the best halfspace.
arXiv Detail & Related papers (2020-06-11T18:55:59Z) - Kernel-Based Reinforcement Learning: A Finite-Time Analysis [53.47210316424326]
We introduce Kernel-UCBVI, a model-based optimistic algorithm that leverages the smoothness of the MDP and a non-parametric kernel estimator of the rewards.
We empirically validate our approach in continuous MDPs with sparse rewards.
arXiv Detail & Related papers (2020-04-12T12:23:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.