A Semismooth-Newton's-Method-Based Linearization and Approximation
Approach for Kernel Support Vector Machines
- URL: http://arxiv.org/abs/2007.11954v1
- Date: Tue, 21 Jul 2020 07:44:21 GMT
- Title: A Semismooth-Newton's-Method-Based Linearization and Approximation
Approach for Kernel Support Vector Machines
- Authors: Chen Jiang and Qingna Li
- Abstract summary: Support Vector Machines (SVMs) are among the most popular and the best performing classification algorithms.
In this paper, we propose a semismooth Newton's method based linearization and approximation approach for kernel SVMs.
The advantage of the proposed approach is that it maintains low computational cost and keeps a fast convergence rate.
- Score: 1.177306187948666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Support Vector Machines (SVMs) are among the most popular and the best
performing classification algorithms. Various approaches have been proposed to
reduce the high computation and memory cost when training and predicting based
on large-scale datasets with kernel SVMs. A popular one is the linearization
framework, which successfully builds a bridge between the $L_1$-loss kernel SVM
and the $L_1$-loss linear SVM. For linear SVMs, a semismooth Newton's method
has recently been proposed and shown to be very competitive with low
computational cost. Consequently, a natural question is whether it is possible
to develop a fast semismooth Newton's algorithm for kernel SVMs. Motivated by
this question and the idea in linearization framework, in this paper, we focus
on the $L_2$-loss kernel SVM and propose a semismooth Newton's method based
linearization and approximation approach for it. The main idea of this approach
is to first set up an equivalent linear SVM, then apply the Nyström method to
approximate the kernel matrix, based on which a reduced linear SVM is obtained.
Finally, the fast semismooth Newton's method is employed to solve the reduced
linear SVM. We also provide some theoretical analyses on the approximation of
the kernel matrix. The advantage of the proposed approach is that it maintains
low computational cost and keeps a fast convergence rate. Results of extensive
numerical experiments verify the efficiency of the proposed approach in terms
of both prediction accuracy and speed.
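To make the pipeline concrete, below is a minimal numerical sketch of the three steps the abstract describes: build a rank-m Nyström feature map so that Phi Phi^T approximates the kernel matrix, pose the reduced L2-loss linear SVM on those features, and solve it with a semismooth-Newton-type iteration on the piecewise-quadratic objective. This is not the authors' implementation; the RBF kernel, the landmark-sampling scheme, the plain (undamped) Newton step, and the function names (`rbf_kernel`, `nystrom_features`, `semismooth_newton_l2svm`) are illustrative assumptions.

```python
# Sketch only: Nystrom-approximate the kernel, form a reduced L2-loss linear
# SVM, and solve it with a semismooth-Newton-type iteration.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def nystrom_features(X, m, gamma=0.5, rng=None):
    """Rank-m Nystrom feature map Phi with Phi @ Phi.T ~ K."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(X.shape[0], size=m, replace=False)    # landmark points
    C = rbf_kernel(X, X[idx], gamma)                        # n x m
    W = C[idx]                                              # m x m block
    # W^{-1/2} via eigendecomposition (small shift for numerical stability)
    evals, evecs = np.linalg.eigh(W + 1e-10 * np.eye(m))
    W_isqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T
    return C @ W_isqrt                                      # n x m features

def semismooth_newton_l2svm(Phi, y, Cpar=1.0, tol=1e-6, max_iter=50):
    """L2-loss linear SVM: min_w 0.5*||w||^2 + C * sum max(0, 1 - y_i w'phi_i)^2.
    The gradient is semismooth; each step solves a generalized-Newton system
    restricted to the active set of margin violators."""
    n, d = Phi.shape
    w = np.zeros(d)
    for _ in range(max_iter):
        margins = 1.0 - y * (Phi @ w)
        active = margins > 0                                # violated margins
        grad = w - 2.0 * Cpar * Phi[active].T @ (y[active] * margins[active])
        if np.linalg.norm(grad) < tol:
            break
        # generalized Hessian: I + 2C * Phi_A' Phi_A on the active set
        H = np.eye(d) + 2.0 * Cpar * Phi[active].T @ Phi[active]
        w -= np.linalg.solve(H, grad)
    return w

# Toy usage (illustrative only): a nonlinearly separable problem
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)
Phi = nystrom_features(X, m=50, gamma=0.5, rng=0)
w = semismooth_newton_l2svm(Phi, y, Cpar=1.0)
print(f"training accuracy: {np.mean(np.sign(Phi @ w) == y):.3f}")
```

On this piecewise-quadratic objective the plain generalized-Newton step usually suffices for small examples; a robust implementation would add a line search or damping and exploit sparsity in the active-set system.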
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs)
It not only makes faster progress per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z) - Multi-class Support Vector Machine with Maximizing Minimum Margin [67.51047882637688]
Support Vector Machine (SVM) is a prominent machine learning technique widely applied in pattern recognition tasks.
We propose a novel method for multi-class SVM that incorporates pairwise class loss considerations and maximizes the minimum margin.
Empirical evaluations demonstrate the effectiveness and superiority of our proposed method over existing multi-classification methods.
arXiv Detail & Related papers (2023-12-11T18:09:55Z) - Snacks: a fast large-scale kernel SVM solver [0.8602553195689513]
Snacks is a new large-scale solver for Kernel Support Vector Machines.
Snacks relies on a Nyström approximation of the kernel matrix and an accelerated variant of the subgradient method.
arXiv Detail & Related papers (2023-04-17T04:19:20Z) - High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z) - Tensor Network Kalman Filtering for Large-Scale LS-SVMs [17.36231167296782]
Least squares support vector machines are used for nonlinear regression and classification.
A framework based on tensor networks and the Kalman filter is presented to alleviate the demanding memory and computational complexities.
Results show that our method can achieve high performance and is particularly useful when alternative methods are computationally infeasible.
arXiv Detail & Related papers (2021-10-26T08:54:03Z) - ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z) - Estimating Average Treatment Effects with Support Vector Machines [77.34726150561087]
Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature.
We adapt SVM as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups.
We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods.
arXiv Detail & Related papers (2021-02-23T20:22:56Z) - Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate to jointly optimize a common objective.
Our algorithm is based on a new concentration technique, and we analyze its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale, single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - A quantum extension of SVM-perf for training nonlinear SVMs in almost
linear time [0.2855485723554975]
We propose a quantum algorithm for training nonlinear support vector machines (SVM) for feature space learning.
Based on the classical SVM-perf algorithm of Joachims, our algorithm has a running time which scales linearly in the number of training examples.
arXiv Detail & Related papers (2020-06-18T06:25:45Z) - Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS
Algorithm [8.571896191090744]
We show that the Biweight kernel is optimal in the sense of minimizing the mean squared error of the resulting modal linear regression (MLR) parameter estimate.
Second, we provide a kernel class for which the iteratively reweighted least squares (IRLS) algorithm is guaranteed to converge; a minimal illustrative sketch of this IRLS idea appears after this list.
arXiv Detail & Related papers (2020-01-30T03:57:07Z)
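As referenced in the last entry above, here is a minimal illustrative sketch of IRLS for modal linear regression with a Biweight kernel. The weight formula w_i proportional to max(0, 1 - (r_i/h)^2), the bandwidth choice, and the toy data are assumptions made for illustration, not the paper's exact algorithm or kernel class.

```python
# Sketch only: modal linear regression maximizes sum_i K((y_i - x_i' beta)/h);
# with a Biweight kernel, the stationarity condition yields an IRLS scheme
# whose weights vanish for residuals larger than the bandwidth h.
import numpy as np

def biweight_irls(X, y, h=1.0, n_iter=100, tol=1e-8):
    """Iteratively reweighted least squares for modal linear regression."""
    n, d = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS initialization
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.maximum(0.0, 1.0 - (r / h) ** 2)    # Biweight-derived weights
        if w.sum() < 1e-12:                        # bandwidth too small
            break
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(Xw.T @ X + 1e-12 * np.eye(d), Xw.T @ y)
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy usage: skewed noise whose conditional mode differs from its mean
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(400), rng.uniform(-2, 2, 400)])
noise = np.where(rng.random(400) < 0.8,
                 rng.normal(0.0, 0.2, 400),        # main mode at 0
                 rng.normal(3.0, 0.2, 400))        # outlying component
y = X @ np.array([1.0, 2.0]) + noise
print("modal fit:", biweight_irls(X, y, h=1.0))    # should land near [1, 2]
```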