A Semismooth-Newton's-Method-Based Linearization and Approximation
Approach for Kernel Support Vector Machines
- URL: http://arxiv.org/abs/2007.11954v1
- Date: Tue, 21 Jul 2020 07:44:21 GMT
- Title: A Semismooth-Newton's-Method-Based Linearization and Approximation
Approach for Kernel Support Vector Machines
- Authors: Chen Jiang and Qingna Li
- Abstract summary: Support Vector Machines (SVMs) are among the most popular and the best performing classification algorithms.
In this paper, we propose a semismooth Newton's method based linearization and approximation approach for kernel SVMs.
The advantage of the proposed approach is that it maintains low computational cost and keeps a fast convergence rate.
- Score: 1.177306187948666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Support Vector Machines (SVMs) are among the most popular and the best
performing classification algorithms. Various approaches have been proposed to
reduce the high computation and memory cost when training and predicting based
on large-scale datasets with kernel SVMs. A popular one is the linearization
framework, which successfully builds a bridge between the $L_1$-loss kernel SVM
and the $L_1$-loss linear SVM. For linear SVMs, a semismooth Newton's method
has recently been proposed and shown to be very competitive with low
computational cost. Consequently, a natural question is whether it is possible
to develop a fast semismooth Newton's algorithm for kernel SVMs. Motivated by
this question and the idea in linearization framework, in this paper, we focus
on the $L_2$-loss kernel SVM and propose a semismooth Newton's method based
linearization and approximation approach for it. The main idea of this approach
is to first set up an equivalent linear SVM, then apply the Nyström method to
approximate the kernel matrix, based on which a reduced linear SVM is obtained.
Finally, the fast semismooth Newton's method is employed to solve the reduced
linear SVM. We also provide some theoretical analyses on the approximation of
the kernel matrix. The advantage of the proposed approach is that it maintains
low computational cost and keeps a fast convergence rate. Results of extensive
numerical experiments verify the efficiency of the proposed approach in terms
of both prediction accuracy and speed.
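To make the pipeline concrete, below is a minimal numerical sketch of the three steps the abstract describes: build a rank-m Nyström feature map so that Phi Phi^T approximates the kernel matrix, pose the reduced L2-loss linear SVM on those features, and solve it with a semismooth-Newton-type iteration on the piecewise-quadratic objective. This is not the authors' implementation; the RBF kernel, the landmark-sampling scheme, the plain (undamped) Newton step, and the function names (`rbf_kernel`, `nystrom_features`, `semismooth_newton_l2svm`) are illustrative assumptions.

```python
# Sketch only: Nystrom-approximate the kernel, form a reduced L2-loss linear
# SVM, and solve it with a semismooth-Newton-type iteration.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def nystrom_features(X, m, gamma=0.5, rng=None):
    """Rank-m Nystrom feature map Phi with Phi @ Phi.T ~ K."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(X.shape[0], size=m, replace=False)    # landmark points
    C = rbf_kernel(X, X[idx], gamma)                        # n x m
    W = C[idx]                                              # m x m block
    # W^{-1/2} via eigendecomposition (small shift for numerical stability)
    evals, evecs = np.linalg.eigh(W + 1e-10 * np.eye(m))
    W_isqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T
    return C @ W_isqrt                                      # n x m features

def semismooth_newton_l2svm(Phi, y, Cpar=1.0, tol=1e-6, max_iter=50):
    """L2-loss linear SVM: min_w 0.5*||w||^2 + C * sum max(0, 1 - y_i w'phi_i)^2.
    The gradient is semismooth; each step solves a generalized-Newton system
    restricted to the active set of margin violators."""
    n, d = Phi.shape
    w = np.zeros(d)
    for _ in range(max_iter):
        margins = 1.0 - y * (Phi @ w)
        active = margins > 0                                # violated margins
        grad = w - 2.0 * Cpar * Phi[active].T @ (y[active] * margins[active])
        if np.linalg.norm(grad) < tol:
            break
        # generalized Hessian: I + 2C * Phi_A' Phi_A on the active set
        H = np.eye(d) + 2.0 * Cpar * Phi[active].T @ Phi[active]
        w -= np.linalg.solve(H, grad)
    return w

# Toy usage (illustrative only): a nonlinearly separable problem
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)
Phi = nystrom_features(X, m=50, gamma=0.5, rng=0)
w = semismooth_newton_l2svm(Phi, y, Cpar=1.0)
print(f"training accuracy: {np.mean(np.sign(Phi @ w) == y):.3f}")
```

On this piecewise-quadratic objective the plain generalized-Newton step usually suffices for small examples; a robust implementation would add a line search or damping and exploit sparsity in the active-set system.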
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs)
It not only makes faster progress per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z) - Multi-class Support Vector Machine with Maximizing Minimum Margin [67.51047882637688]
Support Vector Machine (SVM) is a prominent machine learning technique widely applied in pattern recognition tasks.
We propose a novel method for multi-class SVM that incorporates pairwise class loss considerations and maximizes the minimum margin.
Empirical evaluations demonstrate the effectiveness and superiority of our proposed method over existing multi-classification methods.
arXiv Detail & Related papers (2023-12-11T18:09:55Z) - Snacks: a fast large-scale kernel SVM solver [0.8602553195689513]
Snacks is a new large-scale solver for Kernel Support Vector Machines.
Snacks relies on a Nyström approximation of the kernel matrix and an accelerated variant of the subgradient method.
arXiv Detail & Related papers (2023-04-17T04:19:20Z) - High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z) - Tensor Network Kalman Filtering for Large-Scale LS-SVMs [17.36231167296782]
Least squares support vector machines are used for nonlinear regression and classification.
A framework based on tensor networks and the Kalman filter is presented to alleviate the demanding memory and computational complexities.
Results show that our method can achieve high performance and is particularly useful when alternative methods are computationally infeasible.
arXiv Detail & Related papers (2021-10-26T08:54:03Z) - ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm, which adopts Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z) - Estimating Average Treatment Effects with Support Vector Machines [77.34726150561087]
Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature.
We adapt SVM as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups.
We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods.
arXiv Detail & Related papers (2021-02-23T20:22:56Z) - Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate to jointly optimize a common objective.
Our algorithm is based on a new concentration technique, and we analyze its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale, single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - A quantum extension of SVM-perf for training nonlinear SVMs in almost
linear time [0.2855485723554975]
We propose a quantum algorithm for training nonlinear support vector machines (SVM) for feature space learning.
Based on the classical SVM-perf algorithm of Joachims, our algorithm has a running time which scales linearly in the number of training examples.
arXiv Detail & Related papers (2020-06-18T06:25:45Z) - Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS
Algorithm [8.571896191090744]
We show that the Biweight kernel is optimal in the sense of minimizing the mean squared error of the resulting modal linear regression (MLR) parameter estimate.
Second, we provide a kernel class for which the iteratively reweighted least squares (IRLS) algorithm is guaranteed to converge; a minimal illustrative sketch of this IRLS idea appears after this list.
arXiv Detail & Related papers (2020-01-30T03:57:07Z)
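As referenced in the last entry above, here is a minimal illustrative sketch of IRLS for modal linear regression with a Biweight kernel. The weight formula w_i proportional to max(0, 1 - (r_i/h)^2), the bandwidth choice, and the toy data are assumptions made for illustration, not the paper's exact algorithm or kernel class.

```python
# Sketch only: modal linear regression maximizes sum_i K((y_i - x_i' beta)/h);
# with a Biweight kernel, the stationarity condition yields an IRLS scheme
# whose weights vanish for residuals larger than the bandwidth h.
import numpy as np

def biweight_irls(X, y, h=1.0, n_iter=100, tol=1e-8):
    """Iteratively reweighted least squares for modal linear regression."""
    n, d = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS initialization
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.maximum(0.0, 1.0 - (r / h) ** 2)    # Biweight-derived weights
        if w.sum() < 1e-12:                        # bandwidth too small
            break
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(Xw.T @ X + 1e-12 * np.eye(d), Xw.T @ y)
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy usage: skewed noise whose conditional mode differs from its mean
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(400), rng.uniform(-2, 2, 400)])
noise = np.where(rng.random(400) < 0.8,
                 rng.normal(0.0, 0.2, 400),        # main mode at 0
                 rng.normal(3.0, 0.2, 400))        # outlying component
y = X @ np.array([1.0, 2.0]) + noise
print("modal fit:", biweight_irls(X, y, h=1.0))    # should land near [1, 2]
```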