A Preconditioned Interior Point Method for Support Vector Machines Using
an ANOVA-Decomposition and NFFT-Based Matrix-Vector Products
- URL: http://arxiv.org/abs/2312.00538v1
- Date: Fri, 1 Dec 2023 12:27:11 GMT
- Title: A Preconditioned Interior Point Method for Support Vector Machines Using
an ANOVA-Decomposition and NFFT-Based Matrix-Vector Products
- Authors: Theresa Wagner, John W. Pearson, Martin Stoll
- Abstract summary: We propose an NFFT-accelerated matrix-vector product using an ANOVA decomposition for the feature space that is used within an interior point method for the overall optimization problem.
We investigate the performance of the different preconditioners as well as the accuracy of the ANOVA kernel on several large-scale datasets.
- Score: 0.6445605125467574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we consider the numerical solution to the soft-margin support
vector machine optimization problem. This problem is typically solved using the
SMO algorithm, given the high computational complexity of traditional
optimization algorithms when dealing with large-scale kernel matrices. In this
work, we propose employing an NFFT-accelerated matrix-vector product using an
ANOVA decomposition for the feature space that is used within an interior point
method for the overall optimization problem. As this method requires the
solution of a linear system of saddle point form, we suggest a preconditioning
approach that is based on low-rank approximations of the kernel matrix together
with a Krylov subspace solver. We compare the accuracy of the ANOVA-based
kernel with the default LIBSVM implementation. We investigate the performance
of the different preconditioners as well as the accuracy of the ANOVA kernel on
several large-scale datasets.
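
To make the abstract's ingredients concrete, here is a minimal NumPy/SciPy sketch, not the paper's code: an ANOVA-type kernel assembled as a sum of Gaussian kernels over low-dimensional feature groups (the paper evaluates these products with the NFFT; dense products stand in here), combined with a Nyström-type low-rank preconditioner inside a CG solve on a regularized kernel system. The feature groups, kernel width, regularization, and landmark count are illustrative assumptions, and the actual method targets the saddle-point systems arising in the interior point method rather than this simpler solve.

```python
# Minimal sketch (not the paper's NFFT code): ANOVA-type kernel as a sum of
# Gaussian kernels over 3D feature groups, plus a Nystroem-based low-rank
# preconditioner for a CG solve with a regularized kernel matrix.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 9))           # toy data: n = 500, d = 9
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]  # illustrative 3D feature windows
sigma, lam = 1.0, 1e-2                      # assumed kernel width / regularization

def gauss_kernel(Z):
    sq = np.sum(Z**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.exp(-np.maximum(D, 0) / (2 * sigma**2))

# ANOVA-type kernel: average of lower-dimensional Gaussian kernels
K = sum(gauss_kernel(X[:, g]) for g in groups) / len(groups)
A = K + lam * np.eye(len(X))

# Low-rank Nystroem factors K ~ C W^+ C^T from m landmark points, turned into
# a preconditioner for lam*I + K via the Woodbury identity (approximate if W
# is singular; it is only a preconditioner).
m = 50
idx = rng.choice(len(X), m, replace=False)
C, W = K[:, idx], K[np.ix_(idx, idx)]

def precond(v):
    # (lam I + C W^+ C^T)^{-1} v = (v - C (lam W + C^T C)^{-1} C^T v) / lam
    inner = np.linalg.solve(lam * W + C.T @ C, C.T @ v)
    return (v - C @ inner) / lam

M = LinearOperator(A.shape, matvec=precond)
b = rng.standard_normal(len(X))
x, info = cg(A, b, M=M)
print("cg converged" if info == 0 else f"cg info = {info}")
```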
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only achieves faster convergence per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
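
The proposed conjugate subgradient scheme is not spelled out in this summary; for orientation, here is a minimal kernelized stochastic subgradient (Pegasos-style) baseline of the SFO type the paper compares against. All parameters and the toy data are illustrative.

```python
# Kernel-SVM stochastic subgradient baseline (Pegasos-style); this is the kind
# of SFO method the paper improves upon, not the proposed algorithm.
import numpy as np

def kernel_pegasos(K, y, lam=0.1, T=2000, seed=0):
    """K: (n, n) kernel matrix, y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n = len(y)
    alpha = np.zeros(n)                      # dual-style coefficients
    for t in range(1, T + 1):
        i = rng.integers(n)
        f_i = (alpha * y) @ K[:, i] / (lam * t)  # current prediction on sample i
        if y[i] * f_i < 1.0:                 # hinge-loss subgradient step
            alpha[i] += 1.0
    return alpha * y / (lam * T)             # final weights on kernel columns

# toy usage
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(200))
K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
w = kernel_pegasos(K, y)
print("train accuracy:", np.mean(np.sign(K @ w) == y))
```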
arXiv Detail & Related papers (2024-07-30T17:03:19Z)
- A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis [3.4806267677524896]
Wasserstein discriminant analysis (WDA) is a linear dimensionality reduction method.
WDA can account for both global and local interconnections between data classes.
A bi-level nonlinear eigenvector algorithm (WDA-nepv) is presented.
arXiv Detail & Related papers (2022-11-21T22:40:43Z)
- Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
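
For concreteness, a small sketch of the quantity whose generalization behavior the paper studies; the paper's contribution is the stability analysis, not this computation, and the data below is synthetic.

```python
# Computing the AUPRC (average precision) that the generalization bounds
# concern; shown only to make the target quantity concrete.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                    # binary relevance labels
scores = y_true * 0.8 + rng.standard_normal(1000) * 0.5   # toy retrieval scores

auprc = average_precision_score(y_true, scores)
precision, recall, _ = precision_recall_curve(y_true, scores)
print(f"AUPRC ~ {auprc:.3f} over {len(precision)} PR points")
```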
arXiv Detail & Related papers (2022-09-27T09:06:37Z)
- Feature subset selection for kernel SVM classification via mixed-integer optimization [0.7734726150561088]
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification.
First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware.
We propose a mixed-integer linear optimization (MILO) formulation based on kernel-target alignment for feature subset selection; this MILO problem can be solved to optimality using optimization software.
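
Kernel-target alignment, the quantity the MILO formulation is built around, is straightforward to compute; a minimal sketch (the kernel, labels, and noise features are illustrative, and the MILO model itself is not reproduced):

```python
# Kernel-target alignment A(K, y y^T) = <K, y y^T>_F / (||K||_F * ||y y^T||_F),
# the score the feature-selection formulation is based on.
import numpy as np

def kernel_target_alignment(K, y):
    """K: (n, n) kernel matrix, y: labels in {-1, +1}."""
    inner = y @ K @ y                      # <K, y y^T>_F = y^T K y
    return inner / (np.linalg.norm(K, "fro") * len(y))  # ||y y^T||_F = n here

def gauss(Z):
    sq = np.sum(Z**2, axis=1)
    return np.exp(-(sq[:, None] + sq[None, :] - 2 * Z @ Z.T) / 2)

# toy comparison: alignment with and without irrelevant noise features
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
y = np.sign(X[:, 0])
X_noisy = np.hstack([X, rng.standard_normal((300, 5))])
print("informative features:", kernel_target_alignment(gauss(X), y))
print("with noise features :", kernel_target_alignment(gauss(X_noisy), y))
```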
arXiv Detail & Related papers (2022-05-28T04:01:40Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
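
A minimal sketch of the coupling described here, assuming the stochastic diagonal estimator of Bekas et al. together with plain CG: the diagonal of an inverse is estimated from matrix-vector products alone, with no explicit inverse or covariance matrix. The test matrix and probe count are illustrative.

```python
# Estimate diag(A^{-1}) from matvecs only: Rademacher probing (Bekas-style
# diagonal estimator) combined with conjugate-gradient solves.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n = 300
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.eye(n)            # SPD test matrix; only products are used
Aop = LinearOperator((n, n), matvec=lambda v: A @ v)

s, acc = 64, np.zeros(n)               # number of Rademacher probes
for _ in range(s):
    v = rng.choice([-1.0, 1.0], size=n)
    x, _ = cg(Aop, v)                  # x = A^{-1} v via CG, matvecs only
    acc += v * x                       # accumulate v .* (A^{-1} v)
diag_est = acc / s

true_diag = np.diag(np.linalg.inv(A))  # reference, feasible at this small n
print("max relative error:", np.max(np.abs(diag_est - true_diag) / true_diag))
```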
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Learning in High-Dimensional Feature Spaces Using ANOVA-Based Fast Matrix-Vector Multiplication [0.0]
The kernel matrix is typically dense and large-scale. Depending on the dimension of the feature space, even computing all of its entries in reasonable time becomes a challenging task.
We propose the use of an ANOVA kernel, where we construct several kernels based on lower-dimensional feature spaces for which we provide fast algorithms realizing the matrix-vector products.
Based on a feature grouping approach, we then show how the fast matrix-vector products can be embedded into a learning method, choosing kernel ridge regression and a preconditioned conjugate gradient solver as an example.
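
A dense stand-in for this construction: kernel ridge regression solved by CG, where each matrix-vector product sums Gaussian kernels over low-dimensional feature groups. The paper realizes each per-group product with a fast ANOVA/NFFT algorithm; the dense loop below (with assumed groups and parameters) only illustrates the structure of the solve.

```python
# Kernel ridge regression via CG with a grouped-kernel matvec; dense per-group
# products stand in for the fast NFFT-based ones.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 6))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)
groups, sigma, lam = [[0, 1, 2], [3, 4, 5]], 1.0, 1e-2

def group_matvec(v):
    """(K + lam I) v with K averaged over per-group Gaussian kernels."""
    out = lam * v
    for g in groups:
        Z = X[:, g]
        sq = np.sum(Z**2, axis=1)
        Kg = np.exp(-(sq[:, None] + sq[None, :] - 2 * Z @ Z.T) / (2 * sigma**2))
        out += Kg @ v / len(groups)    # fast algorithms replace this product
    return out

Aop = LinearOperator((len(X), len(X)), matvec=group_matvec)
alpha, info = cg(Aop, y)
print("cg info:", info, "residual:", np.linalg.norm(group_matvec(alpha) - y))
```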
arXiv Detail & Related papers (2021-11-19T10:29:39Z)
- Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
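
One plausible reading of truncated orthogonal iteration, sketched with NumPy: orthogonal (subspace) iteration with a per-column hard-thresholding step that keeps only the s largest-magnitude entries. The paper's two variants may differ in details such as where the truncation is applied; the planted test matrix is illustrative.

```python
# Orthogonal iteration with hard thresholding for sparse leading eigenvectors.
import numpy as np

def truncated_orthogonal_iteration(A, k=1, s=10, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(iters):
        Y = A @ V                                  # power step on the block
        for j in range(k):                         # keep top-s entries per column
            small = np.argsort(np.abs(Y[:, j]))[:-s]
            Y[small, j] = 0.0
        V, _ = np.linalg.qr(Y)                     # re-orthonormalize
    return V

# toy usage: recover a planted sparse eigenvector
rng = np.random.default_rng(1)
u = np.zeros(200); u[:10] = 1 / np.sqrt(10)
A = 5 * np.outer(u, u) + 0.05 * rng.standard_normal((200, 200))
A = (A + A.T) / 2
V = truncated_orthogonal_iteration(A, k=1, s=10)
print("support recovered:", np.sort(np.argsort(np.abs(V[:, 0]))[-10:]))
```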
arXiv Detail & Related papers (2021-03-24T23:11:32Z)
- On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization [26.858608065417663]
Convex optimization over the spectrahedron has important applications in machine learning, signal processing and statistics.
We propose efficient implementations of MEG, tailored for optimization with low-rank matrices, which use only a single low-rank SVD on each iteration.
We also provide efficiently-computable certificates for the correct convergence of our methods.
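
For reference, the basic MEG update over the spectrahedron, written with full matrix exponentials and logarithms; the paper's point is precisely to avoid these dense decompositions with a single low-rank SVD per iteration, which this sketch does not reproduce. Step size and the toy objective are assumptions.

```python
# Basic matrix exponentiated gradient update over {X >= 0, tr X = 1}.
import numpy as np
from scipy.linalg import expm, logm

def meg_step(X, G, eta=0.1):
    """X: PSD iterate with unit trace, G: gradient of the loss at X."""
    M = np.real(logm(X)) - eta * G      # mirror step in the log domain
    E = expm((M + M.T) / 2)             # symmetrize for numerical safety
    return E / np.trace(E)              # renormalize onto the spectrahedron

# toy usage: minimize <C, X>; the optimum is the smallest eigenvalue of C
rng = np.random.default_rng(0)
C = rng.standard_normal((20, 20)); C = (C + C.T) / 2
X = np.eye(20) / 20
for _ in range(200):
    X = meg_step(X, C, eta=0.2)
w = np.linalg.eigvalsh(C)
print("objective:", np.trace(C @ X), "vs optimum:", w[0])
```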
arXiv Detail & Related papers (2020-12-18T19:14:51Z)
- Follow the bisector: a simple method for multi-objective optimization [65.83318707752385]
We consider optimization problems where multiple differentiable losses have to be minimized.
The presented method computes a descent direction in every iteration that guarantees an equal relative decrease of the objective functions.
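
One hedged reading of this guarantee: choose the minimum-norm direction along which every log-objective decreases at the same unit rate; for two equally scaled log-gradients this is a multiple of their bisector. The paper's exact construction may differ from this sketch.

```python
# Minimum-norm direction d with grad(log f_i) . d = -1 for every objective,
# so all f_i shrink at the same relative rate to first order.
import numpy as np

def equal_relative_descent(grads, values):
    """grads: list of gradient vectors; values: positive objective values f_i."""
    V = np.array([g / f for g, f in zip(grads, values)])  # rows: grad of log f_i
    coeffs = np.linalg.solve(V @ V.T, -np.ones(len(grads)))
    return V.T @ coeffs                                   # min-norm d with V d = -1

# toy usage: two objectives in 2D
x = np.array([1.0, 2.0])
f1, g1 = x @ x, 2 * x                                     # f1 = ||x||^2
f2, g2 = (x[0] - 3) ** 2 + 1, np.array([2 * (x[0] - 3), 0.0])
d = equal_relative_descent([g1, g2], [f1, f2])
print("relative rates:", g1 @ d / f1, g2 @ d / f2)        # both -1 by construction
```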
arXiv Detail & Related papers (2020-07-14T09:50:33Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
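
A minimal sketch-and-solve version with a Gaussian embedding; the paper also covers the SRHT and chooses the sketch size adaptively via the effective dimension, which this sketch (with assumed problem sizes) does not.

```python
# Gaussian-sketched L2-regularized least-squares vs. the exact ridge solution.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, m = 10000, 50, 1.0, 500            # m: sketch size (assumed >> d)
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + rng.standard_normal(n)

S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian embedding
SA, Sb = S @ A, S @ b
x_sk = np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

x_ex = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
print("relative error vs exact ridge:",
      np.linalg.norm(x_sk - x_ex) / np.linalg.norm(x_ex))
```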
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.