A Preconditioned Interior Point Method for Support Vector Machines Using
an ANOVA-Decomposition and NFFT-Based Matrix-Vector Products
- URL: http://arxiv.org/abs/2312.00538v1
- Date: Fri, 1 Dec 2023 12:27:11 GMT
- Title: A Preconditioned Interior Point Method for Support Vector Machines Using
an ANOVA-Decomposition and NFFT-Based Matrix-Vector Products
- Authors: Theresa Wagner, John W. Pearson, Martin Stoll
- Abstract summary: We propose an NFFT-accelerated matrix-vector product using an ANOVA decomposition for the feature space that is used within an interior point method for the overall optimization problem.
We investigate the performance of the different preconditioners as well as the accuracy of the ANOVA kernel on several large-scale datasets.
- Score: 0.6445605125467574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we consider the numerical solution to the soft-margin support
vector machine optimization problem. This problem is typically solved using the
SMO algorithm, given the high computational complexity of traditional
optimization algorithms when dealing with large-scale kernel matrices. In this
work, we propose employing an NFFT-accelerated matrix-vector product using an
ANOVA decomposition for the feature space that is used within an interior point
method for the overall optimization problem. As this method requires the
solution of a linear system of saddle point form, we suggest a preconditioning
approach that is based on low-rank approximations of the kernel matrix together
with a Krylov subspace solver. We compare the accuracy of the ANOVA-based
kernel with the default LIBSVM implementation. We investigate the performance
of the different preconditioners as well as the accuracy of the ANOVA kernel on
several large-scale datasets.
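
To make the abstract's ingredients concrete, here is a minimal NumPy/SciPy sketch, not the paper's code: an ANOVA-type kernel assembled as a sum of Gaussian kernels over low-dimensional feature groups (the paper evaluates these products with the NFFT; dense products stand in here), combined with a Nyström-type low-rank preconditioner inside a CG solve on a regularized kernel system. The feature groups, kernel width, regularization, and landmark count are illustrative assumptions, and the actual method targets the saddle-point systems arising in the interior point method rather than this simpler solve.

```python
# Minimal sketch (not the paper's NFFT code): ANOVA-type kernel as a sum of
# Gaussian kernels over 3D feature groups, plus a Nystroem-based low-rank
# preconditioner for a CG solve with a regularized kernel matrix.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 9))           # toy data: n = 500, d = 9
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]  # illustrative 3D feature windows
sigma, lam = 1.0, 1e-2                      # assumed kernel width / regularization

def gauss_kernel(Z):
    sq = np.sum(Z**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.exp(-np.maximum(D, 0) / (2 * sigma**2))

# ANOVA-type kernel: average of lower-dimensional Gaussian kernels
K = sum(gauss_kernel(X[:, g]) for g in groups) / len(groups)
A = K + lam * np.eye(len(X))

# Low-rank Nystroem factors K ~ C W^+ C^T from m landmark points, turned into
# a preconditioner for lam*I + K via the Woodbury identity (approximate if W
# is singular; it is only a preconditioner).
m = 50
idx = rng.choice(len(X), m, replace=False)
C, W = K[:, idx], K[np.ix_(idx, idx)]

def precond(v):
    # (lam I + C W^+ C^T)^{-1} v = (v - C (lam W + C^T C)^{-1} C^T v) / lam
    inner = np.linalg.solve(lam * W + C.T @ C, C.T @ v)
    return (v - C @ inner) / lam

M = LinearOperator(A.shape, matvec=precond)
b = rng.standard_normal(len(X))
x, info = cg(A, b, M=M)
print("cg converged" if info == 0 else f"cg info = {info}")
```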
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only achieves faster convergence per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
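
The proposed conjugate subgradient scheme is not spelled out in this summary; for orientation, here is a minimal kernelized stochastic subgradient (Pegasos-style) baseline of the SFO type the paper compares against. All parameters and the toy data are illustrative.

```python
# Kernel-SVM stochastic subgradient baseline (Pegasos-style); this is the kind
# of SFO method the paper improves upon, not the proposed algorithm.
import numpy as np

def kernel_pegasos(K, y, lam=0.1, T=2000, seed=0):
    """K: (n, n) kernel matrix, y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n = len(y)
    alpha = np.zeros(n)                      # dual-style coefficients
    for t in range(1, T + 1):
        i = rng.integers(n)
        f_i = (alpha * y) @ K[:, i] / (lam * t)  # current prediction on sample i
        if y[i] * f_i < 1.0:                 # hinge-loss subgradient step
            alpha[i] += 1.0
    return alpha * y / (lam * T)             # final weights on kernel columns

# toy usage
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(200))
K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
w = kernel_pegasos(K, y)
print("train accuracy:", np.mean(np.sign(K @ w) == y))
```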
arXiv Detail & Related papers (2024-07-30T17:03:19Z)
- A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis [3.4806267677524896]
Wasserstein discriminant analysis (WDA) is a linear dimensionality reduction method.
WDA can account for both global and local interconnections between data classes.
A bi-level nonlinear eigenvector algorithm (WDA-nepv) is presented.
arXiv Detail & Related papers (2022-11-21T22:40:43Z)
- Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
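
For concreteness, a small sketch of the quantity whose generalization behavior the paper studies; the paper's contribution is the stability analysis, not this computation, and the data below is synthetic.

```python
# Computing the AUPRC (average precision) that the generalization bounds
# concern; shown only to make the target quantity concrete.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                    # binary relevance labels
scores = y_true * 0.8 + rng.standard_normal(1000) * 0.5   # toy retrieval scores

auprc = average_precision_score(y_true, scores)
precision, recall, _ = precision_recall_curve(y_true, scores)
print(f"AUPRC ~ {auprc:.3f} over {len(precision)} PR points")
```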
arXiv Detail & Related papers (2022-09-27T09:06:37Z)
- Feature subset selection for kernel SVM classification via mixed-integer optimization [0.7734726150561088]
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification.
First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware.
We propose a mixed-integer linear optimization (MILO) formulation based on kernel-target alignment for feature subset selection; this MILO problem can be solved to optimality using optimization software.
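
Kernel-target alignment, the quantity the MILO formulation is built around, is straightforward to compute; a minimal sketch (the kernel, labels, and noise features are illustrative, and the MILO model itself is not reproduced):

```python
# Kernel-target alignment A(K, y y^T) = <K, y y^T>_F / (||K||_F * ||y y^T||_F),
# the score the feature-selection formulation is based on.
import numpy as np

def kernel_target_alignment(K, y):
    """K: (n, n) kernel matrix, y: labels in {-1, +1}."""
    inner = y @ K @ y                      # <K, y y^T>_F = y^T K y
    return inner / (np.linalg.norm(K, "fro") * len(y))  # ||y y^T||_F = n here

def gauss(Z):
    sq = np.sum(Z**2, axis=1)
    return np.exp(-(sq[:, None] + sq[None, :] - 2 * Z @ Z.T) / 2)

# toy comparison: alignment with and without irrelevant noise features
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
y = np.sign(X[:, 0])
X_noisy = np.hstack([X, rng.standard_normal((300, 5))])
print("informative features:", kernel_target_alignment(gauss(X), y))
print("with noise features :", kernel_target_alignment(gauss(X_noisy), y))
```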
arXiv Detail & Related papers (2022-05-28T04:01:40Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
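
A minimal sketch of the coupling described here, assuming the stochastic diagonal estimator of Bekas et al. together with plain CG: the diagonal of an inverse is estimated from matrix-vector products alone, with no explicit inverse or covariance matrix. The test matrix and probe count are illustrative.

```python
# Estimate diag(A^{-1}) from matvecs only: Rademacher probing (Bekas-style
# diagonal estimator) combined with conjugate-gradient solves.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n = 300
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.eye(n)            # SPD test matrix; only products are used
Aop = LinearOperator((n, n), matvec=lambda v: A @ v)

s, acc = 64, np.zeros(n)               # number of Rademacher probes
for _ in range(s):
    v = rng.choice([-1.0, 1.0], size=n)
    x, _ = cg(Aop, v)                  # x = A^{-1} v via CG, matvecs only
    acc += v * x                       # accumulate v .* (A^{-1} v)
diag_est = acc / s

true_diag = np.diag(np.linalg.inv(A))  # reference, feasible at this small n
print("max relative error:", np.max(np.abs(diag_est - true_diag) / true_diag))
```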
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Learning in High-Dimensional Feature Spaces Using ANOVA-Based Fast Matrix-Vector Multiplication [0.0]
The kernel matrix is typically dense and large-scale. Depending on the dimension of the feature space, even computing all of its entries in reasonable time becomes a challenging task.
We propose the use of an ANOVA kernel, where we construct several kernels based on lower-dimensional feature spaces for which we provide fast algorithms realizing the matrix-vector products.
Based on a feature grouping approach, we then show how the fast matrix-vector products can be embedded into a learning method, choosing kernel ridge regression and a preconditioned conjugate gradient solver as an example.
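
A dense stand-in for this construction: kernel ridge regression solved by CG, where each matrix-vector product sums Gaussian kernels over low-dimensional feature groups. The paper realizes each per-group product with a fast ANOVA/NFFT algorithm; the dense loop below (with assumed groups and parameters) only illustrates the structure of the solve.

```python
# Kernel ridge regression via CG with a grouped-kernel matvec; dense per-group
# products stand in for the fast NFFT-based ones.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 6))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)
groups, sigma, lam = [[0, 1, 2], [3, 4, 5]], 1.0, 1e-2

def group_matvec(v):
    """(K + lam I) v with K averaged over per-group Gaussian kernels."""
    out = lam * v
    for g in groups:
        Z = X[:, g]
        sq = np.sum(Z**2, axis=1)
        Kg = np.exp(-(sq[:, None] + sq[None, :] - 2 * Z @ Z.T) / (2 * sigma**2))
        out += Kg @ v / len(groups)    # fast algorithms replace this product
    return out

Aop = LinearOperator((len(X), len(X)), matvec=group_matvec)
alpha, info = cg(Aop, y)
print("cg info:", info, "residual:", np.linalg.norm(group_matvec(alpha) - y))
```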
arXiv Detail & Related papers (2021-11-19T10:29:39Z)
- Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
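
One plausible reading of truncated orthogonal iteration, sketched with NumPy: orthogonal (subspace) iteration with a per-column hard-thresholding step that keeps only the s largest-magnitude entries. The paper's two variants may differ in details such as where the truncation is applied; the planted test matrix is illustrative.

```python
# Orthogonal iteration with hard thresholding for sparse leading eigenvectors.
import numpy as np

def truncated_orthogonal_iteration(A, k=1, s=10, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(iters):
        Y = A @ V                                  # power step on the block
        for j in range(k):                         # keep top-s entries per column
            small = np.argsort(np.abs(Y[:, j]))[:-s]
            Y[small, j] = 0.0
        V, _ = np.linalg.qr(Y)                     # re-orthonormalize
    return V

# toy usage: recover a planted sparse eigenvector
rng = np.random.default_rng(1)
u = np.zeros(200); u[:10] = 1 / np.sqrt(10)
A = 5 * np.outer(u, u) + 0.05 * rng.standard_normal((200, 200))
A = (A + A.T) / 2
V = truncated_orthogonal_iteration(A, k=1, s=10)
print("support recovered:", np.sort(np.argsort(np.abs(V[:, 0]))[-10:]))
```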
arXiv Detail & Related papers (2021-03-24T23:11:32Z)
- On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization [26.858608065417663]
Convex optimization over the spectrahedron has important applications in machine learning, signal processing and statistics.
We propose efficient implementations of MEG, tailored for optimization with low-rank matrices, which use only a single low-rank SVD on each iteration.
We also provide efficiently-computable certificates for the correct convergence of our methods.
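
For reference, the basic MEG update over the spectrahedron, written with full matrix exponentials and logarithms; the paper's point is precisely to avoid these dense decompositions with a single low-rank SVD per iteration, which this sketch does not reproduce. Step size and the toy objective are assumptions.

```python
# Basic matrix exponentiated gradient update over {X >= 0, tr X = 1}.
import numpy as np
from scipy.linalg import expm, logm

def meg_step(X, G, eta=0.1):
    """X: PSD iterate with unit trace, G: gradient of the loss at X."""
    M = np.real(logm(X)) - eta * G      # mirror step in the log domain
    E = expm((M + M.T) / 2)             # symmetrize for numerical safety
    return E / np.trace(E)              # renormalize onto the spectrahedron

# toy usage: minimize <C, X>; the optimum is the smallest eigenvalue of C
rng = np.random.default_rng(0)
C = rng.standard_normal((20, 20)); C = (C + C.T) / 2
X = np.eye(20) / 20
for _ in range(200):
    X = meg_step(X, C, eta=0.2)
w = np.linalg.eigvalsh(C)
print("objective:", np.trace(C @ X), "vs optimum:", w[0])
```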
arXiv Detail & Related papers (2020-12-18T19:14:51Z)
- Follow the bisector: a simple method for multi-objective optimization [65.83318707752385]
We consider optimization problems where multiple differentiable losses have to be minimized.
The presented method computes a descent direction in every iteration that guarantees an equal relative decrease of the objective functions.
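
One hedged reading of this guarantee: choose the minimum-norm direction along which every log-objective decreases at the same unit rate; for two equally scaled log-gradients this is a multiple of their bisector. The paper's exact construction may differ from this sketch.

```python
# Minimum-norm direction d with grad(log f_i) . d = -1 for every objective,
# so all f_i shrink at the same relative rate to first order.
import numpy as np

def equal_relative_descent(grads, values):
    """grads: list of gradient vectors; values: positive objective values f_i."""
    V = np.array([g / f for g, f in zip(grads, values)])  # rows: grad of log f_i
    coeffs = np.linalg.solve(V @ V.T, -np.ones(len(grads)))
    return V.T @ coeffs                                   # min-norm d with V d = -1

# toy usage: two objectives in 2D
x = np.array([1.0, 2.0])
f1, g1 = x @ x, 2 * x                                     # f1 = ||x||^2
f2, g2 = (x[0] - 3) ** 2 + 1, np.array([2 * (x[0] - 3), 0.0])
d = equal_relative_descent([g1, g2], [f1, f2])
print("relative rates:", g1 @ d / f1, g2 @ d / f2)        # both -1 by construction
```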
arXiv Detail & Related papers (2020-07-14T09:50:33Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
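
A minimal sketch-and-solve version with a Gaussian embedding; the paper also covers the SRHT and chooses the sketch size adaptively via the effective dimension, which this sketch (with assumed problem sizes) does not.

```python
# Gaussian-sketched L2-regularized least-squares vs. the exact ridge solution.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, m = 10000, 50, 1.0, 500            # m: sketch size (assumed >> d)
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + rng.standard_normal(n)

S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian embedding
SA, Sb = S @ A, S @ b
x_sk = np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

x_ex = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
print("relative error vs exact ridge:",
      np.linalg.norm(x_sk - x_ex) / np.linalg.norm(x_ex))
```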
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.