Efficient Convex Algorithms for Universal Kernel Learning
- URL: http://arxiv.org/abs/2304.07472v2
- Date: Sat, 24 Feb 2024 20:47:43 GMT
- Title: Efficient Convex Algorithms for Universal Kernel Learning
- Authors: Aleksandr Talitckii and Brendon K. Colbert and Matthew M. Peet
- Abstract summary: An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for robustness); be universal (for accuracy).
Previous algorithms for optimization of kernels were limited to classification and relied on computationally complex Semidefinite Programming (SDP) algorithms.
We propose an SVD-QCQP primal-dual algorithm which dramatically reduces the computational complexity as compared with previous SDP-based approaches.
- Score: 50.877957471649395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The accuracy and complexity of machine learning algorithms based on kernel
optimization are determined by the set of kernels over which they are able to
optimize. An ideal set of kernels should: admit a linear parameterization (for
tractability); be dense in the set of all kernels (for robustness); be
universal (for accuracy). Recently, a framework was proposed for using positive
matrices to parameterize a class of positive semi-separable kernels. Although
this class can be shown to meet all three criteria, previous algorithms for
optimization of such kernels were limited to classification and furthermore
relied on computationally complex Semidefinite Programming (SDP) algorithms. In
this paper, we pose the problem of learning semiseparable kernels as a minimax
optimization problem and propose an SVD-QCQP primal-dual algorithm which
dramatically reduces the computational complexity as compared with previous
SDP-based approaches. Furthermore, we provide an efficient implementation of
this algorithm for both classification and regression -- an implementation
which enables us to solve problems with 100 features and up to 30,000 data points.
Finally, when applied to benchmark data, the algorithm demonstrates the
potential for significant improvement in accuracy over typical (but non-convex)
approaches such as Neural Nets and Random Forest with similar or better
computation time.
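To make the minimax framing concrete, here is a minimal sketch of generic multiple kernel learning with a linearly parameterized kernel. This is an illustration of the problem setup only, not the paper's SVD-QCQP primal-dual algorithm; the base kernels, step size, data, and ridge inner problem are all assumptions.

```python
import numpy as np

# Illustrative sketch: multiple-kernel learning with a linearly parameterized
# kernel K(w) = sum_i w_i K_i (the tractability criterion from the abstract).
# We alternate an inner kernel-ridge solve with a projected-gradient step on
# simplex weights w. NOT the paper's SVD-QCQP algorithm; all choices here
# (base kernels, step size, regularization) are assumptions.

def rbf(X, Y, gamma):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

bases = [rbf(X, X, g) for g in (0.1, 1.0, 10.0)]   # fixed base kernels
w = np.ones(3) / 3                                 # simplex weights
lam = 1e-2                                         # ridge regularization

for _ in range(20):
    K = sum(wi * Ki for wi, Ki in zip(w, bases))
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)    # inner solve
    grad = np.array([-alpha @ Ki @ alpha for Ki in bases])  # dJ/dw_i
    w = np.maximum(w - 0.05 * grad, 0.0)
    w /= w.sum()                                   # crude simplex projection

K = sum(wi * Ki for wi, Ki in zip(w, bases))
alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
pred = K @ alpha
mse = float(np.mean((pred - y) ** 2))
print(round(mse, 4))
```

The simplex constraint on the weights is what keeps the learned kernel bounded; the paper's contribution is replacing this kind of generic alternation (and the SDP formulations it generalizes) with a far cheaper primal-dual scheme.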
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
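The cutting-plane baseline that this paper accelerates fits in a few lines; below is Kelley's classical cutting-plane method on a toy convex problem. This illustrates the standard approach only, not the RL-surrogate variant, and the test function, box, and iteration count are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative sketch: Kelley's cutting-plane method for min f(x) over a box.
# Each iteration adds the supporting hyperplane f(x_k) + g_k.(x - x_k) <= t
# and re-solves a small LP in (x, t). Test function and box are assumptions.

f = lambda x: float(x @ x + np.exp(x[0]))            # smooth convex objective
grad = lambda x: 2 * x + np.array([np.exp(x[0]), 0.0])

lo, hi = -2.0, 2.0
x = np.array([1.5, -1.0])
A, b = [], []                                        # cuts: g.x - t <= b

for _ in range(30):
    g = grad(x)
    A.append(np.append(g, -1.0))                     # coefficients of (x, t)
    b.append(float(g @ x - f(x)))
    # minimize the epigraph variable t subject to all cuts, x in the box
    res = linprog(c=[0, 0, 1], A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(lo, hi), (lo, hi), (None, None)])
    x = res.x[:2]

print(np.round(x, 3))                                # approximate minimizer
```

Each LP stays small here, but on large-scale discrete problems the number of cuts (and the cost of generating good ones) blows up, which is the intractability the RL surrogate targets.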
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit connections between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions is a primitive that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
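For context, the classical entropy-regularized baseline, Sinkhorn matrix scaling, is a few lines of code; this is the standard scheme that such first-order methods refine, not the paper's extragradient algorithm. The toy data and regularization strength are assumptions.

```python
import numpy as np

# Illustrative sketch: classical Sinkhorn iterations for entropy-regularized
# optimal transport -- the textbook baseline, not the paper's extragradient
# scheme. Data and reg strength are arbitrary assumptions; smaller reg
# approaches the unregularized transport distance at the price of slower
# convergence.

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 2))
ys = rng.normal(size=(6, 2)) + 1.0
C = ((xs[:, None, :] - ys[None, :, :]) ** 2).sum(-1)   # squared-distance cost
C /= C.max()                                           # scale costs to [0, 1]
a = np.full(5, 1 / 5)                                  # uniform marginals
b = np.full(6, 1 / 6)

reg = 0.2
K = np.exp(-C / reg)                                   # Gibbs kernel
u = np.ones(5)
for _ in range(2000):                                  # alternating scaling
    v = b / (K.T @ u)
    u = a / (K @ v)

P = u[:, None] * K * v[None, :]                        # transport plan
print(round(float((P * C).sum()), 4))                  # regularized OT cost
```

The plan `P` matches both marginals to high accuracy after the scaling loop; the additive-accuracy guarantee in the paper concerns how cheaply such a plan can be computed as the problem size grows.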
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Decentralized Stochastic Bilevel Optimization with Improved
per-Iteration Complexity [17.870370505179014]
We propose a novel decentralized bilevel optimization (DSBO) algorithm that only requires first order oracle, Hessian-vector product and Jacobian-vector product.
The advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby having improved per-iteration complexity.
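A Hessian-vector product oracle of the kind mentioned above need not form the Hessian at all; one standard way to obtain it is a central finite difference of two gradient evaluations, sketched here on an assumed quadratic (where the difference is exact up to rounding).

```python
import numpy as np

# Illustrative sketch: a Hessian-vector product from two gradient calls via
# central finite differences -- the kind of cheap oracle the summary refers
# to, avoiding the full Hessian. The quadratic f and its gradient are
# assumptions; for a quadratic the central difference is exact up to
# rounding error.

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = M @ M.T + np.eye(4)            # symmetric positive-definite Hessian

def grad(x):                       # gradient of f(x) = 0.5 * x.T A x
    return A @ x

x = rng.normal(size=4)
v = rng.normal(size=4)
eps = 1e-6
hvp = (grad(x + eps * v) - grad(x - eps * v)) / (2 * eps)

print(float(np.abs(hvp - A @ v).max()))   # finite-difference error
```

Two gradient calls cost O(n) memory versus O(n^2) for the full Hessian, which is the per-iteration saving the DSBO summary highlights.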
arXiv Detail & Related papers (2022-10-23T20:06:05Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- A New Algorithm for Tessellated Kernel Learning [4.264192013842097]
An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for robustness); be universal (for accuracy).
The recently proposed class of Tessellated Kernels (TKs) is currently the only known class which meets all three criteria.
By contrast, the 2-step algorithm proposed here scales to 10,000 data points and extends to the regression problem.
arXiv Detail & Related papers (2020-06-13T18:33:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.