Using Multilevel Circulant Matrix Approximate to Speed Up Kernel
Logistic Regression
- URL: http://arxiv.org/abs/2108.08605v1
- Date: Thu, 19 Aug 2021 10:30:12 GMT
- Title: Using Multilevel Circulant Matrix Approximate to Speed Up Kernel
Logistic Regression
- Authors: Junna Zhang, Shuisheng Zhou, Cui Fu and Zhuan Zhang
- Abstract summary: We employ a multilevel circulant matrix (MCM) approximation of the kernel matrix to save storage space and accelerate the solution of KLR.
Our method makes KLR scalable to large-scale problems with less memory consumption, and converges to the same test accuracy in a shorter time.
- Score: 3.1427994341585688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kernel logistic regression (KLR) is a classical nonlinear classifier in
statistical machine learning. The Newton method, with its quadratic convergence rate,
can solve the KLR problem more effectively than gradient methods. However, an
obvious limitation of the Newton method on large-scale problems is its
$O(n^{3})$ time complexity and $O(n^{2})$ space complexity, where $n$ is the
number of training instances. In this paper, we employ a multilevel circulant
matrix (MCM) approximation of the kernel matrix to save storage space and accelerate
the solution of KLR. Exploiting the structure of MCM together with a carefully
designed algorithm, we propose an MCM approximate Newton iterative method. We
first simplify the Newton direction according to the positive semi-definiteness of the
kernel matrix and then perform a two-step approximation of the Newton direction
using MCM. Our method reduces the time complexity of each iteration to $O(n
\log n)$ by using the multidimensional fast Fourier transform (mFFT). In
addition, the space complexity can be reduced to $O(n)$ due to the built-in
periodicity of MCM. Experimental results on several large-scale binary and
multi-class classification problems show that our method makes KLR scalable to
large-scale problems, consumes less memory, and converges to the same test
accuracy in a shorter time.
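To make the complexity claims concrete, below is a minimal, single-level sketch of the idea in Python (NumPy only). It assumes the kernel matrix is replaced by a circulant matrix $C$ defined by its first column `c`, so that products and solves with $C$ cost $O(n \log n)$ via the FFT and only the first column ($O(n)$ memory) is stored. It also assumes one plausible reading of the simplification: writing the gradient as $C(r + \lambda\alpha)$ and the Hessian as $CDC + \lambda C$, the common factor $C$ is cancelled so the Newton system becomes $(DC + \lambda I)\,d = -(r + \lambda\alpha)$, and the diagonal weight matrix $D$ is then replaced by a scalar surrogate so the system stays circulant. The paper's actual method uses multilevel circulant matrices, the mFFT, and a more careful two-step approximation of the Newton direction, none of which is reproduced here.

```python
import numpy as np

def circulant_matvec(c_hat, v):
    """C @ v for a circulant C whose eigenvalues are c_hat = fft(first column)."""
    return np.real(np.fft.ifft(c_hat * np.fft.fft(v)))

def circulant_solve(c_hat, rhs, shift=0.0):
    """Solve (C + shift * I) x = rhs using the FFT diagonalization of C."""
    return np.real(np.fft.ifft(np.fft.fft(rhs) / (c_hat + shift)))

def klr_newton_circulant(c, y, lam=1e-2, iters=20):
    """Approximate Newton iterations for kernel logistic regression in which the
    kernel matrix is replaced by the circulant matrix with first column c.
    Labels y are in {-1, +1}; returns the dual coefficients alpha."""
    n = len(y)
    c_hat = np.fft.fft(c)                    # eigenvalues of C: O(n log n) time, O(n) memory
    alpha = np.zeros(n)
    for _ in range(iters):
        f = circulant_matvec(c_hat, alpha)       # decision values f = C alpha
        p = 1.0 / (1.0 + np.exp(-y * f))         # p_i = sigma(y_i f_i)
        q = -y * (1.0 - p) + lam * alpha         # gradient with the leading C factored out
        w_bar = float(np.mean(p * (1.0 - p)))    # scalar surrogate for the diagonal weights
        # Simplified Newton system (w_bar * C + lam * I) d = -q, solved in O(n log n).
        d = -circulant_solve(w_bar * c_hat, q, shift=lam)
        alpha = alpha + d
    return alpha
```

For data indexed on a $d$-dimensional grid, `np.fft.fft`/`np.fft.ifft` would be replaced by `np.fft.fftn`/`np.fft.ifftn` (the mFFT mentioned in the abstract) and `c` by the first column of the multilevel circulant approximation; the scalar weight surrogate above is a placeholder for the paper's two-step approximation of the Newton direction.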
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Large-Scale Gaussian Processes via Alternating Projection [23.79090469387859]
We propose an iterative method which only accesses subblocks of the kernel matrix, effectively enabling mini-batching.
Our algorithm, based on alternating projection, has $\mathcal{O}(n)$ per-iteration time and space complexity, solving many of the practical challenges of scaling GPs to very large datasets.
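As a rough sketch of the "only touch kernel subblocks" idea, the snippet below runs plain block Gauss-Seidel sweeps on the GP training system $(K + \sigma^2 I)v = y$; it is not the paper's alternating-projection algorithm and makes no claim about its convergence rate. Here `kernel(A, B)` is an assumed user-supplied callable returning the cross-kernel matrix between two point sets.

```python
import numpy as np

def gp_solve_by_blocks(X, y, kernel, noise=1e-2, block=256, sweeps=20):
    """Block Gauss-Seidel sweeps for (K + noise * I) v = y.

    Only one block-row of the kernel matrix is materialized at a time,
    so the full n x n matrix K is never stored.
    """
    n = len(y)
    v = np.zeros(n)
    for _ in range(sweeps):
        for start in range(0, n, block):
            idx = slice(start, min(start + block, n))
            K_row = kernel(X[idx], X)                   # one block-row, formed on the fly
            r = y[idx] - (K_row @ v + noise * v[idx])   # residual restricted to the block
            K_bb = K_row[:, idx] + noise * np.eye(K_row.shape[0])
            v[idx] += np.linalg.solve(K_bb, r)          # solve the block subproblem exactly
    return v
```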
arXiv Detail & Related papers (2023-10-26T04:20:36Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Orthogonal Directions Constrained Gradient Method: from non-linear
equality constraints to Stiefel manifold [16.099883128428054]
We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM).
ODCGM only requires computing a projection onto a vector space.
We show that ODCGM exhibits near-optimal oracle complexities.
arXiv Detail & Related papers (2023-03-16T12:25:53Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - FKreg: A MATLAB toolbox for fast Multivariate Kernel Regression [5.090316990822874]
We introduce a new toolbox for fast multivariate kernel regression based on the non-uniform FFT (NUFFT).
NUFFT implements the algorithm for $M$ gridding points with $O\left(N + M \log M\right)$ complexity and controllable accuracy.
For bandwidth selection, a fast Monte-Carlo method is used to estimate the degrees of freedom.
arXiv Detail & Related papers (2022-04-16T04:52:44Z) - Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root.
For the forward propagation, one method uses a Matrix Taylor Polynomial (MTP), and the other uses Matrix Padé Approximants (MPA).
A series of numerical tests shows that both methods yield considerable speed-ups compared with the SVD or the Newton-Schulz (NS) iteration.
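As a rough illustration of the Taylor-polynomial idea (not the paper's exact MTP or MPA formulation, and ignoring the backward pass), one can rescale a symmetric positive-definite $A$ so its eigenvalues lie in $(0, 1]$ and truncate the binomial series $(I - E)^{1/2} = \sum_k \binom{1/2}{k}(-E)^k$ with $E = I - A/s$:

```python
import numpy as np

def matrix_sqrt_taylor(A, num_terms=8):
    """Truncated binomial (Taylor) series for the square root of an SPD matrix A.

    A is rescaled by its Frobenius norm s so that the eigenvalues of A/s lie
    in (0, 1], making the spectral radius of E = I - A/s less than 1 and the
    series convergent.
    """
    s = np.linalg.norm(A, ord='fro')
    n = A.shape[0]
    I = np.eye(n)
    E = I - A / s
    S = I.copy()           # running sum of the series, starts at the k = 0 term
    term = I.copy()        # holds E**k, updated incrementally
    coeff = 1.0            # binomial coefficient C(1/2, k)
    for k in range(1, num_terms):
        coeff *= (0.5 - (k - 1)) / k        # C(1/2, k) from C(1/2, k - 1)
        term = term @ E
        S += coeff * ((-1.0) ** k) * term   # add C(1/2, k) * (-E)**k
    return np.sqrt(s) * S                   # undo rescaling: sqrt(A) = sqrt(s) * sqrt(A/s)
```

The point of such approximants is that a handful of matrix products (all differentiable in an autograd framework) replaces the SVD; the truncation depth trades accuracy for speed.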
arXiv Detail & Related papers (2022-01-29T10:00:35Z) - Fast $L^2$ optimal mass transport via reduced basis methods for the
Monge-Amp$\grave{\rm e}$re equation [0.0]
We propose a machine learning-like method for solving the parameterized Monge-Ampère equation.
Several challenging numerical tests demonstrate the accuracy and high efficiency of our method for solving the Monge-Ampère equation.
arXiv Detail & Related papers (2021-12-03T12:30:46Z) - Mixability made efficient: Fast online multiclass logistic regression [68.8204255655161]
We show that mixability can be a powerful tool to obtain algorithms with optimal regret.
However, the resulting methods often suffer from high computational complexity, which has reduced their practical applicability.
arXiv Detail & Related papers (2021-10-08T08:22:05Z) - Newton-LESS: Sparsification without Trade-offs for the Sketched Newton
Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
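As a hedged illustration of the generic "sketch the Hessian, keep the exact gradient" pattern, the toy step below uses a sparse Rademacher-style sketch for a least-squares objective. The sketch distribution, its scaling, and the least-squares setting are assumptions for illustration only; they are not the paper's LESS embeddings or its sparsification analysis.

```python
import numpy as np

def sketched_newton_step(A, b, x, sketch_size=200, density=0.1, seed=0):
    """One sketched Newton step for f(x) = 0.5 * ||A x - b||^2.

    The Hessian A^T A is approximated by (S A)^T (S A) with a sparse
    random sketch S; the gradient is computed exactly.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # Sparse sign sketch: each entry is nonzero with probability `density`,
    # scaled so that E[S^T S] is approximately the identity.
    mask = rng.random((sketch_size, n)) < density
    signs = rng.choice([-1.0, 1.0], size=(sketch_size, n))
    S = mask * signs / np.sqrt(sketch_size * density)
    SA = S @ A                                  # sketched data matrix
    H_approx = SA.T @ SA                        # approximate Hessian (d x d)
    grad = A.T @ (A @ x - b)                    # exact gradient
    return x - np.linalg.solve(H_approx + 1e-8 * np.eye(d), grad)
```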
arXiv Detail & Related papers (2021-07-15T17:33:05Z) - The Fast Kernel Transform [21.001203328543006]
We propose the Fast Kernel Transform (FKT), a general algorithm to compute matrix-vector multiplications for datasets in moderate dimensions with quasilinear complexity.
The FKT is easily applied to a broad class of kernels, including Gaussian, Matérn, and Rational Quadratic covariance functions and physically motivated Green's functions.
We illustrate the efficacy and versatility of the FKT by providing timing and accuracy benchmarks and by applying it to scale t-distributed stochastic neighbor embedding (t-SNE) and Gaussian processes to large real-world data sets.
arXiv Detail & Related papers (2021-06-08T16:15:47Z)