FKreg: A MATLAB toolbox for fast Multivariate Kernel Regression
- URL: http://arxiv.org/abs/2204.07716v1
- Date: Sat, 16 Apr 2022 04:52:44 GMT
- Title: FKreg: A MATLAB toolbox for fast Multivariate Kernel Regression
- Authors: Ying Wang, Min Li, Deirel Paz-Linares, Maria L. Bringas Vega, Pedro A. Valdés-Sosa
- Abstract summary: We introduce a new toolbox for fast multivariate kernel regression based on the non-uniform FFT (NUFFT).
NUFFT evaluates the smoother on $M$ gridding points with ${O}\left( N+M\log M \right)$ complexity and controllable accuracy.
Bandwidth selection uses a fast Monte-Carlo estimate of the degrees of freedom.
- Score: 5.090316990822874
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Kernel smoothing is the most fundamental technique for density and
regression estimation, but running time is the biggest obstacle to its
application: direct evaluation of a kernel smoother on $N$ samples needs
${O}\left( {{N}^{2}} \right)$ operations. Fast smoothing algorithms based on
binning with the FFT exist, but their accuracy is not controllable, and no
implementation of the fast method, or of its bandwidth selection, is available
for the multivariate case. Hence, we introduce a new MATLAB toolbox for fast
multivariate kernel regression based on the non-uniform FFT (NUFFT), which
evaluates the smoother on $M$ gridding points with ${O}\left( N+M\log M \right)$
complexity and controllable accuracy. For bandwidth selection, a Fast
Monte-Carlo algorithm estimates the degrees of freedom (DF), saving enormous
cross-validation time, with even greater savings when multiple regressions
share the same grid. To date, this is the first toolbox for fast binned
high-dimensional kernel regression. Moreover, local polynomial regression,
conditional variance estimation for the heteroscedastic model, and
complex-valued datasets are also implemented in this toolbox. The performance
is demonstrated with simulations and an application on quantitative EEG.
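The complexity contrast in the abstract is easy to see in code. The minimal 1-D MATLAB sketch below (our own illustration with invented variable names, not the FKreg API) compares direct ${O}\left( {{N}^{2}} \right)$ Nadaraya-Watson evaluation with an ${O}\left( N+M\log M \right)$ binning-plus-FFT approximation; for simplicity it uses uniform binning and a circular boundary, whereas the toolbox relies on the NUFFT precisely so that the accuracy stays controllable.

```matlab
% Minimal 1-D sketch (not the FKreg API): direct Nadaraya-Watson smoothing
% costs O(N^2); binning onto M grid points and convolving with the kernel
% via the FFT costs O(N + M log M).
rng(0);
N = 2000; M = 256; h = 0.05;                  % samples, grid points, bandwidth
x = rand(N, 1); y = sin(2*pi*x) + 0.1*randn(N, 1);

% Direct evaluation at the sample points: O(N^2)
K = exp(-(x - x').^2 / (2*h^2));              % N-by-N Gaussian kernel matrix
yhat_direct = (K * y) ./ sum(K, 2);           % Nadaraya-Watson estimate

% Fast evaluation via uniform binning + FFT: O(N + M log M)
g = ((1:M)' - 0.5) / M;                       % grid centers on [0, 1]
idx = min(M, max(1, ceil(x * M)));            % bin index of each sample
cy = accumarray(idx, y, [M 1]);               % binned responses
c1 = accumarray(idx, 1, [M 1]);               % binned counts
t = g - g(1);                                 % distances along the grid
k = exp(-min(t, 1 - t).^2 / (2*h^2));         % kernel on circular grid distances
num = real(ifft(fft(cy) .* fft(k)));          % (kernel * binned y) via FFT
den = real(ifft(fft(c1) .* fft(k)));          % (kernel * binned counts) via FFT
yhat_grid = num ./ max(den, eps);             % smoothed values on the grid
```

The Fast Monte-Carlo DF idea can be sketched in the same spirit: for a linear smoother $\hat{y}=Sy$, a Hutchinson-type estimator recovers $\mathrm{tr}(S)$ from a handful of smoother-vector products, so a bandwidth criterion such as GCV can be evaluated without a cross-validation loop. Again a hedged sketch; $S$ is formed explicitly only because $N$ is small here, and each product $Sz$ could instead use the fast path above.

```matlab
% DF = trace(S) for the hat matrix S of the smoother (yhat = S*y).
S = K ./ sum(K, 2);                           % explicit hat matrix (small N only)
T = 50; df_mc = 0;
for q = 1:T
    z = sign(randn(N, 1));                    % Rademacher probe vector
    df_mc = df_mc + (z' * (S * z)) / T;       % E[z'*S*z] = trace(S)
end
df_exact = trace(S);                          % reference value for comparison
```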
Related papers
- Suboptimality bounds for trace-bounded SDPs enable a faster and scalable low-rank SDP solver SDPLR+ [3.7507283158673212]
Semidefinite programs (SDPs) are powerful tools with many applications in machine learning and data science.
Standard SDP solvers are hard to scale because the positive semidefinite decision variable is a dense $n \times n$ matrix.
Two decades ago, Burer and Monteiro developed an SDP solver that optimized over a low-rank factorization instead of the full matrix.
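As a heavily simplified illustration of the Burer-Monteiro substitution (a toy max-cut-style problem with our own sizes and step size, not SDPLR+):

```matlab
% Burer-Monteiro sketch: for min <C, X> s.t. diag(X) = 1, X PSD, substitute
% X = R*R' with a thin n-by-r factor R; the diagonal constraint becomes
% unit-norm rows of R, and we run projected gradient descent on R.
rng(0);
n = 50; r = 5;
A = randn(n); C = (A + A') / 2;               % random symmetric cost matrix
R = randn(n, r); R = R ./ vecnorm(R, 2, 2);   % unit rows <=> diag(R*R') = 1
for it = 1:500
    G = 2 * C * R;                            % gradient of <C, R*R'> in R
    R = R - 0.01 * G;                         % gradient step
    R = R ./ vecnorm(R, 2, 2);                % project rows back to the sphere
end
obj = sum(sum(C .* (R * R')));                % objective at the low-rank point
```

The point of the factorization is storage and speed: the iterate is $n \times r$ instead of a dense $n \times n$ PSD matrix.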
arXiv Detail & Related papers (2024-06-14T20:31:22Z)
- The fast committor machine: Interpretable prediction with kernels [0.0]
This paper introduces an efficient algorithm for approximating the committor, called the "fast committor machine" (FCM).
The kernel function is constructed to emphasize low-dimensional subspaces that optimally describe the $A$ to $B$ transitions.
The FCM yields higher accuracy and trains more quickly than a neural network with the same number of parameters.
arXiv Detail & Related papers (2024-05-16T19:22:49Z)
- A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
- Globally Convergent Accelerated Algorithms for Multilinear Sparse Logistic Regression with $\ell_0$-constraints [2.323238724742687]
Multilinear logistic regression serves as a powerful tool for the analysis of multidimensional data.
We propose an Accelerated Proximal Alternating Linearized Minimization (APALM$+$) method to solve the $\ell_0$-MLSR model.
We also demonstrate that APALM$+$ is globally convergent to a first-order critical point and establish its convergence using the Kurdyka-Łojasiewicz property.
arXiv Detail & Related papers (2023-09-17T11:05:08Z)
- Simulation-free Schrödinger bridges via score and flow matching [89.4231207928885]
We present simulation-free score and flow matching ([SF]$^2$M).
Our method generalizes both the score-matching loss used in the training of diffusion models and the recently proposed flow matching loss used in the training of continuous flows.
Notably, [SF]$^2$M is the first method to accurately model cell dynamics in high dimensions and can recover known gene regulatory networks from simulated data.
arXiv Detail & Related papers (2023-07-07T15:42:35Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
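A minimal sketch of the random-feature idea, using random Fourier features for an RBF kernel since the paper's NNGP kernel construction is not reproduced here (all names and sizes below are our own):

```matlab
% Random feature approximation: Z*Z' approximates an n-by-n RBF kernel
% matrix from D random features, avoiding the exact pairwise computation.
rng(0);
n = 500; d = 10; D = 2000; sigma = 1.0;
X = randn(n, d);
W = randn(d, D) / sigma;                      % random frequencies
b = 2*pi * rand(1, D);                        % random phases
Z = sqrt(2/D) * cos(X*W + b);                 % n-by-D random feature map
Kapprox = Z * Z';                             % approx. exp(-||xi-xj||^2/(2*sigma^2))
sq = sum(X.^2, 2);
Kexact = exp(-(sq + sq' - 2*(X*X')) / (2*sigma^2));
err = max(abs(Kapprox(:) - Kexact(:)));       % shrinks as D grows
```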
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- Efficient and robust high-dimensional sparse logistic regression via nonlinear primal-dual hybrid gradient algorithms [0.0]
We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2 n, mn^2)\log(1/\epsilon))$ for first-order optimization methods.
arXiv Detail & Related papers (2021-11-30T14:16:48Z)
- Asynchronous Stochastic Optimization Robust to Arbitrary Delays [54.61797739710608]
We consider optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale gradient from step $t - d_t$, for an arbitrary delay $d_t$.
Our experiments demonstrate the efficacy and robustness of our algorithm in cases where the delay distribution is skewed or heavy-tailed.
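The delayed-gradient setting is straightforward to simulate. The toy loop below (our construction, not the paper's algorithm or step-size rule) updates with the gradient evaluated at the iterate from $d_t$ steps earlier:

```matlab
% Toy simulation of optimization with stale gradients: at step t the update
% uses the gradient at the iterate from d_t steps ago (random delays here).
rng(0);
d = 5; T = 2000;
A = randn(20, d); b = randn(20, 1);           % a small least-squares objective
grad = @(w) A' * (A*w - b) / 20;
w = zeros(d, 1); W_hist = zeros(d, T);
for t = 1:T
    W_hist(:, t) = w;                         % remember the iterate
    dt = randi(10);                           % arbitrary delay d_t
    w_stale = W_hist(:, max(1, t - dt));      % iterate from d_t steps earlier
    w = w - 0.05 * grad(w_stale);             % update with the stale gradient
end
residual = norm(A*w - b);                     % converges despite the delays
```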
arXiv Detail & Related papers (2021-06-22T15:50:45Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of $\tau_{\mathsf{mix}}$, the mixing time of the underlying chain.
We propose an algorithm based on experience replay (a popular reinforcement learning technique) that achieves a significantly better error rate.
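A hedged sketch of the experience-replay idea on Markovian data (a toy AR(1) covariate chain and plain SGD of our own making, not the paper's method): sampling uniformly from a buffer of past observations de-correlates consecutive updates.

```matlab
% SGD with experience replay on Markov-chain data: store every observation
% and update on a uniformly sampled past point rather than the newest one.
rng(0);
d = 3; T = 5000; rho = 0.95;                  % rho controls chain mixing
w_star = randn(d, 1); x = randn(d, 1);
w = zeros(d, 1); buf = zeros(d + 1, T);
for t = 1:T
    x = rho*x + sqrt(1 - rho^2)*randn(d, 1);  % Markovian covariates (AR(1))
    y = x' * w_star + 0.1*randn;              % noisy linear response
    buf(:, t) = [x; y];                       % append to the replay buffer
    j = randi(t);                             % uniform draw from the buffer
    xb = buf(1:d, j); yb = buf(d + 1, j);
    w = w - 0.01 * (xb' * w - yb) * xb;       % SGD step on the replayed pair
end
err = norm(w - w_star);                       % approaches w_star
```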
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
- Fractional ridge regression: a fast, interpretable reparameterization of ridge regression [0.0]
Ridge regression (RR) is a regularization technique that penalizes the L2-norm of the coefficients in linear regression.
We provide an algorithm to solve FRR, as well as open-source software implementations in Python.
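A hedged sketch of the FRR reparameterization as we read the abstract (not the authors' package): choose the ridge penalty so that the regularized coefficient norm is a prescribed fraction of the OLS coefficient norm, which the SVD of $X$ makes cheap to solve.

```matlab
% Fractional ridge regression sketch: find alpha such that
% norm(beta_ridge(alpha)) = frac * norm(beta_OLS), via the SVD of X.
rng(0);
n = 200; p = 20;
X = randn(n, p); y = X * randn(p, 1) + randn(n, 1);
[U, S, V] = svd(X, 'econ'); s = diag(S);
c = U' * y;                                   % coords in the left singular basis
nrm = @(a) norm((s ./ (s.^2 + a)) .* c);      % ||beta_ridge(alpha)|| from the SVD
frac = 0.5;                                   % requested fraction of the OLS norm
alpha = fzero(@(a) nrm(a) - frac*nrm(0), [0 1e8]);
beta_frr = V * ((s ./ (s.^2 + alpha)) .* c);  % coefficients at that fraction
```

The interpretable knob is `frac` in $[0,1]$ rather than the raw penalty, which is what makes the parameterization comparable across datasets.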
arXiv Detail & Related papers (2020-05-07T03:12:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.