Kernel Operations on the GPU, with Autodiff, without Memory Overflows
- URL: http://arxiv.org/abs/2004.11127v2
- Date: Thu, 8 Apr 2021 12:36:50 GMT
- Title: Kernel Operations on the GPU, with Autodiff, without Memory Overflows
- Authors: Benjamin Charlier, Jean Feydy, Joan Alexis Glaunès, François-David Collin, Ghislain Durif
- Abstract summary: The KeOps library provides a fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula.
KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption.
KeOps combines optimized C++/CUDA schemes with binders for high-level languages: Python (Numpy and PyTorch), Matlab and R.
- Score: 5.669790037378094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The KeOps library provides a fast and memory-efficient GPU support for
tensors whose entries are given by a mathematical formula, such as kernel and
distance matrices. KeOps alleviates the major bottleneck of tensor-centric
libraries for kernel and geometric applications: memory consumption. It also
supports automatic differentiation and outperforms standard GPU baselines,
including PyTorch CUDA tensors or the Halide and TVM libraries. KeOps combines
optimized C++/CUDA schemes with binders for high-level languages: Python (Numpy
and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" codes can
now scale up to large data sets with millions of samples processed in seconds.
KeOps brings graphics-like performances for kernel methods and is freely
available on standard repositories (PyPi, CRAN). To showcase its versatility,
we provide tutorials in a wide range of settings online at
www.kernel-operations.io.
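As an illustration of the interface described above, here is a minimal sketch of a Gaussian kernel matrix-vector product with a gradient, written against the LazyTensor interface of the library's Python (PyTorch) bindings, pykeops; sizes and variable names are illustrative only:

```python
import torch
from pykeops.torch import LazyTensor

device = "cuda" if torch.cuda.is_available() else "cpu"
N, M = 10**6, 10**6  # millions of points, in line with the abstract's claim
x = torch.randn(N, 3, device=device, requires_grad=True)
y = torch.randn(M, 3, device=device)
b = torch.randn(M, 1, device=device)

x_i = LazyTensor(x[:, None, :])   # (N, 1, 3) symbolic "i" variable
y_j = LazyTensor(y[None, :, :])   # (1, M, 3) symbolic "j" variable

D_ij = ((x_i - y_j) ** 2).sum(-1)  # symbolic (N, M) squared distances
K_ij = (-D_ij).exp()               # Gaussian kernel, never materialised

a = K_ij @ b                                   # kernel matrix-vector product, shape (N, 1)
grad_x = torch.autograd.grad(a.sum(), x)[0]    # autodiff through the symbolic reduction
```

The (N, M) kernel matrix is never stored: the reduction over j is streamed on the fly, which is what keeps the memory footprint linear in N + M rather than quadratic.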
Related papers
- Explore as a Storm, Exploit as a Raindrop: On the Benefit of Fine-Tuning Kernel Schedulers with Coordinate Descent [48.791943145735]
We show the potential to reduce Ansor's search time while enhancing kernel quality.
We apply this approach to the first 300 kernels that Ansor generates.
This result has been replicated in 20 well-known deep-learning models.
arXiv Detail & Related papers (2024-06-28T16:34:22Z) - iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations [1.3030767447016454]
iSpLib is a PyTorch-based C++ library equipped with auto-tuned sparse operations.
We demonstrate that iSpLib obtains up to 27x overall training speedup compared to the equivalent PyTorch 2.1.0 and PyTorch Geometric 2.4.0 implementations on the CPU.
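For context, GNN training of this kind is dominated by sparse-dense matrix products (SpMM); the snippet below sketches that operation with plain PyTorch sparse tensors rather than the iSpLib API (graph, sizes and values are made up for illustration):

```python
import torch

# Toy graph with 4 nodes; the adjacency matrix is stored in COO format.
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 3, 1]])
values = torch.ones(edge_index.shape[1])
adj = torch.sparse_coo_tensor(edge_index, values, (4, 4)).coalesce()

feats = torch.randn(4, 8)                  # one 8-d feature vector per node
aggregated = torch.sparse.mm(adj, feats)   # SpMM: sums each node's neighbour features
```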
arXiv Detail & Related papers (2024-03-21T21:56:44Z) - Snacks: a fast large-scale kernel SVM solver [0.8602553195689513]
Snacks is a new large-scale solver for Kernel Support Vector Machines.
Snacks relies on a Nyström approximation of the kernel matrix and an accelerated variant of the subgradient method.
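To make the first ingredient concrete, here is a minimal NumPy sketch of a Nyström feature map built from m landmark points; a linear SVM trained on these features (e.g. by subgradient descent on the hinge loss) then approximates the full kernel SVM. This is a generic illustration, not the Snacks implementation:

```python
import numpy as np

def nystrom_features(X, landmarks, gamma):
    """Approximate Gaussian-kernel features so that <phi(x), phi(y)> ~ k(x, y)."""
    d2_nm = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    K_nm = np.exp(-gamma * d2_nm)            # (n, m) cross-kernel block
    d2_mm = ((landmarks[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    K_mm = np.exp(-gamma * d2_mm)            # (m, m) landmark kernel block
    U, s, _ = np.linalg.svd(K_mm)            # whitening by K_mm^{-1/2}
    return K_nm @ (U / np.sqrt(np.maximum(s, 1e-12)))   # (n, m) features
```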
arXiv Detail & Related papers (2023-04-17T04:19:20Z) - PLSSVM: A (multi-)GPGPU-accelerated Least Squares Support Vector Machine [68.8204255655161]
Support Vector Machines (SVMs) are widely used in machine learning.
However, even modern and optimized implementations do not scale well for large non-trivial dense data sets on cutting-edge hardware.
PLSSVM can be used as a drop-in replacement for LIBSVM.
arXiv Detail & Related papers (2022-02-25T13:24:23Z) - Giga-scale Kernel Matrix Vector Multiplication on GPU [9.106412307976067]
Kernel matrix vector multiplication (KMVM) is a ubiquitous operation in machine learning and scientific computing, spanning from the kernel literature to signal processing.
We propose a novel approximation procedure coined Faster-Fast and Free Memory Method (F$^3$M) to address these scaling issues for KMVM.
We show that F$^3$M can compute a full KMVM for a billion points in under one minute on a high-end GPU, leading to a significant speed-up in comparison to existing CPU methods.
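For reference, the operation being approximated is $a_i = \sum_j k(x_i, y_j)\, b_j$; a blocked NumPy baseline (illustrative only, not F$^3$M) looks like this and already avoids storing the full N x M kernel matrix:

```python
import numpy as np

def kmvm_exact(x, y, b, gamma, block=4096):
    """Exact Gaussian KMVM: a[i] = sum_j exp(-gamma * |x[i] - y[j]|^2) * b[j]."""
    a = np.zeros(len(x))
    for start in range(0, len(y), block):       # stream over blocks of y
        yb = y[start:start + block]
        bb = b[start:start + block]
        d2 = ((x[:, None, :] - yb[None, :, :]) ** 2).sum(-1)   # (N, block)
        a += np.exp(-gamma * d2) @ bb
    return a
```

Approximation schemes such as F$^3$M aim to beat the O(NM) cost of this exact loop at a controlled loss of accuracy.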
arXiv Detail & Related papers (2022-02-02T15:28:15Z) - TensorLy-Quantum: Quantum Machine Learning with Tensor Methods [67.29221827422164]
We create a Python library for quantum circuit simulation that adopts the PyTorch API.
TensorLy-Quantum can scale to hundreds of qubits on a single GPU and thousands of qubits on multiple GPUs.
arXiv Detail & Related papers (2021-12-19T19:26:17Z) - VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average 3712$\times$ speedup with 1301.25$\times$ energy reduction on CPU, and 35.4$\times$ speedup with 17.66$\times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z) - Efficient Graph Deep Learning in TensorFlow with tf_geometric [53.237754811019464]
We introduce tf_geometric, an efficient and friendly library for graph deep learning.
tf_geometric provides kernel libraries for building Graph Neural Networks (GNNs) as well as implementations of popular GNNs.
The kernel libraries consist of infrastructures for building efficient GNNs, including graph data structures, graph map-reduce framework, graph mini-batch strategy, etc.
arXiv Detail & Related papers (2021-01-27T17:16:36Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z) - Kernel methods library for pattern analysis and machine learning in python [0.0]
The kernelmethods library fills that important void in the python ML ecosystem in a domain-agnostic fashion.
The library provides a number of well-defined classes to make various kernel-based operations efficient.
arXiv Detail & Related papers (2020-05-27T16:44:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.