Kernel Operations on the GPU, with Autodiff, without Memory Overflows
- URL: http://arxiv.org/abs/2004.11127v2
- Date: Thu, 8 Apr 2021 12:36:50 GMT
- Title: Kernel Operations on the GPU, with Autodiff, without Memory Overflows
- Authors: Benjamin Charlier, Jean Feydy, Joan Alexis Glaunès, François-David Collin, Ghislain Durif
- Abstract summary: The KeOps library provides a fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula.
KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption.
KeOps combines optimized C++/CUDA schemes with binders for high-level languages: Python (Numpy and PyTorch), Matlab and R.
- Score: 5.669790037378094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The KeOps library provides a fast and memory-efficient GPU support for
tensors whose entries are given by a mathematical formula, such as kernel and
distance matrices. KeOps alleviates the major bottleneck of tensor-centric
libraries for kernel and geometric applications: memory consumption. It also
supports automatic differentiation and outperforms standard GPU baselines,
including PyTorch CUDA tensors or the Halide and TVM libraries. KeOps combines
optimized C++/CUDA schemes with binders for high-level languages: Python (Numpy
and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" codes can
now scale up to large data sets with millions of samples processed in seconds.
KeOps brings graphics-like performances for kernel methods and is freely
available on standard repositories (PyPi, CRAN). To showcase its versatility,
we provide tutorials in a wide range of settings online at
www.kernel-operations.io.
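As an illustration of the interface described above, here is a minimal sketch of a Gaussian kernel matrix-vector product with a gradient, written against the LazyTensor interface of the library's Python (PyTorch) bindings, pykeops; sizes and variable names are illustrative only:

```python
import torch
from pykeops.torch import LazyTensor

device = "cuda" if torch.cuda.is_available() else "cpu"
N, M = 10**6, 10**6  # millions of points, in line with the abstract's claim
x = torch.randn(N, 3, device=device, requires_grad=True)
y = torch.randn(M, 3, device=device)
b = torch.randn(M, 1, device=device)

x_i = LazyTensor(x[:, None, :])   # (N, 1, 3) symbolic "i" variable
y_j = LazyTensor(y[None, :, :])   # (1, M, 3) symbolic "j" variable

D_ij = ((x_i - y_j) ** 2).sum(-1)  # symbolic (N, M) squared distances
K_ij = (-D_ij).exp()               # Gaussian kernel, never materialised

a = K_ij @ b                                   # kernel matrix-vector product, shape (N, 1)
grad_x = torch.autograd.grad(a.sum(), x)[0]    # autodiff through the symbolic reduction
```

The (N, M) kernel matrix is never stored: the reduction over j is streamed on the fly, which is what keeps the memory footprint linear in N + M rather than quadratic.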
Related papers
- Explore as a Storm, Exploit as a Raindrop: On the Benefit of Fine-Tuning Kernel Schedulers with Coordinate Descent [48.791943145735]
We show the potential to reduce Ansor's search time while enhancing kernel quality.
We apply this approach to the first 300 kernels that Ansor generates.
This result has been replicated in 20 well-known deep-learning models.
arXiv Detail & Related papers (2024-06-28T16:34:22Z) - iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations [1.3030767447016454]
iSpLib is a PyTorch-based C++ library equipped with auto-tuned sparse operations.
We demonstrate that iSpLib obtains up to 27x overall training speedup compared to the equivalent PyTorch 2.1.0 and PyTorch Geometric 2.4.0 implementations on the CPU.
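For context, GNN training of this kind is dominated by sparse-dense matrix products (SpMM); the snippet below sketches that operation with plain PyTorch sparse tensors rather than the iSpLib API (graph, sizes and values are made up for illustration):

```python
import torch

# Toy graph with 4 nodes; the adjacency matrix is stored in COO format.
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 3, 1]])
values = torch.ones(edge_index.shape[1])
adj = torch.sparse_coo_tensor(edge_index, values, (4, 4)).coalesce()

feats = torch.randn(4, 8)                  # one 8-d feature vector per node
aggregated = torch.sparse.mm(adj, feats)   # SpMM: sums each node's neighbour features
```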
arXiv Detail & Related papers (2024-03-21T21:56:44Z) - Snacks: a fast large-scale kernel SVM solver [0.8602553195689513]
Snacks is a new large-scale solver for Kernel Support Vector Machines.
Snacks relies on a Nyström approximation of the kernel matrix and an accelerated variant of the subgradient method.
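To make the first ingredient concrete, here is a minimal NumPy sketch of a Nyström feature map built from m landmark points; a linear SVM trained on these features (e.g. by subgradient descent on the hinge loss) then approximates the full kernel SVM. This is a generic illustration, not the Snacks implementation:

```python
import numpy as np

def nystrom_features(X, landmarks, gamma):
    """Approximate Gaussian-kernel features so that <phi(x), phi(y)> ~ k(x, y)."""
    d2_nm = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    K_nm = np.exp(-gamma * d2_nm)            # (n, m) cross-kernel block
    d2_mm = ((landmarks[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    K_mm = np.exp(-gamma * d2_mm)            # (m, m) landmark kernel block
    U, s, _ = np.linalg.svd(K_mm)            # whitening by K_mm^{-1/2}
    return K_nm @ (U / np.sqrt(np.maximum(s, 1e-12)))   # (n, m) features
```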
arXiv Detail & Related papers (2023-04-17T04:19:20Z) - PLSSVM: A (multi-)GPGPU-accelerated Least Squares Support Vector Machine [68.8204255655161]
Support Vector Machines (SVMs) are widely used in machine learning.
However, even modern and optimized implementations do not scale well for large non-trivial dense data sets on cutting-edge hardware.
PLSSVM can be used as a drop-in replacement for LIBSVM.
arXiv Detail & Related papers (2022-02-25T13:24:23Z) - Giga-scale Kernel Matrix Vector Multiplication on GPU [9.106412307976067]
Kernel matrix vector multiplication (KMVM) is a ubiquitous operation in machine learning and scientific computing, spanning from the kernel literature to signal processing.
We propose a novel approximation procedure coined Faster-Fast and Free Memory Method (F$^3$M) to address these scaling issues for KMVM.
We show that F$^3$M can compute a full KMVM for a billion points in under one minute on a high-end GPU, leading to a significant speed-up in comparison to existing CPU methods.
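For reference, the operation being approximated is $a_i = \sum_j k(x_i, y_j)\, b_j$; a blocked NumPy baseline (illustrative only, not F$^3$M) looks like this and already avoids storing the full N x M kernel matrix:

```python
import numpy as np

def kmvm_exact(x, y, b, gamma, block=4096):
    """Exact Gaussian KMVM: a[i] = sum_j exp(-gamma * |x[i] - y[j]|^2) * b[j]."""
    a = np.zeros(len(x))
    for start in range(0, len(y), block):       # stream over blocks of y
        yb = y[start:start + block]
        bb = b[start:start + block]
        d2 = ((x[:, None, :] - yb[None, :, :]) ** 2).sum(-1)   # (N, block)
        a += np.exp(-gamma * d2) @ bb
    return a
```

Approximation schemes such as F$^3$M aim to beat the O(NM) cost of this exact loop at a controlled loss of accuracy.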
arXiv Detail & Related papers (2022-02-02T15:28:15Z) - TensorLy-Quantum: Quantum Machine Learning with Tensor Methods [67.29221827422164]
We create a Python library for quantum circuit simulation that adopts the PyTorch API.
TensorLy-Quantum can scale to hundreds of qubits on a single GPU and thousands of qubits on multiple GPUs.
arXiv Detail & Related papers (2021-12-19T19:26:17Z) - VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average 3712$\times$ speedup with 1301.25$\times$ energy reduction on CPU, and 35.4$\times$ speedup with 17.66$\times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z) - Efficient Graph Deep Learning in TensorFlow with tf_geometric [53.237754811019464]
We introduce tf_geometric, an efficient and friendly library for graph deep learning.
tf_geometric provides kernel libraries for building Graph Neural Networks (GNNs) as well as implementations of popular GNNs.
The kernel libraries consist of infrastructures for building efficient GNNs, including graph data structures, graph map-reduce framework, graph mini-batch strategy, etc.
arXiv Detail & Related papers (2021-01-27T17:16:36Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z) - Kernel methods library for pattern analysis and machine learning in python [0.0]
The kernelmethods library fills that important void in the python ML ecosystem in a domain-agnostic fashion.
The library provides a number of well-defined classes to make various kernel-based operations efficient.
arXiv Detail & Related papers (2020-05-27T16:44:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.