Wasserstein Exponential Kernels
- URL: http://arxiv.org/abs/2002.01878v1
- Date: Wed, 5 Feb 2020 17:31:56 GMT
- Title: Wasserstein Exponential Kernels
- Authors: Henri De Plaen, Michaël Fanuel and Johan A. K. Suykens
- Abstract summary: We study the use of exponential kernels defined thanks to the regularized Wasserstein distance.
We show that Wasserstein squared exponential kernels yield smaller classification errors on small training sets of shapes.
- Score: 13.136143245702915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the context of kernel methods, the similarity between data points is
encoded by the kernel function which is often defined thanks to the Euclidean
distance, a common example being the squared exponential kernel. Recently,
other distances relying on optimal transport theory - such as the Wasserstein
distance between probability distributions - have shown their practical
relevance for different machine learning techniques. In this paper, we study
the use of exponential kernels defined thanks to the regularized Wasserstein
distance and discuss their positive definiteness. More specifically, we define
Wasserstein feature maps and illustrate their interest for supervised learning
problems involving shapes and images. Empirically, Wasserstein squared
exponential kernels are shown to yield smaller classification errors on small
training sets of shapes, compared to analogous classifiers using Euclidean
distances.
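As a rough illustration of the kernel described in the abstract, the following minimal sketch (not the authors' code) builds a Wasserstein squared exponential kernel between two grayscale images treated as probability distributions over pixel locations. It assumes the POT library (`pip install pot`); the regularization strength and bandwidth are illustrative choices, and the sketch does not address the positive-definiteness question the paper discusses.
```python
import numpy as np
import ot  # POT: Python Optimal Transport


def wasserstein_se_kernel(img_a, img_b, reg=1e-2, sigma=1.0):
    """exp(-W_reg^2 / (2 * sigma^2)) for two 2-D arrays of nonnegative mass."""
    # Normalize the images into probability vectors over pixel locations.
    a = img_a.ravel().astype(float)
    b = img_b.ravel().astype(float)
    a, b = a / a.sum(), b / b.sum()

    # Ground cost: squared Euclidean distance between pixel coordinates.
    h, w = img_a.shape
    grid = np.array([(i, j) for i in range(h) for j in range(w)], dtype=float)
    M = ot.dist(grid, grid)  # 'sqeuclidean' by default

    # Entropy-regularized transport cost, playing the role of W_2^2.
    cost = ot.sinkhorn2(a, b, M, reg)
    return float(np.exp(-cost / (2.0 * sigma ** 2)))
```
Evaluating this kernel over all pairs of a training set gives the Gram matrix that a standard kernel classifier (e.g., an SVM) would consume.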
Related papers
- Wasserstein-based Kernels for Clustering: Application to Power Distribution Graphs [0.0]
This work explores kernel methods and Wasserstein distance metrics to develop a computationally tractable clustering framework.
The framework is flexible enough to be applied in various domains, such as graph analysis and image processing.
A case study involving two datasets of 879 and 34,920 power distribution graphs demonstrates the framework's effectiveness and efficiency.
arXiv Detail & Related papers (2025-03-18T15:40:55Z)
- On the Approximation of Kernel functions [0.0]
The paper addresses approximations of the kernel itself.
For the Hilbert Gauss kernel on the unit cube, the paper establishes an upper bound on the associated eigenfunctions.
This improvement supports low-rank approximation methods such as the Nyström method.
arXiv Detail & Related papers (2024-03-11T13:50:07Z)
- Gaussian Process regression over discrete probability measures: on the non-stationarity relation between Euclidean and Wasserstein Squared Exponential Kernels [0.19116784879310028]
A non-stationarity relationship between the Wasserstein-based squared exponential kernel and its Euclidean-based counterpart is studied.
A transformation of the Euclidean input space is used to obtain a non-stationary, Wasserstein-based Gaussian Process model.
arXiv Detail & Related papers (2022-12-02T17:09:52Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefit of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Kernel distance measures for time series, random fields and other structured data [71.61147615789537]
kdiff is a novel kernel-based measure for estimating distances between instances of structured data.
It accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution.
Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems.
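The summary does not spell out how kdiff combines the self and cross similarities, so the following is only a hypothetical sketch of a quantile-based dissimilarity in that spirit; the quantile level and the way the self terms are subtracted are assumptions, not the paper's definition.
```python
# Hypothetical sketch only -- NOT the paper's definition of kdiff.
# It takes a lower quantile of cross distances between two sample sets and
# offsets it by the corresponding self-distance quantiles.
import numpy as np
from scipy.spatial.distance import cdist


def quantile_dissimilarity(X, Y, q=0.1):
    cross = np.quantile(cdist(X, Y), q)
    dxx = cdist(X, X)[~np.eye(len(X), dtype=bool)]   # drop the zero diagonal
    dyy = cdist(Y, Y)[~np.eye(len(Y), dtype=bool)]
    self_term = 0.5 * (np.quantile(dxx, q) + np.quantile(dyy, q))
    return cross - self_term
```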
arXiv Detail & Related papers (2021-09-29T22:54:17Z)
- Depth-based pseudo-metrics between probability distributions [1.1470070927586016]
We propose two new pseudo-metrics between continuous probability measures based on data depth and its associated central regions.
In contrast to the Wasserstein distance, the proposed pseudo-metrics do not suffer from the curse of dimensionality.
The regions-based pseudo-metric appears to be robust w.r.t. both outliers and heavy tails.
arXiv Detail & Related papers (2021-03-23T17:33:18Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample-based bidirectional learning algorithm for training.
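For context, the dynamic formulation referred to above is the classical Benamou-Brenier problem; a sketch of how a Lagrange multiplier turns it into a minimax problem (the paper's exact parametrization may differ) is
\[
W_2^2(\mu_0,\mu_1)\;=\;\min_{\rho,\,v}\;\int_0^1\!\!\int \rho_t(x)\,\|v_t(x)\|^2\,dx\,dt
\qquad \text{s.t.}\quad \partial_t\rho_t+\nabla\!\cdot(\rho_t v_t)=0,\;\;\rho_0=\mu_0,\;\rho_1=\mu_1 .
\]
Introducing a multiplier \(\phi(t,x)\) for the continuity constraint gives the saddle-point problem
\[
\min_{\rho,\,v}\;\max_{\phi}\;\int_0^1\!\!\int \Big[\,\rho_t\|v_t\|^2+\phi\,\big(\partial_t\rho_t+\nabla\!\cdot(\rho_t v_t)\big)\Big]\,dx\,dt ,
\]
whose saddle point yields the geodesic \((\rho_t)_{t\in[0,1]}\) between \(\mu_0\) and \(\mu_1\).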
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
- Learning interaction kernels in mean-field equations of 1st-order systems of interacting particles [1.776746672434207]
We introduce a nonparametric algorithm to learn interaction kernels of mean-field equations for 1st-order systems of interacting particles.
By least squares with regularization, the algorithm learns the kernel on data-adaptive hypothesis spaces efficiently.
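As a generic illustration of the estimation principle (not the paper's algorithm), suppose the unknown interaction kernel is expanded on a finite basis; fitting its coefficients by regularized least squares is then a ridge regression, sketched below with an assumed design matrix of basis evaluations.
```python
# Generic regularized least-squares sketch: phi(r) ~= sum_k c_k * b_k(r).
# B holds evaluations of the basis functions against the data; y holds the
# observed quantities the interaction term should explain. Illustration only.
import numpy as np


def fit_kernel_coefficients(B, y, lam=1e-3):
    """Solve min_c ||B c - y||^2 + lam * ||c||^2."""
    n_basis = B.shape[1]
    A = B.T @ B + lam * np.eye(n_basis)
    return np.linalg.solve(A, B.T @ y)
```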
arXiv Detail & Related papers (2020-10-29T15:37:17Z)
- The role of feature space in atomistic learning [62.997667081978825]
Physically-inspired descriptors play a key role in the application of machine-learning techniques to atomistic simulations.
We introduce a framework to compare different sets of descriptors, and different ways of transforming them by means of metrics and kernels.
We compare representations built in terms of n-body correlations of the atom density, quantitatively assessing the information loss associated with the use of low-order features.
arXiv Detail & Related papers (2020-09-06T14:12:09Z)
- Neural Operator: Graph Kernel Network for Partial Differential Equations [57.90284928158383]
The goal of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators).
We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators.
Experiments confirm that the proposed graph kernel network does have the desired properties and show competitive performance compared to the state of the art solvers.
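Schematically, one layer of such a kernel network updates a function \(v_t\) on the domain \(D\) by a learned kernel integral operator followed by a pointwise nonlinearity,
\[
v_{t+1}(x)\;=\;\sigma\Big(W\,v_t(x)\;+\;\int_{D}\kappa_\theta\big(x,y,a(x),a(y)\big)\,v_t(y)\,dy\Big),
\]
where \(\sigma\) is a nonlinearity, \(W\) a local linear map, \(a\) the input coefficient function, and \(\kappa_\theta\) a kernel parametrized by a neural network; in the graph version the integral is approximated by a sum over sampled neighbours of \(x\).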
arXiv Detail & Related papers (2020-03-07T01:56:20Z)
- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
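A minimal sketch of the Nyström approximation the summary refers to, assuming an RBF kernel; the column subset is drawn uniformly at random here for simplicity, whereas the paper analyses column subset selection with guarantees.
```python
# Minimal Nystrom sketch with an RBF kernel; illustration, not the paper's
# column subset selection procedure.
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def nystrom_approximation(X, m, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    S = rng.choice(len(X), size=m, replace=False)    # random column subset
    C = rbf_kernel(X, X[S], gamma)                   # n x m block of K
    W = rbf_kernel(X[S], X[S], gamma)                # m x m block of K
    return C @ np.linalg.pinv(W) @ C.T               # rank-m approximation
```
Varying `gamma` (the RBF parameter) is the knob the summary mentions for observing the improved bounds and the multiple-descent behaviour empirically.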
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
- Schoenberg-Rao distances: Entropy-based and geometry-aware statistical Hilbert distances [12.729120803225065]
We study a class of statistical Hilbert distances that we term the Schoenberg-Rao distances.
We derive novel closed-form distances between mixtures of Gaussian distributions.
Our method constitutes a practical alternative to Wasserstein distances and we illustrate its efficiency on a broad range of machine learning tasks.
arXiv Detail & Related papers (2020-02-19T18:48:33Z)