Improved learning theory for kernel distribution regression with
two-stage sampling
- URL: http://arxiv.org/abs/2308.14335v1
- Date: Mon, 28 Aug 2023 06:29:09 GMT
- Title: Improved learning theory for kernel distribution regression with
two-stage sampling
- Authors: Fran\c{c}ois Bachoc and Louis B\'ethune and Alberto Gonz\'alez-Sanz
and Jean-Michel Loubes
- Abstract summary: Kernel methods have become a method of choice to tackle the distribution regression problem.
We introduce the novel near-unbiased condition on the Hilbertian embeddings, which enables us to provide new error bounds.
We show that this near-unbiased condition holds for three important classes of kernels, based on optimal transport and mean embedding.
- Score: 3.154269505086155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The distribution regression problem encompasses many important statistics and
machine learning tasks, and arises in a large range of applications. Among
various existing approaches to tackle this problem, kernel methods have become
a method of choice. Indeed, kernel distribution regression is both
computationally favorable, and supported by a recent learning theory. This
theory also tackles the two-stage sampling setting, where only samples from the
input distributions are available. In this paper, we improve the learning
theory of kernel distribution regression. We address kernels based on
Hilbertian embeddings, which encompass most, if not all, of the existing
approaches. We introduce the novel near-unbiased condition on the Hilbertian
embeddings, which enables us to provide new error bounds on the effect of the
two-stage sampling, thanks to a new analysis. We show that this near-unbiased
condition holds for three important classes of kernels, based on optimal
transport and mean embedding. As a consequence, we strictly improve the
existing convergence rates for these kernels. Our setting and results are
illustrated by numerical experiments.
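To make the setting concrete, below is a minimal sketch (my own illustration, not code from the paper) of two-stage sampled distribution regression with Gaussian mean embeddings: each input distribution is observed only through a bag of samples, the inner products between empirical mean embeddings are estimated from those bags, and kernel ridge regression is run on the resulting Gram matrix. The kernel choice, bandwidth `gamma`, and regularization `lam` are illustrative assumptions.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """Gaussian kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def embedding_gram(bags, gamma=1.0):
    """Gram matrix of inner products between empirical mean embeddings.

    <mu_hat_i, mu_hat_j> = mean of k(x, x') over sample pairs, the
    two-stage-sampled estimate of the embedding inner product."""
    n = len(bags)
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            G[i, j] = G[j, i] = rbf(bags[i], bags[j], gamma).mean()
    return G

def fit_predict(train_bags, y, test_bags, lam=1e-2, gamma=1.0):
    """Kernel ridge regression on distributions via their mean embeddings.

    Uses the linear kernel on embeddings, K(P_i, P_j) = <mu_i, mu_j>;
    other Hilbertian-embedding kernels would replace this Gram matrix."""
    K = embedding_gram(train_bags, gamma)
    alpha = np.linalg.solve(K + lam * len(y) * np.eye(len(y)), y)
    K_test = np.array([[rbf(t, b, gamma).mean() for b in train_bags]
                       for t in test_bags])
    return K_test @ alpha

# toy example: regress the mean of each Gaussian bag from its samples
rng = np.random.default_rng(0)
means = rng.uniform(-2, 2, size=40)
bags = [rng.normal(m, 1.0, size=(30, 1)) for m in means]
print(fit_predict(bags[:30], means[:30], bags[30:])[:5])
```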
Related papers
- Learning to Embed Distributions via Maximum Kernel Entropy [0.0]
Empirical data can often be considered as samples from a set of probability distributions.
Kernel methods have emerged as a natural approach for learning to classify these distributions.
We propose a novel objective for the unsupervised learning of a data-dependent distribution kernel.
arXiv Detail & Related papers (2024-08-01T13:34:19Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Variational Autoencoder Kernel Interpretation and Selection for Classification [59.30734371401315]
This work proposes kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable is sampled from the distribution associated with a single kernel of the encoder's last convolutional layer, as an individual distribution is created for each kernel.
Choosing relevant features on the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels.
arXiv Detail & Related papers (2022-09-10T17:22:53Z)
- Coefficient-based Regularized Distribution Regression [4.21768682940933]
We consider coefficient-based regularized distribution regression, which aims to regress from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS).
Asymptotic behaviors of the algorithm in different regularity ranges of the regression function are comprehensively studied.
We get the optimal rates under some mild conditions, which matches the one-stage sampled minimax optimal rate.
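A hedged sketch of the coefficient-based idea summarized above: the penalty is placed directly on the coefficient vector rather than on the RKHS norm used by kernel ridge regression. The Gram matrix `K` between input distributions (e.g. built from mean embeddings, as sketched earlier) is assumed given; the toy data are purely illustrative.

```python
import numpy as np

def coefficient_regularized_fit(K, y, lam=1e-2):
    """Coefficient-based regularized regression.

    Solves  min_alpha (1/n) * ||K @ alpha - y||^2 + lam * ||alpha||^2,
    i.e. the penalty is on the coefficients themselves rather than on
    the RKHS norm alpha @ K @ alpha used by kernel ridge regression.
    Normal equations: (K.T @ K + n * lam * I) alpha = K.T @ y."""
    n = len(y)
    A = K.T @ K + n * lam * np.eye(n)
    return np.linalg.solve(A, K.T @ y)

def predict(K_new, alpha):
    """K_new[i, j] = kernel between the i-th new distribution and the
    j-th training distribution."""
    return K_new @ alpha

# toy usage with a synthetic positive semi-definite Gram matrix
rng = np.random.default_rng(1)
F = rng.normal(size=(50, 5))          # stand-in embedding features
K = F @ F.T                           # Gram matrix between 50 "distributions"
y = F[:, 0] + 0.1 * rng.normal(size=50)
alpha = coefficient_regularized_fit(K, y)
print(predict(K[:3], alpha))
```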
arXiv Detail & Related papers (2022-08-26T03:46:14Z)
- Self-supervised learning with rotation-invariant kernels [4.059849656394191]
We propose a general kernel framework for designing a regularization loss that encourages the embedding distribution to be close to the uniform distribution on the hypersphere.
Our framework uses rotation-invariant kernels defined on the hypersphere, also known as dot-product kernels.
Our experiments demonstrate that using a truncated rotation-invariant kernel provides competitive results compared to state-of-the-art methods.
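A small sketch of the kind of regularization loss described, using the Gaussian kernel as a concrete stand-in for a rotation-invariant kernel (an assumption on my part): embeddings are normalized onto the unit hypersphere and the average pairwise kernel value is minimized, which spreads the embedding distribution toward uniform.

```python
import numpy as np

def uniformity_loss(Z, t=2.0):
    """Average pairwise rotation-invariant kernel on the unit hypersphere.

    Z: (n, d) array of embeddings; rows are first normalized to the sphere.
    On the sphere ||z_i - z_j||^2 = 2 - 2 <z_i, z_j>, so the Gaussian kernel
    exp(-t * ||z_i - z_j||^2) = exp(-2 t (1 - <z_i, z_j>)) depends only on
    the dot product, i.e. it is rotation-invariant. Minimizing its
    (log-)average over pairs pushes the embeddings toward uniformity."""
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    dots = Z @ Z.T
    iu = np.triu_indices(len(Z), k=1)              # distinct pairs only
    kernel_vals = np.exp(-2.0 * t * (1.0 - dots[iu]))
    return np.log(kernel_vals.mean())

rng = np.random.default_rng(0)
clustered = rng.normal(loc=3.0, scale=0.1, size=(256, 16))
spread = rng.normal(size=(256, 16))
print(uniformity_loss(clustered), uniformity_loss(spread))  # clustered >> spread
```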
arXiv Detail & Related papers (2022-07-28T08:06:24Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefits of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
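A hedged illustration of the effect summarized above, in a concrete setting of my choosing (kernel gradient descent on a least-squares objective): with a fixed early-stopping time, the learning rate determines how much of each eigen-direction of the kernel matrix gets fit.

```python
import numpy as np

def kernel_gd_fit_fraction(K, eta, steps):
    """Fraction of each eigen-direction of K fit by kernel gradient descent.

    Functional gradient descent on the least-squares risk updates the
    training predictions as u <- u - (eta / n) * K @ (u - y), so along the
    i-th eigenvector of K the residual shrinks by (1 - eta * lam_i / n)
    per step. With early stopping after `steps` iterations, the fitted
    fraction of that direction is 1 - (1 - eta * lam_i / n) ** steps."""
    n = K.shape[0]
    lam = np.clip(np.linalg.eigvalsh(K)[::-1], 0.0, None)   # largest first
    return 1.0 - (1.0 - eta * lam / n) ** steps

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
K = np.exp(-50.0 * (X - X.T) ** 2)             # Gaussian kernel matrix
for eta in (0.1, 0.9):                         # small vs large learning rate
    frac = kernel_gd_fit_fraction(K, eta, steps=20)
    print(f"eta={eta:3.1f}: top direction {frac[0]:.3f}, "
          f"20th direction {frac[19]:.3f}")
```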
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Optimal Rates of Distributed Regression with Imperfect Kernels [0.0]
We study distributed kernel regression via the divide-and-conquer approach.
We show that kernel ridge regression can achieve rates faster than $N^{-1}$ in the noise-free setting.
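A minimal sketch of the divide-and-conquer strategy referred to here (the kernel, partitioning, and hyperparameters are illustrative assumptions): the data are split into disjoint blocks, kernel ridge regression is solved on each block, and the local predictions are averaged.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def dc_krr_predict(X, y, X_test, n_blocks=4, lam=1e-2, gamma=1.0):
    """Divide-and-conquer kernel ridge regression.

    Each block solves its own KRR system (cost O((n / m)^3) per block
    instead of O(n^3) overall) and the block predictions are averaged."""
    preds = []
    for Xb, yb in zip(np.array_split(X, n_blocks), np.array_split(y, n_blocks)):
        K = rbf(Xb, Xb, gamma)
        alpha = np.linalg.solve(K + lam * len(yb) * np.eye(len(yb)), yb)
        preds.append(rbf(X_test, Xb, gamma) @ alpha)
    return np.mean(preds, axis=0)

# toy regression: recover sin(3x) from noisy samples
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=400)
X_test = np.linspace(-1, 1, 5)[:, None]
print(dc_krr_predict(X, y, X_test, gamma=5.0))
```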
arXiv Detail & Related papers (2020-06-30T13:00:16Z)
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
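A hedged sketch of the unbiased MMD^2 statistic at the core of such tests. The "deep kernel" is stood in for here by a fixed random feature map `phi` (a hypothetical placeholder); in the paper the features come from a neural network trained to maximize test power.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 32))          # fixed random "network" weights (stand-in)

def phi(X):
    """Stand-in feature extractor; a trained deep net would replace this."""
    return np.tanh(X @ W)

def gaussian_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased estimate of MMD^2 between samples X and Y under the
    kernel k(x, y) = gaussian_kernel(phi(x), phi(y))."""
    FX, FY = phi(X), phi(Y)
    Kxx = gaussian_kernel(FX, FX, gamma)
    Kyy = gaussian_kernel(FY, FY, gamma)
    Kxy = gaussian_kernel(FX, FY, gamma)
    n, m = len(X), len(Y)
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))   # drop diagonal
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2 * Kxy.mean()

X1 = rng.normal(0.0, 1.0, size=(200, 2))
X2 = rng.normal(0.0, 1.0, size=(200, 2))    # same distribution as X1
Y = rng.normal(0.5, 1.0, size=(200, 2))     # shifted distribution
print(mmd2_unbiased(X1, X2), mmd2_unbiased(X1, Y))  # ~0 vs clearly > 0
```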
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
- RFN: A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Spaces [14.924672048447334]
Large-scale finite-sum problems can be solved using efficient variants of Newton's method, where the Hessian is approximated via sub-samples of data.
In this paper, we observe that for this class of problems, one can naturally use kernel approximation to speed up the Newton method.
We provide a novel second-order algorithm that enjoys local superlinear convergence and global linear convergence.
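A small sketch of the general idea in a setting of my choosing (not the paper's algorithm): the Gaussian kernel is approximated with random Fourier features, and Newton steps are taken on the resulting regularized logistic-regression objective, whose Hessian is cheap to form in the low-dimensional feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, D=200, gamma=1.0):
    """Random Fourier features approximating exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def newton_logistic(Phi, y, lam=1e-3, iters=10):
    """Newton's method for L2-regularized logistic regression in feature space.

    Gradient: Phi.T @ (p - y) / n + lam * w
    Hessian:  Phi.T @ diag(p * (1 - p)) @ Phi / n + lam * I."""
    n, D = Phi.shape
    w = np.zeros(D)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Phi @ w))
        grad = Phi.T @ (p - y) / n + lam * w
        H = (Phi * (p * (1 - p))[:, None]).T @ Phi / n + lam * np.eye(D)
        w -= np.linalg.solve(H, grad)
    return w

# toy binary classification with an XOR-like decision boundary
X = rng.normal(size=(500, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)
Phi = random_fourier_features(X, D=200, gamma=1.0)
w = newton_logistic(Phi, y)
print("train accuracy:", (((Phi @ w) > 0) == (y > 0.5)).mean())
```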
arXiv Detail & Related papers (2020-02-12T01:14:44Z)