Related papers: Distribution Regression with Sliced Wasserstein Kernels

URL: http://arxiv.org/abs/2202.03926v1
Date: Tue, 8 Feb 2022 15:21:56 GMT
Title: Distribution Regression with Sliced Wasserstein Kernels
Authors: Dimitri Meunier, Massimiliano Pontil and Carlo Ciliberto
Abstract summary: We propose the first OT-based estimator for distribution regression. We study the theoretical properties of a kernel ridge regression estimator based on such a representation.
Score: 45.916342378789174
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The problem of learning functions over spaces of probabilities - or distribution regression - is gaining significant interest in the machine learning community. A key challenge behind this problem is to identify a suitable representation capturing all relevant properties of the underlying functional mapping. A principled approach to distribution regression is provided by kernel mean embeddings, which lift kernel-induced similarity on the input domain to the probability level. This strategy effectively tackles the two-stage sampling nature of the problem, enabling one to derive estimators with strong statistical guarantees, such as universal consistency and excess risk bounds. However, kernel mean embeddings implicitly hinge on the maximum mean discrepancy (MMD), a metric on probabilities, which may fail to capture key geometrical relations between distributions. In contrast, optimal transport (OT) metrics are potentially more appealing, as documented by the recent literature on the topic. In this work, we propose the first OT-based estimator for distribution regression. We build on the Sliced Wasserstein distance to obtain an OT-based representation. We study the theoretical properties of a kernel ridge regression estimator based on such a representation, for which we prove universal consistency and excess risk bounds. Preliminary experiments complement our theoretical findings by showing the effectiveness of the proposed approach and comparing it with MMD-based estimators.
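
The pipeline described in the abstract (a Sliced Wasserstein representation followed by kernel ridge regression) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Monte Carlo approximation with a fixed set of random directions, the equal-sample-size assumption, the kernel form exp(-gamma * SW_2^2), the `n * lam` ridge scaling, and the toy Gaussian data are all assumptions made here for illustration, and every function name is hypothetical.

```python
import numpy as np


def sliced_wasserstein_sq(x, y, directions):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    x, y: arrays of shape (n, d) with the same number of samples n (assumed).
    directions: array of shape (m, d) whose rows are unit vectors.
    """
    # Project both samples onto every direction, then sort each 1D projection.
    px = np.sort(x @ directions.T, axis=0)
    py = np.sort(y @ directions.T, axis=0)
    # In 1D, optimal transport between equal-size samples pairs sorted points.
    return float(np.mean((px - py) ** 2))


def sw_gram(bags_a, bags_b, directions, gamma):
    """Gram matrix of the kernel exp(-gamma * SW_2^2) between two lists of bags."""
    gram = np.empty((len(bags_a), len(bags_b)))
    for i, a in enumerate(bags_a):
        for j, b in enumerate(bags_b):
            gram[i, j] = np.exp(-gamma * sliced_wasserstein_sq(a, b, directions))
    return gram


def fit_krr(gram, targets, lam):
    """Solve (K + n * lam * I) alpha = y (one common ridge scaling, assumed here)."""
    n = gram.shape[0]
    return np.linalg.solve(gram + n * lam * np.eye(n), targets)


rng = np.random.default_rng(0)
d, n_samples, n_bags, n_dirs = 2, 50, 40, 20

# Random unit directions, shared by every distance evaluation.
directions = rng.normal(size=(n_dirs, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Toy task (hypothetical): each bag is sampled from N(mu, I); the label is ||mu||.
means = rng.uniform(-2.0, 2.0, size=(n_bags, d))
bags = [mu + rng.normal(size=(n_samples, d)) for mu in means]
labels = np.linalg.norm(means, axis=1)

# Fit on the first 30 bags and predict the labels of the remaining 10.
train, test = slice(0, 30), slice(30, None)
K_train = sw_gram(bags[train], bags[train], directions, gamma=0.5)
alpha = fit_krr(K_train, labels[train], lam=1e-3)
K_test = sw_gram(bags[test], bags[train], directions, gamma=0.5)
predictions = K_test @ alpha
```

Sharing one set of directions across all pairs keeps the approximate distance a squared Euclidean distance between sorted-projection features, so the resulting Gram matrix stays positive semidefinite in this sketch. The values of `gamma` and `lam` are arbitrary and would normally be chosen by cross-validation.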

Related papers

Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings [24.07815507403025]
Estimating the distribution of outcomes under counterfactual policies is critical for decision-making in domains such as recommendation, advertising, and healthcare. We analyze a novel framework, Counterfactual Policy Mean Embedding (CPME), that represents the entire counterfactual outcome distribution in a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2025-06-03T12:16:46Z)
Risk Bounds For Distributional Regression [9.92024586772767]
General upper bounds are established for the continuous ranked probability score (CRPS) and the worst-case mean squared error (MSE) across the domain. Experiments on both simulated and real data validate the theoretical contributions, demonstrating their practical effectiveness.
arXiv Detail & Related papers (2025-05-14T02:22:12Z)
Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances [2.317910166616341]
In statistical inference, we commonly assume that samples are independent and identically distributed from a probability distribution. In this paper, we explore the application of GAN-like estimators to a general class of statistical models.
arXiv Detail & Related papers (2024-10-28T05:50:47Z)
Domain Generalization with Small Data [27.040070085669086]
We learn a domain-invariant representation based on the probabilistic framework by mapping each data point into probabilistic embeddings. Our proposed method can marry the measurement on the distribution over distributions (i.e., the global perspective alignment) and the distribution-based contrastive semantic alignment.
arXiv Detail & Related papers (2024-02-09T02:59:08Z)
Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers. We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art. In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
Nonparametric logistic regression with deep learning [1.0589208420411012]
In the nonparametric logistic regression, the Kullback-Leibler divergence could diverge easily. Instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator. As an important application, we derive convergence rates of the NPMLE with fully connected deep neural networks.
arXiv Detail & Related papers (2024-01-23T04:31:49Z)
Distributionally Robust Skeleton Learning of Discrete Bayesian Networks [9.46389554092506]
We consider the problem of learning the exact skeleton of general discrete Bayesian networks from potentially corrupted data. We propose to optimize the most adverse risk over a family of distributions within bounded Wasserstein distance or KL divergence to the empirical distribution. We present efficient algorithms and show the proposed methods are closely related to the standard regularized regression approach.
arXiv Detail & Related papers (2023-11-10T15:33:19Z)
Online Bootstrap Inference with Nonconvex Stochastic Gradient Descent Estimator [0.0]
In this paper, we investigate the theoretical properties of stochastic gradient descent (SGD) for statistical inference in the context of nonconvex problems. We propose two inferential procedures for objectives which may contain multiple minima.
arXiv Detail & Related papers (2023-06-03T22:08:10Z)
Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions. We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence. We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems. Our work extends these to robust mean estimation, second moment estimation, and robust linear regression. In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z)
Keep it Tighter -- A Story on Analytical Mean Embeddings [0.6445605125467574]
Kernel techniques are among the most popular and flexible approaches in data science. Mean embedding gives rise to a divergence measure referred to as maximum mean discrepancy (MMD). In this paper we focus on the problem of MMD estimation when the mean embedding of one of the underlying distributions is available analytically.
arXiv Detail & Related papers (2021-10-15T21:29:27Z)
General stochastic separation theorems with optimal bounds [68.8204255655161]
The phenomenon of separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and analyze AI instabilities. Errors or clusters of errors can be separated from the rest of the data. The ability to correct an AI system also opens up the possibility of an attack on it, and the high dimensionality induces vulnerabilities caused by the same separability.
arXiv Detail & Related papers (2020-10-11T13:12:41Z)
Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models. We provide a unifying view of these estimators under the framework of regularized nonparametric regression. We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.