Transfer Operators from Batches of Unpaired Points via Entropic
Transport Kernels
- URL: http://arxiv.org/abs/2402.08425v1
- Date: Tue, 13 Feb 2024 12:52:41 GMT
- Title: Transfer Operators from Batches of Unpaired Points via Entropic
Transport Kernels
- Authors: Florian Beier, Hancheng Bi, Clément Sarrazin, Bernhard Schmitzer,
Gabriele Steidl
- Abstract summary: We derive a maximum-likelihood inference functional, propose a computationally tractable approximation and analyze their properties.
We prove a $\Gamma$-convergence result showing that we can recover the true density from empirical approximations as the number $N$ of blocks goes to infinity.
- Score: 3.099885205621181
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we are concerned with estimating the joint probability of
random variables $X$ and $Y$, given $N$ independent observation blocks
$(\boldsymbol{x}^i,\boldsymbol{y}^i)$, $i=1,\ldots,N$, each of $M$ samples
$(\boldsymbol{x}^i,\boldsymbol{y}^i) = \bigl((x^i_j, y^i_{\sigma^i(j)})
\bigr)_{j=1}^M$, where $\sigma^i$ denotes an unknown permutation of i.i.d.
sampled pairs $(x^i_j,y_j^i)$, $j=1,\ldots,M$. This means that the internal
ordering of the $M$ samples within an observation block is not known. We derive
a maximum-likelihood inference functional, propose a computationally tractable
approximation and analyze their properties. In particular, we prove a
$\Gamma$-convergence result showing that we can recover the true density from
empirical approximations as the number $N$ of blocks goes to infinity. Using
entropic optimal transport kernels, we model a class of hypothesis spaces of
density functions over which the inference functional can be minimized. This
hypothesis class is particularly suited for approximate inference of transfer
operators from data. We solve the resulting discrete minimization problem by a
modification of the EMML algorithm to take additional transition probability
constraints into account and prove the convergence of this algorithm.
Proof-of-concept examples demonstrate the potential of our method.
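The two computational building blocks named in the abstract can be sketched in a few lines. The following Python snippet is a generic illustration, not the authors' implementation: `sinkhorn_plan` computes an entropic optimal transport plan between discrete marginals (the basic object behind the "entropic transport kernels" that parameterize the hypothesis class), and `emml` runs the plain, unmodified EMML multiplicative updates for a nonnegative KL-fitting problem; the paper's actual solver adds transition-probability constraints that are omitted here. All function names, toy data, and parameter choices (`eps`, iteration counts) are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(mu, nu, C, eps=0.5, n_iter=1000):
    """Entropic optimal transport plan between discrete marginals mu and nu
    for cost matrix C, via Sinkhorn's alternating scaling iterations."""
    K = np.exp(-C / eps)                 # Gibbs kernel of the cost
    u, v = np.ones_like(mu), np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)                 # rescale to match the row marginal
        v = nu / (K.T @ u)               # rescale to match the column marginal
    return u[:, None] * K * v[None, :]   # plan pi = diag(u) K diag(v)

def emml(A, b, n_iter=300):
    """Plain EMML multiplicative updates for the KL fit b ~ A x with x >= 0.
    (The paper's solver adds transition-probability constraints to this.)"""
    x = np.full(A.shape[1], 1.0 / A.shape[1])
    col_sums = A.sum(axis=0)
    for _ in range(n_iter):
        x *= (A.T @ (b / (A @ x))) / col_sums
    return x

# entropic plan between uniform marginals on 3 and 4 fixed points
x_pts = np.array([[0.0], [0.5], [1.0]])
y_pts = np.array([[0.0], [0.3], [0.6], [1.0]])
C = (x_pts - y_pts.T) ** 2               # squared-distance cost, values in [0, 1]
mu, nu = np.full(3, 1 / 3), np.full(4, 1 / 4)
pi = sinkhorn_plan(mu, nu, C)            # marginals of pi converge to mu, nu

# EMML recovers mixture weights from exact observations b = A @ [0.4, 0.6]
A = np.array([[0.7, 0.2], [0.2, 0.3], [0.1, 0.5]])   # column-stochastic
b = A @ np.array([0.4, 0.6])
x_hat = emml(A, b)                        # A @ x_hat approaches b
```

Note that with a column-stochastic `A` and `b` on the probability simplex, the EMML update preserves the simplex, which is what makes it a natural starting point for the constrained, likelihood-based minimization described above.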
Related papers
- Testing Dependency of Unlabeled Databases [5.384630221560811]
We test whether two random databases $\mathsf{X}\in\mathcal{X}^{n\times d}$ and $\mathsf{Y}\in\mathcal{Y}^{n\times d}$ are statistically dependent.
We characterize the thresholds at which optimal testing is information-theoretically impossible and possible.
arXiv Detail & Related papers (2023-11-10T05:17:03Z) - Estimation and Inference in Distributional Reinforcement Learning [28.253677740976197]
We show that a dataset of size $\widetilde{O}\left(\frac{|\mathcal{S}||\mathcal{A}|}{\epsilon^2 (1-\gamma)^4}\right)$ suffices to ensure the Kolmogorov metric and total variation metric between $\hat{\eta}^\pi$ and $\eta^\pi$ are below $\epsilon$ with high probability.
Our findings give rise to a unified approach to statistical inference of a wide class of statistical functionals of $\eta^\pi$.
arXiv Detail & Related papers (2023-09-29T14:14:53Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study learning a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\cdot\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random technique, called Ternary Random Feature (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - From Smooth Wasserstein Distance to Dual Sobolev Norm: Empirical
Approximation and Statistical Applications [18.618590805279187]
We show that $\mathsf{W}_p^{(\sigma)}$ is controlled by a $p$th-order smooth dual Sobolev norm $\mathsf{d}_p^{(\sigma)}$.
We derive the limit distribution of $\sqrt{n}\,\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n,\mu)$ in all dimensions $d$, when $\mu$ is sub-Gaussian.
arXiv Detail & Related papers (2021-01-11T17:23:24Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian $\mathcal{N}(0, \Sigma)$ with unknown covariance.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Learning and Sampling of Atomic Interventions from Observations [11.522442415989818]
We study the problem of efficiently estimating the effect of an intervention on a single variable (atomic interventions) using observational samples in a causal Bayesian network.
Our goal is to give algorithms that are efficient in both time and sample complexity in a non-parametric setting.
arXiv Detail & Related papers (2020-02-11T07:15:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.