Covariate shift in nonparametric regression with Markovian design
- URL: http://arxiv.org/abs/2307.08517v1
- Date: Mon, 17 Jul 2023 14:24:27 GMT
- Title: Covariate shift in nonparametric regression with Markovian design
- Authors: Lukas Trottner
- Abstract summary: We show that, under Hölder smoothness assumptions on the regression function, convergence rates for the generalization risk of a Nadaraya-Watson kernel estimator are determined by the similarity between the invariant distributions associated with the source and target Markov chains.
We extend the notion of a distribution transfer exponent from Kpotufe and Martinet to kernel transfer exponents of uniformly ergodic Markov chains.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Covariate shift in regression problems and the associated distribution
mismatch between training and test data is a commonly encountered phenomenon in
machine learning. In this paper, we extend recent results on nonparametric
convergence rates for i.i.d. data to Markovian dependence structures. We
demonstrate that under Hölder smoothness assumptions on the regression
function, convergence rates for the generalization risk of a Nadaraya-Watson
kernel estimator are determined by the similarity between the invariant
distributions associated to source and target Markov chains. The similarity is
explicitly captured in terms of a bandwidth-dependent similarity measure
recently introduced in Pathak, Ma and Wainwright [ICML, 2022]. Precise
convergence rates are derived for the particular cases of finite Markov chains
and spectral gap Markov chains for which the similarity measure between their
invariant distributions grows polynomially with decreasing bandwidth. For the
latter, we extend the notion of a distribution transfer exponent from Kpotufe
and Martinet [Ann. Stat., 49(6), 2021] to kernel transfer exponents of
uniformly ergodic Markov chains in order to generate a rich class of Markov
kernel pairs for which convergence guarantees for the covariate shift problem
can be formulated.
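As a point of reference, the following is a minimal sketch (not the paper's code) of a Nadaraya-Watson estimator with a box kernel, fitted on covariates generated by a source Markov chain and queried at points drawn from a shifted target distribution. The AR(1) chain, the regression function and all parameter values are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(x_query, X_train, y_train, bandwidth):
    """Nadaraya-Watson estimate at x_query using a box kernel."""
    # Weight 1 for training points within `bandwidth` of the query, 0 otherwise.
    weights = (np.abs(X_train - x_query) <= bandwidth).astype(float)
    if weights.sum() == 0:
        return 0.0  # no training point falls in the local window
    return np.dot(weights, y_train) / weights.sum()

rng = np.random.default_rng(0)

# Source design: covariates generated by a simple AR(1) Markov chain (illustrative).
n = 2000
X = np.zeros(n)
for t in range(1, n):
    X[t] = 0.7 * X[t - 1] + rng.normal(scale=0.5)

f = lambda x: np.sin(2.0 * x)              # unknown regression function (illustrative)
y = f(X) + rng.normal(scale=0.1, size=n)   # noisy responses

# Target design: query points drawn from a shifted distribution (covariate shift).
x_targets = rng.normal(loc=1.0, scale=0.3, size=5)
estimates = [nadaraya_watson(x0, X, y, bandwidth=0.2) for x0 in x_targets]
```

In the paper's setting, the accuracy of such local averages under covariate shift is governed by how the invariant distributions of the source and target chains weight bandwidth-sized neighborhoods, which is what the bandwidth-dependent similarity measure quantifies.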
Related papers
- Ai-Sampler: Adversarial Learning of Markov kernels with involutive maps [28.229819253644862]
We propose a method to parameterize and train transition kernels of Markov chains to achieve efficient sampling and good mixing.
This training procedure minimizes the total variation distance between the stationary distribution of the chain and the empirical distribution of the data.
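For orientation (standard background, not taken from the paper; the notation below is ours), the total variation distance targeted by such a training objective is
$$ d_{\mathrm{TV}}\bigl(\pi_\theta, \hat{p}_{\mathrm{data}}\bigr) \;=\; \sup_{A} \bigl|\pi_\theta(A) - \hat{p}_{\mathrm{data}}(A)\bigr|, $$
where $\pi_\theta$ denotes the stationary distribution of the parameterized chain and $\hat{p}_{\mathrm{data}}$ the empirical distribution of the data.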
arXiv Detail & Related papers (2024-06-04T17:00:14Z) - Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning.
We find couplings to improve RFs defined on both Euclidean and discrete input spaces.
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
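As background, here is a minimal sketch of the standard i.i.d. random Fourier feature construction for the Gaussian kernel; the paper's contribution concerns coupling the random frequencies rather than drawing them independently, which this sketch does not implement. Names and parameter values are illustrative.

```python
import numpy as np

def random_fourier_features(X, num_features, lengthscale, rng):
    """Random Fourier features approximating the Gaussian (RBF) kernel."""
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))  # random frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)             # random phases
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Z = random_fourier_features(X, num_features=500, lengthscale=1.0, rng=rng)
K_approx = Z @ Z.T  # approximates exp(-||x - y||^2 / (2 * lengthscale^2)) entrywise
```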
arXiv Detail & Related papers (2024-05-26T12:25:09Z) - Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting [64.0722630873758]
We consider a rather general and broad class of Markov chains, Ito chains, that look like the Euler-Maruyama discretization of some differential equation.
We prove a bound in $W_2$-distance between the laws of our Ito chain and the corresponding differential equation.
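For context, here is a minimal sketch of the Euler-Maruyama scheme that the Ito-chain framework generalizes; the drift and diffusion below are illustrative choices, not taken from the paper.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, step_size, num_steps, rng):
    """Simulate dX_t = drift(X_t) dt + diffusion(X_t) dW_t with the Euler-Maruyama scheme."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(num_steps):
        dw = rng.normal(scale=np.sqrt(step_size), size=x.shape)  # Brownian increment
        x = x + drift(x) * step_size + diffusion(x) * dw
        path.append(x.copy())
    return np.array(path)

# Example: Ornstein-Uhlenbeck-type dynamics (illustrative).
rng = np.random.default_rng(0)
path = euler_maruyama(drift=lambda x: -x, diffusion=lambda x: 0.5,
                      x0=[1.0], step_size=0.01, num_steps=1000, rng=rng)
```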
arXiv Detail & Related papers (2023-10-09T18:38:56Z) - Self-Repellent Random Walks on General Graphs -- Achieving Minimal Sampling Variance via Nonlinear Markov Chains [11.3631620309434]
We consider random walks on discrete state spaces, such as general undirected graphs, where the random walkers are designed to approximate a target quantity over the network topology via sampling and neighborhood exploration.
Given any Markov chain corresponding to a target probability distribution, we design a self-repellent random walk (SRRW) which is less likely to transition to nodes that were highly visited in the past, and more likely to transition to seldom visited nodes.
For a class of SRRWs parameterized by a positive real alpha, we prove that the empirical distribution of the process converges almost surely to the target distribution.
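A rough sketch of the self-repellence mechanism follows (our simplification, not the paper's exact kernel, which also normalizes visit frequencies by the target distribution): transition probabilities out of the current state are down-weighted by a power of the empirical visit frequency, so heavily visited nodes are avoided.

```python
import numpy as np

def self_repellent_step(P, visit_counts, state, alpha, rng):
    """One step of a self-repellent reweighting of a base Markov kernel P (illustrative)."""
    freqs = visit_counts / visit_counts.sum()
    weights = P[state] * np.power(freqs + 1e-12, -alpha)  # penalize frequently visited nodes
    probs = weights / weights.sum()
    next_state = rng.choice(len(probs), p=probs)
    visit_counts[next_state] += 1
    return next_state

# Usage sketch on a 3-node chain (illustrative base kernel).
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
rng = np.random.default_rng(0)
counts = np.ones(3)  # one pseudo-visit per node to avoid zero frequencies
state = 0
for _ in range(1000):
    state = self_repellent_step(P, counts, state, alpha=2.0, rng=rng)
```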
arXiv Detail & Related papers (2023-05-08T23:59:09Z) - Rosenthal-type inequalities for linear statistics of Markov chains [20.606986885851573]
We establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains.
We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain.
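For orientation, the independent-data prototype of such bounds is the classical Rosenthal inequality: for independent, centered random variables $X_1, \dots, X_n$ and $p \ge 2$,
$$ \mathbb{E}\Bigl|\sum_{i=1}^n X_i\Bigr|^p \;\le\; C_p \Bigl(\sum_{i=1}^n \mathbb{E}|X_i|^p + \Bigl(\sum_{i=1}^n \mathbb{E} X_i^2\Bigr)^{p/2}\Bigr). $$
The paper establishes analogues for additive functionals of geometrically ergodic Markov chains, with constants that track the mixing time.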
arXiv Detail & Related papers (2023-03-10T10:24:46Z) - Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z) - On the Kullback-Leibler divergence between pairwise isotropic Gaussian-Markov random fields [93.35534658875731]
We derive expressions for the Kullback-Leibler divergence between two pairwise isotropic Gaussian-Markov random fields.
The proposed equation allows the development of novel similarity measures in image processing and machine learning applications.
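As background (the general Gaussian case, which the paper specializes to pairwise isotropic Gaussian-Markov random fields), the KL divergence between $\mathcal{N}(\mu_0, \Sigma_0)$ and $\mathcal{N}(\mu_1, \Sigma_1)$ on $\mathbb{R}^k$ is
$$ D_{\mathrm{KL}} \;=\; \tfrac{1}{2}\Bigl(\operatorname{tr}\bigl(\Sigma_1^{-1}\Sigma_0\bigr) + (\mu_1 - \mu_0)^\top \Sigma_1^{-1} (\mu_1 - \mu_0) - k + \ln\tfrac{\det \Sigma_1}{\det \Sigma_0}\Bigr). $$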
arXiv Detail & Related papers (2022-03-24T16:37:24Z) - A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize.
From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence that benefits classification.
We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
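For reference, here is a minimal sketch of the marginal MMD that JMMD builds on (an unbiased estimate of the squared MMD with a Gaussian kernel); the kernel choice and bandwidth are illustrative, not the paper's.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased estimate of the squared maximum mean discrepancy between samples X and Y."""
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = rbf_kernel(X, X, gamma), rbf_kernel(Y, Y, gamma), rbf_kernel(X, Y, gamma)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))  # off-diagonal mean over source pairs
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))  # off-diagonal mean over target pairs
    return term_x + term_y - 2.0 * Kxy.mean()
```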
arXiv Detail & Related papers (2021-01-25T09:46:14Z) - Concentration inequality for U-statistics of order two for uniformly ergodic Markov chains [0.0]
We prove a concentration inequality for U-statistics of order two for uniformly ergodic Markov chains.
We show that we can recover the convergence rate of Arcones and Giné, who proved a concentration result for U-statistics of independent random variables and canonical kernels.
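For context, an order-two U-statistic of a sample $X_1, \dots, X_n$ with a symmetric kernel $h$ is
$$ U_n \;=\; \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j), $$
and the concentration result concerns the case where the $X_i$ form a uniformly ergodic Markov chain and $h$ is canonical (degenerate), i.e. $\mathbb{E}[h(x, X)] = 0$ for every fixed $x$.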
arXiv Detail & Related papers (2020-11-20T15:14:34Z) - MCMC-Interactive Variational Inference [56.58416764959414]
We propose MCMC-interactive variational inference (MIVI) to estimate the posterior in a time-constrained manner.
MIVI takes advantage of the complementary properties of variational inference and MCMC to encourage mutual improvement.
Experiments show that MIVI not only accurately approximates the posteriors but also facilitates designs of gradient MCMC and Gibbs sampling transitions.
arXiv Detail & Related papers (2020-10-02T17:43:20Z) - Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence [4.688616907736838]
We show that convergence of a large family of constant stepsize RSAs can be understood using this framework.
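For reference, the standard Wasserstein-2 distance between probability measures $\mu$ and $\nu$ (stated here only as background; the paper works with a related Wasserstein divergence) is
$$ W_2(\mu, \nu) \;=\; \Bigl(\inf_{\gamma \in \Gamma(\mu, \nu)} \int \|x - y\|^2 \, \mathrm{d}\gamma(x, y)\Bigr)^{1/2}, $$
where $\Gamma(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$.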
arXiv Detail & Related papers (2020-03-25T13:45:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.