Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift
- URL: http://arxiv.org/abs/2509.05106v1
- Date: Fri, 05 Sep 2025 13:42:27 GMT
- Title: Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift
- Authors: Ren-Rui Liu, Zheng-Chu Guo
- Abstract summary: Spectral algorithms are a class of regularization methods originating from inverse problems. In this paper, we investigate the convergence properties of spectral algorithms under covariate shift. We provide a theoretical analysis of the more challenging misspecified case, in which the target function does not belong to the reproducing kernel Hilbert space (RKHS).
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the convergence properties of spectral algorithms -- a class of regularization methods originating from inverse problems -- under covariate shift. In this setting, the marginal distributions of inputs differ between source and target domains, while the conditional distribution of outputs given inputs remains unchanged. To address this distributional mismatch, we incorporate importance weights, defined as the ratio of target to source densities, into the learning framework. This leads to a weighted spectral algorithm within a nonparametric regression setting in a reproducing kernel Hilbert space (RKHS). More importantly, in contrast to prior work that largely focuses on the well-specified setting, we provide a comprehensive theoretical analysis of the more challenging misspecified case, in which the target function does not belong to the RKHS. Under the assumption of uniformly bounded density ratios, we establish minimax-optimal convergence rates when the target function lies within the RKHS. For scenarios involving unbounded importance weights, we introduce a novel truncation technique that attains near-optimal convergence rates under mild regularity conditions, and we further extend these results to the misspecified regime. By addressing the intertwined challenges of covariate shift and model misspecification, this work extends classical kernel learning theory to more practical scenarios, providing a systematic framework for understanding their interaction.
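As a concrete illustration of the weighted estimator the abstract describes, here is a minimal sketch using kernel ridge regression (Tikhonov regularization, one member of the spectral-algorithm family) with truncated importance weights. The Gaussian kernel, the synthetic Gaussian source/target pair, and the truncation level `T` are illustrative assumptions rather than choices made in the paper, and the density ratio is taken as known rather than estimated.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth**2))

def weighted_krr_fit(X, y, ratio, lam=1e-2, T=10.0):
    """Kernel ridge regression with truncated importance weights.

    ratio : density ratio p_target(x) / p_source(x) at the training points
            (assumed known here; in practice it must be estimated).
    T     : truncation level for potentially unbounded weights.
    Solves (W K + n * lam * I) alpha = W y, the first-order condition of
    (1/n) * sum_i w_i (f(x_i) - y_i)^2 + lam * ||f||_H^2.
    """
    n = X.shape[0]
    w = np.minimum(ratio, T)              # truncation step
    K = gaussian_kernel(X, X)
    return np.linalg.solve(w[:, None] * K + n * lam * np.eye(n), w * y)

# Toy covariate shift: source N(0,1), target N(1,1); same regression function.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 1))                      # source inputs
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=200)         # noisy outputs
ratio = np.exp(0.5 * (X[:, 0] ** 2 - (X[:, 0] - 1.0) ** 2))  # N(1,1) / N(0,1)
alpha = weighted_krr_fit(X, y, ratio)

X_test = rng.normal(1.0, 1.0, size=(100, 1))                 # target inputs
pred = gaussian_kernel(X_test, X) @ alpha
```

Truncating the weights trades a small bias for control of the variance that unbounded density ratios would otherwise introduce, which is the role the paper's truncation technique plays in its analysis.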
Related papers
- Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs [55.77845440440496]
Push-based decentralized communication enables optimization over communication networks where information exchange may be asymmetric. We develop a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm. A key technical ingredient is an imbalance-aware generalization bound expressed through two quantities.
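For context, a minimal sketch of the Stochastic Gradient Push dynamics this analysis concerns, assuming a fixed column-stochastic mixing matrix over a directed ring and quadratic per-node objectives; all problem data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, eta, steps = 4, 3, 0.05, 200

# Column-stochastic mixing matrix for a directed ring: node i sends to i and i+1.
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[(i + 1) % n, i] = 0.5

targets = rng.normal(size=(n, d))   # node i minimizes ||x - targets[i]||^2 / 2
x = np.zeros((n, d))                # push-sum numerators
phi = np.ones(n)                    # push-sum weights
for _ in range(steps):
    z = x / phi[:, None]            # de-biased local estimates
    grads = z - targets + 0.01 * rng.normal(size=(n, d))  # stochastic gradients
    x = P @ (x - eta * grads)       # push values along out-edges
    phi = P @ phi                   # push weights along out-edges

# All de-biased iterates approach the minimizer of the average objective.
print((x / phi[:, None]).round(3), targets.mean(0).round(3))
```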
arXiv Detail & Related papers (2026-02-24T05:32:03Z) - Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations [57.179679246370114]
We identify the distribution of random perturbations that minimizes the estimator's variance as the perturbation stepsize tends to zero. Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length.
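For reference, the standard two-point estimator this line of work revisits, with isotropic Gaussian perturbations; the directionally aligned variant the paper proposes would change the perturbation distribution, and the quadratic test function here is illustrative.

```python
import numpy as np

def two_point_grad(f, x, mu=1e-4, rng=np.random.default_rng(2)):
    """Two-point zeroth-order gradient estimate:
    g = u * (f(x + mu*u) - f(x - mu*u)) / (2*mu), with u ~ N(0, I).
    Unbiased for the Gaussian-smoothed gradient; as mu -> 0 its variance
    depends on the perturbation distribution, which is what the paper optimizes.
    """
    u = rng.normal(size=x.shape)
    return u * (f(x + mu * u) - f(x - mu * u)) / (2 * mu)

f = lambda x: 0.5 * np.dot(x, x)     # true gradient is x itself
x = np.array([1.0, -2.0, 3.0])
est = np.mean([two_point_grad(f, x) for _ in range(2000)], axis=0)
print(est.round(2), x)               # estimate concentrates near [1, -2, 3]
```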
arXiv Detail & Related papers (2025-10-22T19:06:39Z) - A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives [17.212481754312048]
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. We introduce a novel measure of suboptimality called 'gradient discrepancy', and in particular a 'kernel gradient discrepancy' that can be explicitly computed.
arXiv Detail & Related papers (2025-09-12T16:38:41Z) - Spectral Algorithms under Covariate Shift [4.349399061959293]
Spectral algorithms leverage spectral regularization techniques to analyze and process data. We investigate the convergence behavior of spectral algorithms in situations where the distributions of training and test data may differ. We propose a novel weighted spectral algorithm with normalized weights that incorporates density ratio information into the learning process.
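The normalization step is sketched below as simple self-normalization of the density-ratio weights, which is one common reading of "normalized weights"; the paper's precise construction may differ. This plugs into the weighted kernel ridge sketch given under the main abstract above.

```python
import numpy as np

def normalize_weights(ratio):
    """Self-normalize density-ratio weights so they average to one.

    Normalization keeps the effective regularization comparable across
    shifts and tempers the influence of a few very large ratios.
    """
    w = np.asarray(ratio, dtype=float)
    return w / w.mean()

# e.g. replace `w = np.minimum(ratio, T)` in the earlier sketch with:
# w = normalize_weights(ratio)
```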
arXiv Detail & Related papers (2025-04-17T04:02:06Z) - Stochastic Optimization with Optimal Importance Sampling [49.484190237840714]
We propose an iterative algorithm that jointly updates the decision variable and the IS distribution without requiring time-scale separation between the two. Our method achieves the lowest possible variance and guarantees global convergence under convexity of the objective and mild assumptions on the IS distribution family.
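A toy sketch of the joint update pattern described here, on a one-dimensional quadratic with a Gaussian sampling family q_theta = N(theta, 1); the score-function variance step and all constants are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
x, theta, eta_x, eta_th = 5.0, 0.0, 0.05, 0.01

logp = lambda z: -0.5 * z**2                         # nominal xi ~ N(0, 1)
for _ in range(3000):
    xi = theta + rng.normal()                        # sample from q_theta
    w = np.exp(logp(xi) - (-0.5 * (xi - theta)**2))  # importance weight p / q_theta
    g = w * 2.0 * (x - xi)                           # weighted grad of (x - xi)^2
    x -= eta_x * g                                   # decision update
    # Score-function step descending E_q[(w*g)^2]: its theta-gradient equals
    # -E_q[w^2 * g^2 * d/dtheta log q_theta], with d/dtheta log q = (xi - theta).
    theta += eta_th * (w * 2.0 * (x - xi))**2 * (xi - theta)
    theta = np.clip(theta, -3.0, 3.0)

print(round(x, 3), round(theta, 3))  # x approaches E[xi] = 0
```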
arXiv Detail & Related papers (2025-04-04T16:10:18Z) - Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces [6.992239210938067]
We consider a minimax problem motivated by distributionally robust optimization (DRO) when the worst-case distribution is continuous. Recent research has explored learning the worst-case distribution using neural network-based generative models. This paper addresses this theoretical challenge by presenting an iterative algorithm to solve such a minimax problem.
arXiv Detail & Related papers (2024-12-29T19:31:23Z) - Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning.
We find couplings to improve RFs defined on both Euclidean and discrete input spaces.
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
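As one classical coupling of the kind studied here, a sketch of orthogonal random features for the Gaussian kernel, where rows of the frequency matrix are orthogonalized to reduce the variance of the kernel estimate relative to i.i.d. features; this is an illustrative instance, not necessarily one of the couplings the paper constructs.

```python
import numpy as np

def rff(X, W):
    """Random Fourier feature map for the Gaussian kernel exp(-||a-b||^2 / 2)."""
    m = W.shape[0]
    Z = X @ W.T
    return np.concatenate([np.cos(Z), np.sin(Z)], axis=1) / np.sqrt(m)

def orthogonal_frequencies(m, d, rng):
    """Couple Gaussian frequencies: orthogonal directions, chi-distributed norms."""
    assert m <= d, "one orthogonal block shown; stack blocks for m > d"
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # rows are orthonormal
    norms = np.sqrt(rng.chisquare(d, size=m))     # match ||w|| of N(0, I_d) draws
    return Q[:m] * norms[:, None]

rng = np.random.default_rng(4)
d, m = 8, 8
X = rng.normal(size=(50, d))
K_true = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / 2)

for W, name in [(rng.normal(size=(m, d)), "iid"),
                (orthogonal_frequencies(m, d, rng), "orthogonal")]:
    F = rff(X, W)
    print(name, np.abs(F @ F.T - K_true).mean())  # orthogonal is typically smaller
```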
arXiv Detail & Related papers (2024-05-26T12:25:09Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
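A minimal annealed importance sampling loop on a one-dimensional toy target, with geometric tempering and a random-walk Metropolis transition per temperature; the target, schedule, and kernel are illustrative, and the constant-rate schedule this paper proposes would replace the uniform `betas` below.

```python
import numpy as np

rng = np.random.default_rng(5)
log_p0 = lambda x: -0.5 * x**2                  # proposal: N(0, 1), easy to sample
log_p1 = lambda x: -0.5 * (x - 3.0)**2 / 0.25   # target: N(3, 0.5^2), unnormalized

n, K = 2000, 50
betas = np.linspace(0.0, 1.0, K + 1)            # annealing schedule
x = rng.normal(size=n)                          # exact draws from p0
logw = np.zeros(n)
for k in range(1, K + 1):
    log_pk = lambda z, b=betas[k]: (1 - b) * log_p0(z) + b * log_p1(z)
    logw += log_pk(x) - ((1 - betas[k - 1]) * log_p0(x)
                         + betas[k - 1] * log_p1(x))
    # One Metropolis step targeting the current intermediate distribution.
    prop = x + 0.5 * rng.normal(size=n)
    accept = np.log(rng.uniform(size=n)) < log_pk(prop) - log_pk(x)
    x = np.where(accept, prop, x)

w = np.exp(logw - logw.max())                   # self-normalized weights
print((w * x).sum() / w.sum())                  # weighted mean approaches 3.0
```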
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Distributed Stochastic Optimization under a General Variance Condition [13.911633636387059]
Distributed optimization has drawn great attention recently due to its effectiveness in solving large-scale machine learning problems.
We revisit the classical Federated Averaging (FedAvg) algorithm and establish convergence results under only a mild variance condition for smooth nonconvex objective functions.
Almost sure convergence to a stationary point is also established under a gradient condition.
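A minimal FedAvg round structure matching the description above, on quadratic client objectives; the client data, step sizes, and counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n_clients, d, local_steps, eta, rounds = 10, 5, 5, 0.1, 50
targets = rng.normal(size=(n_clients, d))  # client i minimizes ||x - targets[i]||^2 / 2

x_global = np.zeros(d)
for _ in range(rounds):
    client_iterates = []
    for i in range(n_clients):
        x = x_global.copy()
        for _ in range(local_steps):       # local SGD with noisy gradients
            g = x - targets[i] + 0.1 * rng.normal(size=d)
            x -= eta * g
        client_iterates.append(x)
    x_global = np.mean(client_iterates, axis=0)  # server averages client iterates

print(np.linalg.norm(x_global - targets.mean(0)).round(3))  # near the global optimum
```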
arXiv Detail & Related papers (2023-01-30T05:48:09Z) - Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends this framework to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of stochastic gradient descent-ascent (SGDA) and stochastic consensus optimization (SCO).
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
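The constant step-size SGDA update analysed here, sketched on a strongly convex-strongly concave toy game; the game and noise model are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
# min_x max_y f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, with saddle point at (0, 0).
x, y, eta = 3.0, -2.0, 0.05
for _ in range(2000):
    gx = x + y + 0.1 * rng.normal()   # stochastic gradient in x
    gy = x - y + 0.1 * rng.normal()   # stochastic gradient in y
    x -= eta * gx                     # descent step on x
    y += eta * gy                     # ascent step on y

print(round(x, 2), round(y, 2))       # hovers in a noise ball around (0, 0)
```

With a constant step size the iterates converge linearly to a neighborhood of the saddle point rather than to the point itself, matching the guarantee stated above.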
arXiv Detail & Related papers (2021-06-30T18:32:46Z)