Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition
- URL: http://arxiv.org/abs/2502.09047v1
- Date: Thu, 13 Feb 2025 08:02:15 GMT
- Title: Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition
- Authors: Yuanshi Liu, Haihan Zhang, Qian Chen, Cong Fang,
- Abstract summary: A common pursuit in modern statistical learning is to attain satisfactory generalization out of the source data distribution.
This paper studies the foundational (high-dimensional) linear regression where the ground truth variables are confined to an ellipse-shape constraint.
- Score: 7.300356389675634
- License:
- Abstract: A common pursuit in modern statistical learning is to attain satisfactory generalization out of the source data distribution (OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model. This paper studies the foundational (high-dimensional) linear regression where the ground truth variables are confined to an ellipse-shape constraint and addresses two fundamental questions in this regime: (i) given the target covariate matrix, what is the min-max \emph{optimal} algorithm under covariate shift? (ii) for what kinds of target classes, the commonly-used SGD-type algorithms achieve optimality? Our analysis starts with establishing a tight lower generalization bound via a Bayesian Cramer-Rao inequality. For (i), we prove that the optimal estimator can be simply a certain linear transformation of the best estimator for the source distribution. Given the source and target matrices, we show that the transformation can be efficiently computed via a convex program. The min-max optimal analysis for SGD leverages the idea that we recognize both the accumulated updates of the applied algorithms and the ideal transformation as preconditions on the learning variables. We provide sufficient conditions when SGD with its acceleration variants attain optimality.
Related papers
- Maximum Likelihood Estimation is All You Need for Well-Specified
Covariate Shift [34.414261291690856]
Key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization.
We show that classical Maximum Likelihood Estimation (MLE) purely using source data achieves the minimax optimality.
We illustrate the wide applicability of our framework by instantiating it to three concrete examples.
arXiv Detail & Related papers (2023-11-27T16:06:48Z) - Generative Models for Anomaly Detection and Design-Space Dimensionality
Reduction in Shape Optimization [0.0]
Our work presents a novel approach to shape optimization, with the twofold objective to improve the efficiency of global algorithms and to promote the generation of high-quality designs.
This is accomplished by reducing the number of the original design variables defining a new reduced subspace where the geometrical variance is maximized.
From the numerical results, the new framework improves the convergence of global optimization algorithms, while only designs with high-quality geometrical features are generated.
arXiv Detail & Related papers (2023-08-08T04:57:58Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonIBS optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
gradient Langevin Dynamics is one of the most fundamental algorithms to solve non-eps optimization problems.
In this paper, we show two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors.
A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z) - Mixed-Projection Conic Optimization: A New Paradigm for Modeling Rank
Constraints [3.179831861897336]
We provide a framework for solving low-rank optimization problems to certifiable optimality.
Our framework also provides near-optimal solutions through rounding and local search techniques.
arXiv Detail & Related papers (2020-09-22T08:59:06Z) - Bayesian Sparse learning with preconditioned stochastic gradient MCMC
and its applications [5.660384137948734]
The proposed algorithm converges to the correct distribution with a controllable bias under mild conditions.
We show that the proposed algorithm canally converge to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems.
Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.