A Regression Approach to Learning-Augmented Online Algorithms
- URL: http://arxiv.org/abs/2205.08717v1
- Date: Wed, 18 May 2022 04:29:14 GMT
- Title: A Regression Approach to Learning-Augmented Online Algorithms
- Authors: Keerti Anand, Rong Ge, Amit Kumar, Debmalya Panigrahi
- Abstract summary: We introduce this approach in this paper, and explore it in the context of a general online search framework.
We show nearly tight bounds on the sample complexity of this regression problem, and extend our results to the agnostic setting.
From a technical standpoint, we show that the key is to incorporate online optimization benchmarks in the design of the loss function for the regression problem.
- Score: 17.803569868141647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emerging field of learning-augmented online algorithms uses ML techniques
to predict future input parameters and thereby improve the performance of
online algorithms. Since these parameters are, in general, real-valued
functions, a natural approach is to use regression techniques to make these
predictions. We introduce this approach in this paper, and explore it in the
context of a general online search framework that captures classic problems
like (generalized) ski rental, bin packing, minimum makespan scheduling, etc.
We show nearly tight bounds on the sample complexity of this regression
problem, and extend our results to the agnostic setting. From a technical
standpoint, we show that the key is to incorporate online optimization
benchmarks in the design of the loss function for the regression problem,
thereby diverging from the use of off-the-shelf regression tools with standard
bounds on statistical error.
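The framework instantiates to classic problems like ski rental, where the benchmark-aware loss is easy to illustrate. The sketch below is a simplification, not the paper's actual loss construction: it scores a prediction by the competitive ratio of the online algorithm that trusts it, rather than by squared error.

```python
def ski_rental_loss(y_hat: float, y: float, b: float) -> float:
    """Benchmark-aware loss for ski rental with buy price b.

    The online algorithm trusts the prediction y_hat of the true
    number of ski days y: it buys on day 1 if y_hat >= b, else rents
    all season. The loss is ALG / OPT, the competitive ratio the
    prediction induces, instead of the squared error (y_hat - y)**2.
    """
    alg = b if y_hat >= b else y   # cost of trusting the prediction
    opt = min(y, b)                # offline optimal cost
    return alg / opt

# Two predictions with nearly identical squared error can induce
# wildly different online costs (buy price b = 10, true y = 100):
print(ski_rental_loss(y_hat=9,  y=100, b=10))   # rents all season -> 10.0
print(ski_rental_loss(y_hat=11, y=100, b=10))   # buys on day 1    -> 1.0
```

The two predictions above have nearly the same statistical error, yet one costs 10x more online, which is why an off-the-shelf regression loss is the wrong training target here.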
Related papers
- Mutual Information Learned Regressor: an Information-theoretic Viewpoint of Training Regression Systems [10.314518385506007]
A common practice for solving regression problems is mean squared error (MSE) minimization.
Recently, Yi et al. proposed a mutual information based supervised learning framework in which they introduced a label entropy regularization.
In this paper, we investigate regression under the mutual information based supervised learning framework.
arXiv Detail & Related papers (2022-11-23T03:43:22Z)
- Vector-Valued Least-Squares Regression under Output Regularity Assumptions [73.99064151691597]
We propose and analyse a reduced-rank method for solving least-squares regression problems with infinite dimensional output.
We derive learning bounds for our method, and study the settings in which its statistical performance improves over the full-rank method.
arXiv Detail & Related papers (2022-11-16T15:07:00Z)
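For context on the reduced-rank idea, here is a finite-dimensional sketch of classical reduced-rank least squares, which truncates the OLS fit to its leading singular directions; the paper's estimator handles infinite-dimensional outputs, so treat this only as the basic mechanism (all names are illustrative).

```python
import numpy as np

def reduced_rank_ls(X: np.ndarray, Y: np.ndarray, r: int) -> np.ndarray:
    """Classical reduced-rank least squares (finite-dimensional sketch).

    Fit ordinary least squares, then project the coefficient matrix
    onto the top-r right singular directions of the fitted outputs.
    """
    B_ols = np.linalg.pinv(X) @ Y                 # full-rank OLS solution
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    P = Vt[:r].T @ Vt[:r]                         # rank-r output projector
    return B_ols @ P

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 50))  # rank-3 signal
Y = X @ B_true + 0.1 * rng.normal(size=(200, 50))
print(np.linalg.matrix_rank(reduced_rank_ls(X, Y, r=3)))      # 3
```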
- A Novel Plug-and-Play Approach for Adversarially Robust Generalization [38.72514422694518]
We propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data.
Our contributions can be seen from both computational and statistical perspectives.
arXiv Detail & Related papers (2022-08-19T17:02:55Z)
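The entry above describes adversarially robust training in general terms. The sketch below shows the generic FGSM-style recipe of training on worst-case-perturbed inputs, for a linear regressor; it is not the paper's plug-and-play framework, just the standard recipe for orientation.

```python
import numpy as np

def fgsm_adversarial_train(X, y, eps=0.1, lr=0.01, epochs=200):
    """Generic FGSM-style robust training for linear regression (sketch).

    Each step perturbs the inputs in the direction that increases the
    loss (sign of the input gradient), then descends on the perturbed
    loss -- the standard adversarial-training recipe.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1])
    for _ in range(epochs):
        resid = X @ w - y
        # gradient of the squared loss w.r.t. each input x_i: 2 * resid_i * w
        X_adv = X + eps * np.sign(np.outer(resid, w))
        resid_adv = X_adv @ w - y
        w -= lr * (X_adv.T @ resid_adv) / len(y)  # descend on perturbed loss
    return w
```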
- Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z)
- Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge [24.880035784304834]
We derive high probability regret bounds for online ridge regression and the forward algorithm.
This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions.
arXiv Detail & Related papers (2021-11-02T13:57:53Z)
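The two forecasters compared above differ in a single term: online ridge inverts the regularized covariance of past features only, while the forward (Vovk-Azoury-Warmuth) algorithm also folds the current feature into the covariance before predicting. A minimal sketch under that standard formulation:

```python
import numpy as np

def online_predictions(X, y, lam=1.0, forward=True):
    """Online ridge vs. the forward (Vovk-Azoury-Warmuth) forecaster.

    Both predict y_t from past data via a regularized least-squares
    solution; the forward algorithm additionally includes the current
    feature x_t in the covariance before predicting.
    """
    n, d = X.shape
    A = lam * np.eye(d)        # regularized covariance of seen features
    b = np.zeros(d)            # sum of y_s * x_s over past rounds
    preds = []
    for t in range(n):
        x = X[t]
        A_pred = A + np.outer(x, x) if forward else A
        preds.append(x @ np.linalg.solve(A_pred, b))
        A += np.outer(x, x)    # then y_t is revealed: update the state
        b += y[t] * x
    return np.array(preds)
```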
- Logistic Q-Learning [87.00813469969167]
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs.
The main feature of our algorithm is a convex loss function for policy evaluation that serves as a theoretically sound alternative to the widely used squared Bellman error.
arXiv Detail & Related papers (2020-10-21T17:14:31Z)
- A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new algorithm for linear regression that runs in linear up to quadratic time and requires no strong assumptions on the underlying distributions of the samples.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z)
- Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression method for high-dimensional sparse learning.
Proximal gradient methods are the standard approach to solving OWL regression.
We propose the first safe screening rule for OWL regression, obtained by exploring the order of the primal solution under an unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
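For reference, OWL regression penalizes the sorted absolute coefficients with a non-increasing weight vector, and OSCAR is the special case of linearly decaying weights. A small sketch of the penalty itself (the safe screening rules are the paper's contribution and are not reproduced here):

```python
import numpy as np

def owl_penalty(beta: np.ndarray, lam: np.ndarray) -> float:
    """Ordered Weighted L1 penalty: sum_i lam_i * |beta|_(i),
    where |beta|_(1) >= |beta|_(2) >= ... are the absolute coefficients
    sorted in decreasing order and lam is non-increasing, non-negative."""
    abs_sorted = np.sort(np.abs(beta))[::-1]
    return float(lam @ abs_sorted)

def oscar_weights(p: int, lam1: float, lam2: float) -> np.ndarray:
    """OSCAR is OWL with linearly decaying weights lam1 + lam2 * (p - i)."""
    return lam1 + lam2 * np.arange(p - 1, -1, -1)

beta = np.array([0.5, -2.0, 0.0, 1.0])
print(owl_penalty(beta, oscar_weights(4, lam1=0.1, lam2=0.05)))  # 0.775
```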
- Optimizing for the Future in Non-Stationary MDPs [52.373873622008944]
We present a policy gradient algorithm that maximizes a forecast of future performance.
We show that our algorithm, called Prognosticator, is more robust to non-stationarity than two online adaptation techniques.
arXiv Detail & Related papers (2020-05-17T03:41:19Z)
- Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles [112.89548995091182]
We provide the first universal and optimal reduction from contextual bandits to online regression.
Our algorithm requires no distributional assumptions beyond realizability, and works even when contexts are chosen adversarially.
arXiv Detail & Related papers (2020-02-12T11:33:46Z)
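The reduction above selects actions by inverse gap weighting: each action is played with probability inversely related to how much worse the regression oracle predicts it to be than the greedy choice. A sketch of that rule, assuming an oracle that predicts losses (function names are illustrative):

```python
import numpy as np

def inverse_gap_weighting(pred_losses: np.ndarray, gamma: float) -> np.ndarray:
    """Inverse-gap-weighted distribution over K actions (SquareCB-style).

    Given oracle predictions of each action's loss, play action a with
    probability 1 / (K + gamma * gap(a)), where gap(a) is the predicted
    regret against the greedy action, which receives the leftover mass.
    """
    K = len(pred_losses)
    best = int(np.argmin(pred_losses))
    p = 1.0 / (K + gamma * (pred_losses - pred_losses[best]))
    p[best] = 0.0
    p[best] = 1.0 - p.sum()          # remaining mass on the greedy action
    return p

probs = inverse_gap_weighting(np.array([0.2, 0.5, 0.9]), gamma=10.0)
print(probs, probs.sum())            # a valid distribution favoring action 0
```

Larger gamma concentrates mass on the greedy action; tuning it against the oracle's regression error is what yields the optimal reduction.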
- Adaptive Approximate Policy Iteration [22.915651391812187]
We present a learning scheme which enjoys a $\tilde{O}(T^{2/3})$ regret bound for undiscounted, continuing learning in uniformly ergodic MDPs.
This is an improvement over the best existing bound of $\tilde{O}(T^{3/4})$ for the average-reward case with function approximation.
arXiv Detail & Related papers (2020-02-08T02:27:03Z)