Online Learning via Offline Greedy Algorithms: Applications in Market
Design and Optimization
- URL: http://arxiv.org/abs/2102.11050v1
- Date: Thu, 18 Feb 2021 19:05:26 GMT
- Title: Online Learning via Offline Greedy Algorithms: Applications in Market
Design and Optimization
- Authors: Rad Niazadeh (1), Negin Golrezaei (2), Joshua Wang (3), Fransisca
Susan (2), Ashwinkumar Badanidiyuru (3) ((1) Chicago Booth School of
Business, Operations Management, (2) MIT Sloan School of Management,
Operations Management, (3) Google Research Mountain View)
- Abstract summary: We focus on offline problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors.
For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones.
We show that the resulting online algorithms have $O(sqrtT)$ (approximate) regret under the full information setting.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by online decision-making in time-varying combinatorial
environments, we study the problem of transforming offline algorithms to their
online counterparts. We focus on offline combinatorial problems that are
amenable to a constant factor approximation using a greedy algorithm that is
robust to local errors. For such problems, we provide a general framework that
efficiently transforms offline robust greedy algorithms to online ones using
Blackwell approachability. We show that the resulting online algorithms have
$O(\sqrt{T})$ (approximate) regret under the full information setting. We
further introduce a bandit extension of Blackwell approachability that we call
Bandit Blackwell approachability. We leverage this notion to transform greedy
robust offline algorithms into a $O(T^{2/3})$ (approximate) regret in the
bandit setting. Demonstrating the flexibility of our framework, we apply our
offline-to-online transformation to several problems at the intersection of
revenue management, market design, and online optimization, including product
ranking optimization in online platforms, reserve price optimization in
auctions, and submodular maximization. We show that our transformation, when
applied to these applications, leads to new regret bounds or improves the
current known bounds.
Related papers
- Online Conversion with Switching Costs: Robust and Learning-Augmented Algorithms [11.029788598491077]
We study online conversion with switching costs, a family of online problems that capture emerging problems at the intersection of energy and sustainability.
We introduce competitive (robust) threshold-based algorithms for both the deterministic and deterministic variants of this problem.
We then propose learning-augmented algorithms that take advantage of black-box advice to achieve significantly better average-case performance.
arXiv Detail & Related papers (2023-10-31T16:34:49Z) - Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $mathcalO(log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - A Batch-to-Online Transformation under Random-Order Model [23.817140289575377]
We introduce a transformation framework that can be utilized to develop online algorithms with low $epsilon$-approximate regret.
We apply it to various problems, including online $(k,z)$-clustering, online matrix approximation, and online regression.
Our algorithm also enjoys low inconsistency, which may be desired in some online applications.
arXiv Detail & Related papers (2023-06-12T14:50:21Z) - Online and Streaming Algorithms for Constrained $k$-Submodular
Maximization [25.15934533974176]
Constrained $k$-submodular is a general framework that captures many discrete optimization problems such as ad allocation, influence, personalized recommendation, and many others.
In this work, we develop single-pass streaming and online algorithms for $k$-submodular constrained with both monotone and general (possibly non-monotone) objectives.
arXiv Detail & Related papers (2023-05-25T12:53:17Z) - Learning-Assisted Algorithm Unrolling for Online Optimization with
Budget Constraints [27.84415856657607]
We propose a new machine learning (ML) assisted unrolling approach, called LAAU (Learning-Assisted Algorithm Unrolling)
For efficient training via backpropagation, we derive gradients of the decision pipeline over time.
We also provide the average cost bounds for two cases when training data is available offline and collected online, respectively.
arXiv Detail & Related papers (2022-12-03T20:56:29Z) - Optimal Parameter-free Online Learning with Switching Cost [47.415099037249085]
-freeness in online learning refers to the adaptivity of an algorithm with respect to the optimal decision in hindsight.
In this paper, we design such algorithms in the presence of switching cost - the latter penalizes the optimistic updates required by parameter-freeness.
We propose a simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate.
arXiv Detail & Related papers (2022-05-13T18:44:27Z) - Safe Online Bid Optimization with Return-On-Investment and Budget
Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z) - Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z) - Boosting for Online Convex Optimization [64.15578413206715]
We consider the decision-making framework of online convex optimization with a large number of experts.
We define a weak learning algorithm as a mechanism that guarantees approximate regret against a base class of experts.
We give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class.
arXiv Detail & Related papers (2021-02-18T12:30:49Z) - Online Markov Decision Processes with Aggregate Bandit Feedback [74.85532145498742]
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics.
In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy chosen for the episode, and observes aggregate bandit feedback.
Our main result is a computationally efficient algorithm with $O(sqrtK)$ regret for this setting, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-31T16:49:07Z) - Online Optimization with Memory and Competitive Control [43.974996526016305]
This paper presents competitive algorithms for a class of online optimization problems with memory.
We show a connection between online optimization with memory and online control with adversarial disturbances.
arXiv Detail & Related papers (2020-02-13T02:58:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.