Online Stochastic Optimization with Wasserstein Based Non-stationarity
- URL: http://arxiv.org/abs/2012.06961v2
- Date: Wed, 23 Dec 2020 06:24:13 GMT
- Title: Online Stochastic Optimization with Wasserstein Based Non-stationarity
- Authors: Jiashuo Jiang, Xiaocheng Li, Jiawei Zhang
- Abstract summary: We consider a general online optimization problem with multiple budget constraints over a horizon of finite time periods.
The objective of the decision maker is to maximize the cumulative reward subject to the budget constraints.
This formulation captures a wide range of applications including online linear programming and network revenue management.
- Score: 12.91020811577007
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a general online stochastic optimization problem with multiple
budget constraints over a horizon of finite time periods. In each time period,
a reward function and multiple cost functions are revealed, and the decision
maker needs to specify an action from a convex and compact action set to
collect the reward and consume the budget. Each cost function corresponds to
the consumption of one budget. In each period, the reward and cost functions
are drawn from an unknown distribution, which is non-stationary across time.
The objective of the decision maker is to maximize the cumulative reward
subject to the budget constraints. This formulation captures a wide range of
applications including online linear programming and network revenue
management, among others. In this paper, we consider two settings: (i) a
data-driven setting where the true distribution is unknown but a prior estimate
(possibly inaccurate) is available; (ii) an uninformative setting where the
true distribution is completely unknown. We propose a unified
Wasserstein-distance based measure to quantify the inaccuracy of the prior
estimate in setting (i) and the non-stationarity of the system in setting (ii).
We show that the proposed measure leads to a necessary and sufficient condition
for the attainability of a sublinear regret in both settings. For setting (i),
we propose a new algorithm, which takes a primal-dual perspective and
integrates the prior information of the underlying distributions into an online
gradient descent procedure in the dual space. The algorithm also naturally
extends to the uninformative setting (ii). Under both settings, we show the
corresponding algorithm achieves a regret of optimal order. In numerical
experiments, we demonstrate how the proposed algorithms can be naturally
integrated with the re-solving technique to further boost the empirical
performance.
Related papers
- Minimax and Communication-Efficient Distributed Best Subset Selection with Oracle Property [0.358439716487063]
The explosion of large-scale data has outstripped the processing capabilities of single-machine systems.
Traditional approaches to distributed inference often struggle with achieving true sparsity in high-dimensional datasets.
We propose a novel, two-stage, distributed best subset selection algorithm to address these issues.
arXiv Detail & Related papers (2024-08-30T13:22:08Z) - Online Fair Allocation of Perishable Resources [1.4952056744888913]
We consider a practically motivated variant of the canonical online fair allocation problem.
A decision-maker has a budget of perishable resources to allocate over a fixed number of rounds.
The goal is to construct a sequence of allocations that is envy-free and efficient.
arXiv Detail & Related papers (2024-06-04T15:14:10Z) - Constrained Online Two-stage Stochastic Optimization: Algorithm with
(and without) Predictions [19.537289123577022]
We consider an online two-stage optimization with long-term constraints over a finite horizon of $T$ periods.
We develop online algorithms for the online two-stage problem from adversarial learning algorithms.
arXiv Detail & Related papers (2024-01-02T07:46:33Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision
Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $tilde O(dsqrtH3K)$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z) - Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time
Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z) - Online Allocation with Two-sided Resource Constraints [44.5635910908944]
We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially.
We propose a new algorithm that obtains $1-O(fracepsilonalpha-epsilon)$ -competitive ratio for the offline problems that know the entire requests ahead of time.
arXiv Detail & Related papers (2021-12-28T02:21:06Z) - On Reward-Free RL with Kernel and Neural Function Approximations:
Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - The Best of Many Worlds: Dual Mirror Descent for Online Allocation
Problems [7.433931244705934]
We consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model unknown to the decision maker.
We design general class of algorithms that attain good performance in various input models without knowing which type of input they are facing.
Our algorithms operate in the Lagrangian dual space: they maintain a dual multiplier for each resource that is updated using online mirror descent.
arXiv Detail & Related papers (2020-11-18T18:39:17Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.