Related papers: Single-Sample and Robust Online Resource Allocation

Single-Sample and Robust Online Resource Allocation

URL: http://arxiv.org/abs/2505.02963v1
Date: Mon, 05 May 2025 18:48:11 GMT
Title: Single-Sample and Robust Online Resource Allocation
Authors: Rohan Ghuge, Sahil Singla, Yifan Wang,
Abstract summary: We present a novel Exponential Pricing algorithm for online resource allocation.<n>It is robust to corruptions in the outliers model and the value augmentation model.<n>It operates from conventional approaches that use an online learning algorithm for item pricing.
Score: 14.956072215503797
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Online Resource Allocation problem is a central problem in many areas of Computer Science, Operations Research, and Economics. In this problem, we sequentially receive $n$ stochastic requests for $m$ kinds of shared resources, where each request can be satisfied in multiple ways, consuming different amounts of resources and generating different values. The goal is to achieve a $(1-\epsilon)$-approximation to the hindsight optimum, where $\epsilon>0$ is a small constant, assuming each resource has a large budget. In this paper, we investigate the learnability and robustness of online resource allocation. Our primary contribution is a novel Exponential Pricing algorithm with the following properties: 1. It requires only a \emph{single sample} from each of the $n$ request distributions to achieve a $(1-\epsilon)$-approximation for online resource allocation with large budgets. Such an algorithm was previously unknown, even with access to polynomially many samples, as prior work either assumed full distributional knowledge or was limited to i.i.d.\,or random-order arrivals. 2. It is robust to corruptions in the outliers model and the value augmentation model. Specifically, it maintains its $(1 - \epsilon)$-approximation guarantee under both these robustness models, resolving the open question posed in Argue, Gupta, Molinaro, and Singla (SODA'22). 3. It operates as a simple item-pricing algorithm that ensures incentive compatibility. The intuition behind our Exponential Pricing algorithm is that the price of a resource should adjust exponentially as it is overused or underused. It differs from conventional approaches that use an online learning algorithm for item pricing. This departure guarantees that the algorithm will never run out of any resource, but loses the usual no-regret properties of online learning algorithms, necessitating a new analytical approach.

Related papers

Online Learning with Probing for Sequential User-Centric Selection [8.45399786458738]
We present a probing-augmented user-centric selection (PUCS) framework, where a learner first probes a subset of arms to obtain side information on resources and rewards, and then assigns $K$ plays to $M$ arms.<n>For the offline setting with known distributions, we present a greedy probing algorithm with a constant-factor approximation guarantee $zeta = (e-1)/ (2e-1)$.<n>For the online setting with unknown distributions, we introduce OLPA, a bandit algorithm that a regret a bound $mathcalO(sqrtT +
arXiv Detail & Related papers (2025-07-27T03:32:51Z)
Demand Balancing in Primal-Dual Optimization for Blind Network Revenue Management [6.72809363581332]
This paper proposes a practically efficient algorithm with optimal theoretical regret which solves the classical network revenue management problem with unknown, nonparametric demand. A key technical contribution is the so-called demand balancing, which pairs the primal solution (i.e., the price) in each time period with another price to offset the violation of slackness on resource inventory constraints.
arXiv Detail & Related papers (2024-04-06T01:39:51Z)
Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards [40.99322897009357]
We propose two novel algorithms based on truncation and mean of medians. Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches. Our algorithms improve the regret bounds by a logarithmic factor compared to existing algorithms when $epsilon=1$.
arXiv Detail & Related papers (2023-10-28T13:01:10Z)
Online Resource Allocation in Episodic Markov Decision Processes [1.8416014644193066]
We formulate the problem as an online allocation problem in an episodic finite-horizon constrained Markov decision process. We propose the observe-then-decide regime and improve the existing decide-then-observe regime. We show that the regret against the static optimal policy that has access to the mean reward and mean resource consumption functions is bounded by $tilde O(rho-1H3/2SsqrtAT)$ with high probability.
arXiv Detail & Related papers (2023-05-18T06:28:34Z)
Online Learning with Adversaries: A Differential-Inclusion Analysis [52.43460995467893]
We introduce an observation-matrix-based framework for fully asynchronous online Federated Learning with adversaries. Our main result is that the proposed algorithm almost surely converges to the desired mean $mu.$ We derive this convergence using a novel differential-inclusion-based two-timescale analysis.
arXiv Detail & Related papers (2023-04-04T04:32:29Z)
Quantum Resources Required to Block-Encode a Matrix of Classical Data [56.508135743727934]
We provide circuit-level implementations and resource estimates for several methods of block-encoding a dense $Ntimes N$ matrix of classical data to precision $epsilon$. We examine resource tradeoffs between the different approaches and explore implementations of two separate models of quantum random access memory (QRAM) Our results go beyond simple query complexity and provide a clear picture into the resource costs when large amounts of classical data are assumed to be accessible to quantum algorithms.
arXiv Detail & Related papers (2022-06-07T18:00:01Z)
Online Allocation with Two-sided Resource Constraints [44.5635910908944]
We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially. We propose a new algorithm that obtains $1-O(fracepsilonalpha-epsilon)$ -competitive ratio for the offline problems that know the entire requests ahead of time.
arXiv Detail & Related papers (2021-12-28T02:21:06Z)
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function. We tackle this problem under the context of function approximation, leveraging powerful function approximators. We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
Online Apprenticeship Learning [58.45089581278177]
In Apprenticeship Learning (AL), we are given a Markov Decision Process (MDP) without access to the cost function. The goal is to find a policy that matches the expert's performance on some predefined set of cost functions. We show that the OAL problem can be effectively solved by combining two mirror descent based no-regret algorithms.
arXiv Detail & Related papers (2021-02-13T12:57:51Z)
Empirical Risk Minimization in the Non-interactive Local Model of Differential Privacy [26.69391745812235]
We study the Empirical Risk Minimization (ERM) problem in the noninteractive Local Differential Privacy (LDP) model. Previous research indicates that the sample complexity, to achieve error $alpha, needs to be depending on dimensionality $p$ for general loss functions.
arXiv Detail & Related papers (2020-11-11T17:48:00Z)
A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise [55.45544638410858]
We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise. Our algorithm learns the true halfspace within any accuracy $epsilon$.
arXiv Detail & Related papers (2020-10-04T22:19:06Z)
Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise. We give an algorithm that achieves misclassification error $epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution. We show that the constraint of local differential privacy incurs an exponential increase in cost. Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.