Multiple Mean-Payoff Optimization under Local Stability Constraints
- URL: http://arxiv.org/abs/2412.13369v1
- Date: Tue, 17 Dec 2024 22:53:08 GMT
- Title: Multiple Mean-Payoff Optimization under Local Stability Constraints
- Authors: David Klaška, Antonín Kučera, Vojtěch Kůr, Vít Musil, Vojtěch Řehák
- Abstract summary: The long-run average payoff per transition (mean payoff) is the main tool for specifying the performance and dependability properties of discrete systems.
The problem of constructing a controller (strategy) simultaneously optimizing several mean payoffs has been deeply studied for stochastic and game-theoretic models.
In this paper, we design and evaluate the first efficient and scalable solution to this problem applicable to Markov decision processes.
- Score: 2.95348334737984
- Abstract: The long-run average payoff per transition (mean payoff) is the main tool for specifying the performance and dependability properties of discrete systems. The problem of constructing a controller (strategy) simultaneously optimizing several mean payoffs has been deeply studied for stochastic and game-theoretic models. One common issue of the constructed controllers is the instability of the mean payoffs, measured by the deviations of the average rewards per transition computed in a finite "window" sliding along a run. Unfortunately, the problem of simultaneously optimizing the mean payoffs under local stability constraints is computationally hard, and the existing works do not provide a practically usable algorithm even for non-stochastic models such as two-player games. In this paper, we design and evaluate the first efficient and scalable solution to this problem applicable to Markov decision processes.
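The abstract's notions of mean payoff and local stability can be made concrete on a finite run. The sketch below is an illustrative reading, not the paper's algorithm: the function names and the choice of measuring instability as the largest deviation of a sliding-window average from the run's mean payoff are assumptions for exposition.

```python
import numpy as np

def mean_payoff(rewards):
    """Average payoff per transition, approximated on a finite run."""
    return float(np.mean(rewards))

def window_averages(rewards, window):
    """Average reward in each length-`window` segment sliding along the run."""
    r = np.asarray(rewards, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(r, kernel, mode="valid")

def max_window_deviation(rewards, window):
    """Local instability: largest deviation of a window average
    from the run's overall mean payoff."""
    mp = mean_payoff(rewards)
    return float(np.max(np.abs(window_averages(rewards, window) - mp)))

# An alternating run has mean payoff 0.5 but is locally unstable
# for window size 1 (each window sees either 0 or 1).
run = [0.0, 1.0] * 50
```

The example illustrates why optimizing the mean payoff alone is insufficient: two runs with the same mean payoff can differ sharply in how far their windowed averages stray from it.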
Related papers
- DCatalyst: A Unified Accelerated Framework for Decentralized Optimization [10.925931212031692]
We study decentralized optimization over a network of agents, modeled as graphs, with no central server.
We introduce DCatalyst, a unified black-box framework that integrates Nesterov acceleration into decentralized optimization algorithms.
arXiv Detail & Related papers (2025-01-30T03:32:59Z) - Bayesian Optimization for Non-Convex Two-Stage Stochastic Optimization Problems [2.9016548477524156]
We formulate a knowledge-gradient-based acquisition function to jointly optimize the first variables, establish a guarantee of consistency, and provide an approximation.
We demonstrate empirical results comparable to an alternative formulation that alternates between the two variable types.
arXiv Detail & Related papers (2024-08-30T16:26:31Z) - Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization [9.449153668916098]
We study the smoothed online quadratic optimization (SOQO) problem where, at each round $t$, a player plays an action $x_t$ in response to a quadratic hitting cost and an additional squared $\ell_2$-norm cost for switching actions.
This problem class has strong connections to a wide range of application domains including smart grid management, adaptive control, and data center management.
We present a best-of-both-worlds algorithm that obtains a robust adversarial performance while simultaneously achieving a near-optimal performance.
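The SOQO cost structure described above can be sketched for the scalar case. This is a minimal illustration under assumed notation (hitting cost $\frac{\alpha_t}{2}(x - v_t)^2$ with minimizer $v_t$, unit switching-cost weight); it is not the paper's algorithm.

```python
def soqo_round_cost(x_t, x_prev, alpha_t, v_t):
    """Illustrative per-round SOQO cost (scalar case): a quadratic
    hitting cost around v_t plus a squared cost for moving from x_prev."""
    hitting = 0.5 * alpha_t * (x_t - v_t) ** 2
    switching = (x_t - x_prev) ** 2
    return hitting + switching

# Setting the derivative to zero, the per-round minimizer balances
# tracking v_t against staying near the previous action:
#   x* = (alpha_t * v_t + 2 * x_prev) / (alpha_t + 2)
```

The closed-form minimizer makes the tension explicit: a large `alpha_t` pulls the action toward the hitting-cost minimizer, while the switching term damps movement away from the previous action.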
arXiv Detail & Related papers (2023-10-31T22:59:23Z) - Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can enjoy the same optimality in the $H$ dependence as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Gradient-Free Methods for Deterministic and Stochastic Nonsmooth
Nonconvex Optimization [94.19177623349947]
Nonsmooth nonconvex optimization problems emerge in machine learning and business decision making.
Two core challenges impede the development of efficient methods with finite-time convergence guarantees.
Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results.
arXiv Detail & Related papers (2022-09-12T06:53:24Z) - T*$\varepsilon$ -- Bounded-Suboptimal Efficient Motion Planning for
Minimum-Time Planar Curvature-Constrained Systems [7.277760003553328]
We consider the problem of finding collision-free paths for curvature-constrained systems in the presence of obstacles.
We show that by finding bounded-suboptimal solutions, one can dramatically reduce the number of time-optimal transitions used.
arXiv Detail & Related papers (2022-04-04T17:38:36Z) - A unified algorithm framework for mean-variance optimization in
discounted Markov decision processes [7.510742715895749]
This paper studies the risk-averse mean-variance optimization in infinite-horizon discounted Markov decision processes (MDPs)
We introduce a pseudo mean to transform the intractable MDP into a standard one with a redefined reward function in standard form.
We propose a unified algorithm framework with a bilevel optimization structure for the discounted mean-variance optimization.
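A common pseudo-mean construction in the mean-variance MDP literature penalizes squared deviation of the reward from a fixed pseudo mean $y$, which makes the inner problem a standard MDP once $y$ is fixed. The sketch below assumes that construction and the symbol names `y` and `lam`; the paper's exact redefined reward may differ.

```python
def pseudo_mean_reward(r, y, lam):
    """Redefined reward for a fixed pseudo mean y: the variance penalty
    becomes a per-transition squared deviation from y, so for fixed y the
    inner problem is a standard (discounted) MDP.  lam weights the
    risk-aversion term."""
    return r - lam * (r - y) ** 2
```

In a bilevel scheme of this shape, an outer loop updates the pseudo mean while an inner loop solves the resulting standard discounted MDP with the redefined reward.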
arXiv Detail & Related papers (2022-01-15T02:19:56Z) - Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.