Adaptive Decision-Making with Constraints and Dependent Losses:
Performance Guarantees and Applications to Online and Nonlinear
Identification
- URL: http://arxiv.org/abs/2304.03321v1
- Date: Thu, 6 Apr 2023 18:32:26 GMT
- Title: Adaptive Decision-Making with Constraints and Dependent Losses:
Performance Guarantees and Applications to Online and Nonlinear
Identification
- Authors: Michael Muehlebach
- Abstract summary: We consider adaptive decision-making problems where an agent optimizes a cumulative performance objective by repeatedly choosing among a finite set of options.
Our algorithm and our analysis are instance-dependent, that is, suboptimal choices of the environment are exploited and reflected in our regret bounds.
The performance of the resulting algorithms is highlighted in two numerical examples.
- Score: 5.787117733071415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider adaptive decision-making problems where an agent optimizes a
cumulative performance objective by repeatedly choosing among a finite set of
options. Compared to the classical prediction-with-expert-advice set-up, we
consider situations where losses are constrained and derive algorithms that
exploit the additional structure in optimal and computationally efficient ways.
Our algorithm and our analysis are instance-dependent, that is, suboptimal
choices of the environment are exploited and reflected in our regret bounds.
The constraints handle general dependencies between losses (even across time),
and are flexible enough to also account for a loss budget, which the
environment is not allowed to exceed. The performance of the resulting
algorithms is highlighted in two numerical examples, which include a nonlinear
and online system identification task.
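The prediction-with-expert-advice setting referenced above can be illustrated with a minimal exponential-weights (Hedge) sketch. This is a generic textbook baseline, not the paper's constrained, instance-dependent algorithm, and the `budget` check is a simplified stand-in for the loss-budget constraint described in the abstract:

```python
import math

def hedge(n_experts, loss_rounds, eta=0.5, budget=None):
    """Exponential-weights (Hedge) learner over a finite set of options.

    loss_rounds: iterable of loss vectors with one loss in [0, 1] per option.
    budget: optional cap on the cumulative expected loss; purely
            illustrative of a "loss budget" the environment may not exceed.
    """
    weights = [1.0] * n_experts
    probs = [1.0 / n_experts] * n_experts
    total_loss = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        probs = [w / z for w in weights]
        # Expected loss of the randomized choice this round.
        total_loss += sum(p * l for p, l in zip(probs, losses))
        if budget is not None and total_loss > budget:
            break  # cumulative loss budget exhausted
        # Multiplicative update: down-weight options that incurred high loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return probs, total_loss
```

With a persistently better option, the weight distribution concentrates on it and the cumulative expected loss stays bounded, which is the qualitative behavior the regret bounds formalize.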
Related papers
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU are naturally modeled as Multistage Problems (MSPs) but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach Two-Stage General Decision Rules (TS-GDR) to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
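The idea of likelihood-ratio confidence sequences can be sketched for a Bernoulli mean. This is a generic predictable-plug-in construction (valid uniformly over time by Ville's inequality), not the cited paper's specific method; the function name and grid discretization are illustrative choices:

```python
import math

def bernoulli_lr_confidence_set(xs, alpha=0.05, grid_size=200):
    """Likelihood-ratio confidence sequence for a Bernoulli mean.

    The numerator uses a predictable (Laplace-smoothed) running estimate,
    so the ratio is a nonnegative martingale under the true parameter;
    Ville's inequality then makes the set valid at level 1 - alpha
    simultaneously over all sample sizes.
    """
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    log_lr = {p: 0.0 for p in grid}
    ones, n = 0, 0
    for x in xs:
        # Predictable estimate: depends only on data seen before x.
        p_hat = (ones + 1) / (n + 2)
        for p in grid:
            num = p_hat if x == 1 else 1 - p_hat
            den = p if x == 1 else 1 - p
            log_lr[p] += math.log(num) - math.log(den)
        ones += x
        n += 1
    # Keep p while the likelihood ratio against p stays below 1/alpha.
    thresh = math.log(1 / alpha)
    return [p for p in grid if log_lr[p] < thresh]
```

Parameters far from the empirical mean accumulate a large likelihood ratio and are rejected, while a neighborhood of the truth stays in the set at every time step.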
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
- Do We Really Need to Use Constraint Violation in Constrained Evolutionary Multi-Objective Optimization? [13.833668582211876]
Constraint violation has been a building block to design evolutionary multi-objective optimization algorithms.
This paper develops the corresponding variants that replace the constraint violation by a crisp value.
From our experiments on both synthetic and real-world benchmark test problems, we find that the performance of the selected algorithms has not been significantly influenced.
arXiv Detail & Related papers (2022-05-28T06:29:07Z)
- Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z)
- Algorithmic Challenges in Ensuring Fairness at the Time of Decision [6.228560624452748]
Algorithmic decision-making in societal contexts can be framed as optimization under bandit feedback.
Recent lawsuits accuse companies that deploy algorithmic pricing practices of price gouging.
We introduce the well-studied fairness notion of envy-freeness within the context of convex optimization.
arXiv Detail & Related papers (2021-03-16T19:06:28Z)
- Advancing Trajectory Optimization with Approximate Inference: Exploration, Covariance Control and Adaptive Risk [29.811633555275666]
We look at the input inference for control (i2c) algorithm, and derive three key characteristics that enable advanced trajectory optimization.
An 'expert' linear Gaussian controller combines the benefits of open-loop optima and closed-loop variance reduction when optimizing for nonlinear systems.
arXiv Detail & Related papers (2021-03-10T19:52:31Z)
- Online Optimization and Ambiguity-based Learning of Distributionally Uncertain Dynamic Systems [1.6709415233613623]
This paper proposes a novel approach to construct data-driven online solutions to optimization problems (P) subject to a class of distributionally uncertain dynamical systems.
The introduced framework allows for the simultaneous learning of distributional system uncertainty via a parameterized, control-dependent ambiguity set.
We also introduce an online version of Nesterov's accelerated-gradient algorithm, and analyze its performance to solve this class of problems via dissipativity theory.
arXiv Detail & Related papers (2021-02-18T01:49:06Z)
- Optimizing Wireless Systems Using Unsupervised and Reinforced-Unsupervised Deep Learning [96.01176486957226]
Resource allocation and transceivers in wireless networks are usually designed by solving optimization problems.
In this article, we introduce unsupervised and reinforced-unsupervised learning frameworks for solving both variable and functional optimization problems.
arXiv Detail & Related papers (2020-01-03T11:01:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.