Black-Box Optimization with Implicit Constraints for Public Policy
- URL: http://arxiv.org/abs/2310.18449v5
- Date: Wed, 22 Jan 2025 05:29:12 GMT
- Title: Black-Box Optimization with Implicit Constraints for Public Policy
- Authors: Wenqian Xing, JungHo Lee, Chong Liu, Shixiang Zhu
- Abstract summary: This paper introduces a novel BBO framework, termed the Conditional And Generative Black-box Optimization (CageBO). CageBO efficiently handles the implicit constraints often found in public policy applications. Our results reveal that CageBO offers notable improvements in performance and efficiency compared to the baselines.
- Score: 7.905659620019301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Black-box optimization (BBO) has become increasingly relevant for tackling complex decision-making problems, especially in public policy domains such as police redistricting. However, its broader application in public policymaking is hindered by the complexity of defining feasible regions and the high dimensionality of decisions. This paper introduces a novel BBO framework, termed the Conditional And Generative Black-box Optimization (CageBO). This approach leverages a conditional variational autoencoder to learn the distribution of feasible decisions, enabling a two-way mapping between the original decision space and a simplified, constraint-free latent space. CageBO efficiently handles the implicit constraints often found in public policy applications, allowing for optimization in the latent space while evaluating objectives in the original space. We validate our method through a case study on large-scale police redistricting problems in Atlanta, Georgia. Our results reveal that CageBO offers notable improvements in performance and efficiency compared to the baselines.
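The abstract outlines the core mechanism: a conditional variational autoencoder maps between the original decision space and a constraint-free latent space, surrogate-based optimization runs in the latent space, and the black-box objective is evaluated on decoded decisions in the original space. Below is a minimal sketch of that loop under stated assumptions: a toy tanh decoder stands in for the trained CVAE decoder, a Gaussian-process surrogate with a UCB acquisition stands in for the paper's surrogate model, and the objective is synthetic. None of the names, dimensions, or modeling choices are taken from the authors' implementation.

```python
# Minimal sketch of latent-space black-box optimization in the spirit of CageBO.
# Assumption: the CVAE decoder has already been trained on feasible decisions; a
# toy tanh decoder is used here as a placeholder, and the objective is synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
LATENT_DIM, DECISION_DIM = 2, 10   # constraint-free latent space vs. original space

# Placeholder for the trained CVAE decoder: in CageBO this maps latent codes to
# (approximately) feasible decisions; here it is just a fixed random projection.
W = rng.normal(size=(DECISION_DIM, LATENT_DIM))
def decode(z):
    return np.tanh(W @ z)

# Black-box objective, always evaluated in the original decision space.
def objective(x):
    return -float(np.sum((x - 0.3) ** 2))

# Bayesian-optimization loop over the latent space.
Z = rng.uniform(-2.0, 2.0, size=(5, LATENT_DIM))            # initial latent designs
y = np.array([objective(decode(z)) for z in Z])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    gp.fit(Z, y)
    cand = rng.uniform(-2.0, 2.0, size=(512, LATENT_DIM))   # candidate latent codes
    mu, sigma = gp.predict(cand, return_std=True)
    z_next = cand[np.argmax(mu + 2.0 * sigma)]              # UCB acquisition
    y_next = objective(decode(z_next))                      # evaluate decoded decision
    Z = np.vstack([Z, z_next])
    y = np.append(y, y_next)

print("best objective value found:", y.max())
```

The point being illustrated is only the separation of concerns: the surrogate and acquisition never see the implicit constraints, because every latent candidate is decoded into a decision that the generative model was trained to keep feasible.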
Related papers
- Feasibility-Driven Trust Region Bayesian Optimization [0.048748194765816946]
FuRBO iteratively defines a trust region from which the next candidate solution is selected. We empirically demonstrate the effectiveness of FuRBO through extensive testing on the full BBOB-constrained benchmark suite.
arXiv Detail & Related papers (2025-06-17T15:16:22Z) - LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions [0.0]
We propose the LABCAT algorithm, which extends trust-region-based BO.
We show that the algorithm outperforms several state-of-the-art BO and other black-box optimization algorithms.
arXiv Detail & Related papers (2023-11-19T13:56:24Z) - Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW).
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z) - Analysis of modular CMA-ES on strict box-constrained problems in the SBOX-COST benchmarking suite [0.0]
Box-constraints limit the domain of decision variables and are common in real-world optimization problems.
Existing benchmark suites, such as COCO/BBOB, allow the evaluation of infeasible solutions.
This paper presents an initial study on the strict-box-constrained benchmarking suite (SBOX-COST).
We find that, contrary to what may be expected, handling box-constraints by saturation is not always better than not handling them (a minimal sketch of saturation appears after this list).
arXiv Detail & Related papers (2023-05-24T12:37:03Z) - Trust-Region-Free Policy Optimization for Stochastic Policies [60.52463923712565]
We show that the trust region constraint over policies can be safely substituted by a trust-region-free constraint without compromising the underlying monotonic improvement guarantee.
We call the resulting algorithm Trust-REgion-Free Policy Optimization (TREFree), as it is explicitly free of any trust region constraints.
arXiv Detail & Related papers (2023-02-15T23:10:06Z) - Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions [5.820284464296154]
Trust Region Policy Optimization is a popular approach to stabilize the policy updates.
We propose a novel algorithm - Optimal Transport Trust Region Policy Optimization (OT-TRPO) - for continuous state-action spaces.
Our results show that optimal transport discrepancies can offer an advantage over state-of-the-art approaches.
arXiv Detail & Related papers (2022-10-20T10:04:35Z) - Distributionally-Constrained Policy Optimization via Unbalanced Optimal Transport [15.294456568539148]
We formulate policy optimization as unbalanced optimal transport over the space of occupancy measures.
We propose a general purpose RL objective based on Bregman divergence and optimize it using Dykstra's algorithm.
arXiv Detail & Related papers (2021-02-15T23:04:37Z) - Optimization Issues in KL-Constrained Approximate Policy Iteration [48.24321346619156]
Many reinforcement learning algorithms can be seen as versions of approximate policy iteration (API).
While standard API often performs poorly, it has been shown that learning can be stabilized by regularizing each policy update by the KL-divergence to the previous policy.
Popular practical algorithms such as TRPO, MPO, and VMPO replace regularization by a constraint on KL-divergence of consecutive policies.
arXiv Detail & Related papers (2021-02-11T19:35:33Z) - Adaptive Discretization for Adversarial Lipschitz Bandits [85.39106976861702]
Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces.
A central theme here is the adaptive discretization of the action space, which gradually "zooms in" on the more promising regions.
We provide the first algorithm for adaptive discretization in the adversarial version, and derive instance-dependent regret bounds.
arXiv Detail & Related papers (2020-06-22T16:06:25Z) - Deep Reinforcement Learning with Robust and Smooth Policy [90.78795857181727]
We propose to learn a smooth policy that behaves smoothly with respect to states.
We develop a new framework -- Smooth Regularized Reinforcement Learning (SR^2L) -- where the policy is trained with smoothness-inducing regularization.
Such regularization effectively constrains the search space, and enforces smoothness in the learned policy.
arXiv Detail & Related papers (2020-03-21T00:10:29Z) - Public Bayesian Persuasion: Being Almost Optimal and Almost Persuasive [57.47546090379434]
We study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary sender's utility functions.
We provide a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.
arXiv Detail & Related papers (2020-02-12T18:59:18Z) - Improved Algorithms for Conservative Exploration in Bandits [113.55554483194832]
We study the conservative learning problem in the contextual linear bandit setting and introduce a novel algorithm, the Conservative Constrained LinUCB (CLUCB2).
We derive regret bounds for CLUCB2 that match existing results and empirically show that it outperforms state-of-the-art conservative bandit algorithms in a number of synthetic and real-world problems.
arXiv Detail & Related papers (2020-02-08T19:35:01Z)
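The SBOX-COST entry above refers to handling box constraints by saturation, i.e., clipping any coordinate of a candidate solution that leaves the box back to the nearest bound before evaluation. A minimal sketch of that idea follows; the bounds, the candidate, and the sphere objective are illustrative placeholders, not part of the benchmark suite's API.

```python
import numpy as np

def saturate(x, lower, upper):
    """Handle box constraints by saturation: clip each coordinate of a
    candidate solution to the nearest bound of the box."""
    return np.clip(x, lower, upper)

# Illustrative bounds and an infeasible candidate proposed by some optimizer.
lower, upper = np.full(3, -5.0), np.full(3, 5.0)
candidate = np.array([-7.2, 0.4, 6.1])        # first and last coordinates leave the box

def sphere(x):                                # toy objective for the example
    return float(np.sum(x ** 2))

feasible = saturate(candidate, lower, upper)  # -> [-5.0, 0.4, 5.0]
print(feasible, sphere(feasible))
```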
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.