End-to-End Stochastic Optimization with Energy-Based Model
- URL: http://arxiv.org/abs/2211.13837v1
- Date: Fri, 25 Nov 2022 00:14:12 GMT
- Title: End-to-End Stochastic Optimization with Energy-Based Model
- Authors: Lingkai Kong, Jiaming Cui, Yuchen Zhuang, Rui Feng, B. Aditya Prakash,
Chao Zhang
- Abstract summary: Decision-focused learning (DFL) was recently proposed for stochastic optimization problems that involve unknown parameters.
We propose SO-EBM, a general and efficient DFL method for stochastic optimization using energy-based models.
- Score: 18.60842637575249
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Decision-focused learning (DFL) was recently proposed for stochastic
optimization problems that involve unknown parameters. By integrating
predictive modeling with an implicitly differentiable optimization layer, DFL
has shown superior performance to the standard two-stage predict-then-optimize
pipeline. However, most existing DFL methods are only applicable to convex
problems or a subset of nonconvex problems that can be easily relaxed to convex
ones. Further, they can be inefficient in training due to the requirement of
solving and differentiating through the optimization problem in every training
iteration. We propose SO-EBM, a general and efficient DFL method for stochastic
optimization using energy-based models. Instead of relying on KKT conditions to
induce an implicit optimization layer, SO-EBM explicitly parameterizes the
original optimization problem using a differentiable optimization layer based
on energy functions. To better approximate the optimization landscape, we
propose a coupled training objective that uses a maximum likelihood loss to
capture the optimum location and a distribution-based regularizer to capture
the overall energy landscape. Finally, we propose an efficient training
procedure for SO-EBM with a self-normalized importance sampler based on a
Gaussian mixture proposal. We evaluate SO-EBM in three applications: power
scheduling, COVID-19 resource allocation, and non-convex adversarial security
game, demonstrating the effectiveness and efficiency of SO-EBM.
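The abstract describes three ingredients: an energy-based reparameterization of the optimization layer, a maximum-likelihood training loss, and a self-normalized importance sampler with a Gaussian mixture proposal. The sketch below illustrates how these pieces can fit together in PyTorch under simplifying assumptions; the names (`EnergyLayer`, `mixture_proposal`, `snis_nll`) and the MLP energy are illustrative choices, not the authors' implementation, and the distribution-based regularizer over the full energy landscape is omitted.

```python
# Illustrative sketch (not the paper's code): an energy-based decision layer
# p_theta(a | x) ∝ exp(-E_theta(a; x)) trained by maximum likelihood, with the
# intractable log-partition function estimated by self-normalized importance
# sampling under a Gaussian mixture proposal centered on a rough decision guess.
import math
import torch
import torch.nn as nn
import torch.distributions as D


class EnergyLayer(nn.Module):
    """E_theta(a; x): scalar energy of decision a given context features x."""

    def __init__(self, x_dim: int, a_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + a_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, a], dim=-1)).squeeze(-1)


def mixture_proposal(center: torch.Tensor, scales=(0.1, 0.5, 1.0)):
    """Gaussian mixture proposal q(a) with components of increasing spread."""
    k = len(scales)
    mix = D.Categorical(torch.full((k,), 1.0 / k))
    comp = D.Independent(
        D.Normal(center.expand(k, -1),
                 torch.tensor(scales).unsqueeze(-1).expand(k, center.shape[-1])), 1)
    return D.MixtureSameFamily(mix, comp)


def snis_nll(energy: EnergyLayer, x, a_star, center, n_samples: int = 256):
    """-log p_theta(a_star | x) with a self-normalized IS estimate of log Z(x)."""
    q = mixture_proposal(center)
    a = q.sample((n_samples,))                                   # (S, a_dim)
    log_w = -energy(x.expand(n_samples, -1), a) - q.log_prob(a)  # log of e^{-E}/q
    log_z = torch.logsumexp(log_w, dim=0) - math.log(n_samples)  # log Z estimate
    return energy(x, a_star) + log_z                             # MLE loss term
```

In a training loop, `center` could come from a cheap point estimate of the optimal decision and the loss would be averaged over the batch; at test time the decision can be taken as the minimizer (or a low-temperature sample) of the learned energy.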
Related papers
- A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models [24.185245582500876]
We introduce FISTAPruner, the first post-training pruner based on convex optimization models and algorithms.
FISTAPruner incorporates an intra-layer cumulative error correction mechanism and supports parallel pruning.
We evaluate FISTAPruner on models such as OPT, LLaMA, LLaMA-2, and LLaMA-3 with 125M to 70B parameters under unstructured and 2:4 semi-structured sparsity.
arXiv Detail & Related papers (2024-08-07T12:33:46Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
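The entry above combines a preference-optimization loss with a supervised (SFT) loss that acts as a regularizer. A minimal, generic sketch of that recipe, using a DPO-style preference term; the function name and the weight `lambda_sft` are illustrative and not necessarily the paper's exact objective:

```python
# Generic sketch: preference-optimization loss plus an SFT regularizer.
# Not the paper's exact objective; log-probs are assumed summed over tokens.
import torch
import torch.nn.functional as F


def preference_plus_sft_loss(
    logp_chosen, logp_rejected,          # policy log-probs of chosen / rejected responses
    ref_logp_chosen, ref_logp_rejected,  # frozen reference-model log-probs
    beta: float = 0.1,
    lambda_sft: float = 1.0,
):
    # DPO-style preference term: push the policy's margin beyond the reference's margin.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    pref_loss = -F.logsigmoid(beta * margin).mean()

    # SFT term: plain negative log-likelihood of the chosen (preferred) responses,
    # acting as the regularizer that discourages drifting off the data distribution.
    sft_loss = -logp_chosen.mean()

    return pref_loss + lambda_sft * sft_loss
```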
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
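The OWA objective referenced in the entry above applies weights to the sorted objective values rather than to fixed objectives, which is what makes it fairness-oriented but nondifferentiable at ties. A small illustrative implementation (the example weights and costs are placeholders):

```python
# Ordered Weighted Averaging (OWA): weights are applied to the *sorted* objective
# values, so decreasing weights emphasize the worst-off objectives (fairness).
import torch


def owa(values: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """values: (..., m) objective values; weights: (m,) nonnegative, summing to 1."""
    sorted_vals, _ = torch.sort(values, dim=-1, descending=True)
    return (sorted_vals * weights).sum(dim=-1)


# Example: weights concentrated on the largest (worst) cost push OWA toward max().
costs = torch.tensor([0.2, 0.9, 0.5])
print(owa(costs, torch.tensor([0.6, 0.3, 0.1])))  # fairness-weighted aggregate
```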
- Multi-fidelity Constrained Optimization for Stochastic Black Box Simulators [1.6385815610837167]
We introduce the algorithm Scout-Nd (Stochastic Constrained Optimization for N dimensions) to tackle the issues mentioned earlier.
Scout-Nd efficiently estimates the gradient, reduces the noise of the gradient estimator, and applies multi-fidelity schemes to further reduce computational effort.
We validate our approach on standard benchmarks, demonstrating its effectiveness in optimizing parameters and showing better performance than existing methods.
arXiv Detail & Related papers (2023-11-25T23:36:38Z)
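The Scout-Nd entry above centers on low-noise gradient estimation for a stochastic black-box simulator. As background only, not the Scout-Nd algorithm itself, here is a basic score-function (REINFORCE) gradient estimator with a mean baseline for variance reduction, assuming a Gaussian search distribution:

```python
# Generic sketch of a score-function (REINFORCE) gradient estimator with a
# baseline for variance reduction; illustrative, not the Scout-Nd algorithm.
import numpy as np


def score_function_grad(simulator, mean, std, n_samples=64, rng=None):
    """Estimate d/d(mean) of E_{x ~ N(mean, std^2)}[simulator(x)] for a black-box simulator."""
    rng = rng or np.random.default_rng()
    x = rng.normal(mean, std, size=(n_samples,) + np.shape(mean))
    f = np.array([simulator(xi) for xi in x])
    baseline = f.mean()              # simple control variate to reduce estimator noise
    score = (x - mean) / std**2      # gradient of log N(x; mean, std) w.r.t. mean
    grad = ((f - baseline)[:, None] * score.reshape(n_samples, -1)).mean(axis=0)
    return grad.reshape(np.shape(mean))
```

A multi-fidelity scheme, as mentioned in the entry, would roughly correspond to evaluating most samples on a cheaper low-fidelity simulator and only a few on the expensive one.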
- Parallel Bayesian Optimization Using Satisficing Thompson Sampling for Time-Sensitive Black-Box Optimization [0.0]
We propose satisficing Thompson sampling-based parallel BO approaches, including synchronous and asynchronous versions.
We shift the target from an optimal solution to a satisficing solution that is easier to learn.
The effectiveness of the proposed methods is demonstrated on a fast-charging design problem of Lithium-ion batteries.
arXiv Detail & Related papers (2023-10-19T07:03:51Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes algorithms for federated conditional stochastic optimization.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- Learning Implicit Priors for Motion Optimization [105.11889448885226]
Energy-based Models (EBMs) represent expressive probability distributions.
We present the modeling and algorithmic choices required to adapt EBMs to motion optimization.
arXiv Detail & Related papers (2022-04-11T19:14:54Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose stocBiO, a novel stochastic bilevel optimizer with a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- BOSH: Bayesian Optimization by Sampling Hierarchically [10.10241176664951]
We propose a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations.
We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyperparameter tuning tasks.
arXiv Detail & Related papers (2020-07-02T07:35:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.