Related papers: Offline Model-Based Optimization via Policy-Guided Gradient Search

Offline Model-Based Optimization via Policy-Guided Gradient Search

URL: http://arxiv.org/abs/2405.05349v1
Date: Wed, 8 May 2024 18:27:37 GMT
Title: Offline Model-Based Optimization via Policy-Guided Gradient Search
Authors: Yassine Chemingui, Aryan Deshwal, Trong Nghia Hoang, Janardhan Rao Doppa,
Abstract summary: We introduce a new learning-to-search- gradient perspective for offline optimization by reformulating it as an offline reinforcement learning problem. Our proposed policy-guided search approach explicitly learns the best policy for a given surrogate model created from the offline data.
Score: 30.87992788876113
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Offline optimization is an emerging problem in many experimental engineering domains including protein, drug or aircraft design, where online experimentation to collect evaluation data is too expensive or dangerous. To avoid that, one has to optimize an unknown function given only its offline evaluation at a fixed set of inputs. A naive solution to this problem is to learn a surrogate model of the unknown function and optimize this surrogate instead. However, such a naive optimizer is prone to erroneous overestimation of the surrogate (possibly due to over-fitting on a biased sample of function evaluation) on inputs outside the offline dataset. Prior approaches addressing this challenge have primarily focused on learning robust surrogate models. However, their search strategies are derived from the surrogate model rather than the actual offline data. To fill this important gap, we introduce a new learning-to-search perspective for offline optimization by reformulating it as an offline reinforcement learning problem. Our proposed policy-guided gradient search approach explicitly learns the best policy for a given surrogate model created from the offline data. Our empirical results on multiple benchmarks demonstrate that the learned optimization policy can be combined with existing offline surrogates to significantly improve the optimization performance.

Related papers

Offline Model-Based Optimization: Comprehensive Review [61.91350077539443]
offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. Recent advances in model-based optimization have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review.
arXiv Detail & Related papers (2025-03-21T16:35:02Z)
Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques [8.750390242872138]
We develop a model-agnostic approach to offline optimization. We show that reducing surrogate sharpness on the offline dataset provably reduces its generalized sharpness on unseen data. Our analysis extends existing theories from bounding generalized prediction loss (on unseen data) with loss sharpness to bounding the worst-case generalized surrogate sharpness with its empirical estimate on training data.
arXiv Detail & Related papers (2025-03-06T09:24:23Z)
Boosting Offline Optimizers with Surrogate Sensitivity [8.750390242872138]
offline optimization is important in numerous material engineering domains where online experimentation is too expensive and needs to be replaced by an in silico surrogate of the black-box function. Although such a surrogate can be learned from offline data, its prediction might not be reliable outside the offline data regime. This raises the following questions: (1) how to regulate the sensitivity of a surrogate model; and (2) whether an offlineinformed with such a surrogate will lead to better performance.
arXiv Detail & Related papers (2025-03-06T07:50:32Z)
Learning Surrogates for Offline Black-Box Optimization via Gradient Matching [29.831762991506242]
We present a theoretical framework to understand offline black-box optimization, by explicitly bounding the optimization quality based on how well the surrogate matches the latent gradient field that underlines the offline data. Inspired by our principled analysis, we propose a black-box gradient matching algorithm to create effective surrogate models for offline optimization, improving over prior approaches on various real-world benchmarks.
arXiv Detail & Related papers (2025-02-26T22:35:54Z)
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization [78.82586283794886]
We present a new offline alignment algorithm, $chi2$-Preference Optimization ($chi$PO) $chi$PO implements the principle of pessimism in the face of uncertainty via regularization. It is provably robust to overoptimization and achieves sample-complexity guarantees based on single-policy concentrability.
arXiv Detail & Related papers (2024-07-18T11:08:40Z)
Offline Multi-Objective Optimization [23.543056729281695]
offline optimization aims to maximize a black-box objective function with a static dataset and has wide applications. We propose a first benchmark for offline MOO, covering a range of problems from synthetic to real-world tasks. Empirical results show improvements over the best value of the training set, demonstrating the effectiveness of offline MOO methods.
arXiv Detail & Related papers (2024-06-06T03:35:09Z)
Bayesian Design Principles for Offline-to-Online Reinforcement Learning [50.97583504192167]
offline-to-online fine-tuning is crucial for real-world applications where exploration can be costly or unsafe. In this paper, we tackle the dilemma of offline-to-online fine-tuning: if the agent remains pessimistic, it may fail to learn a better policy, while if it becomes optimistic directly, performance may suffer from a sudden drop. We show that Bayesian design principles are crucial in solving such a dilemma.
arXiv Detail & Related papers (2024-05-31T16:31:07Z)
From Function to Distribution Modeling: A PAC-Generative Approach to Offline Optimization [30.689032197123755]
This paper considers the problem of offline optimization, where the objective function is unknown except for a collection of offline" data examples. Instead of learning and then optimizing the unknown objective function, we take on a less intuitive but more direct view that optimization can be thought of as a process of sampling from a generative model.
arXiv Detail & Related papers (2024-01-04T01:32:50Z)
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization. We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
PASTA: Pessimistic Assortment Optimization [25.51792135903357]
We consider a class of assortment optimization problems in an offline data-driven setting. We propose an algorithm referred to as Pessimistic ASsortment opTimizAtion (PASTA) based on the principle of pessimism.
arXiv Detail & Related papers (2023-02-08T01:11:51Z)
Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online. We extensively ablate these design choices, demonstrating the key factors that most affect performance. We see that correct application of these simple recommendations can provide a $mathbf2.5times$ improvement over existing approaches.
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
offline data-driven decision-making involves synthesizing optimized decisions with no active interaction. A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good. In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [59.469401906712555]
We present an offline reinforcement learning algorithm that prevents overestimation in a more principled way. Our algorithm, OptiDICE, directly estimates the stationary distribution corrections of the optimal policy. We show that OptiDICE performs competitively with the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T00:43:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.