Preference Exploration for Efficient Bayesian Optimization with Multiple
Outcomes
- URL: http://arxiv.org/abs/2203.11382v1
- Date: Mon, 21 Mar 2022 23:02:50 GMT
- Title: Preference Exploration for Efficient Bayesian Optimization with Multiple
Outcomes
- Authors: Zhiyuan Jerry Lin, Raul Astudillo, Peter I. Frazier, Eytan Bakshy
- Abstract summary: We consider optimization of experiments that generate vector-valued outcomes over which a decision-maker has preferences.
These preferences are encoded by a utility function that is not known in closed form but can be estimated by asking the DM to express preferences over pairs of outcome vectors.
We develop a novel framework that alternates between interactive real-time preference learning with the DM and Bayesian optimization with a learned composite model of DM utility and outcomes.
- Score: 17.300690315775572
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider Bayesian optimization of expensive-to-evaluate experiments that
generate vector-valued outcomes over which a decision-maker (DM) has
preferences. These preferences are encoded by a utility function that is not
known in closed form but can be estimated by asking the DM to express
preferences over pairs of outcome vectors. To address this problem, we develop
Bayesian optimization with preference exploration, a novel framework that
alternates between interactive real-time preference learning with the DM via
pairwise comparisons between outcomes, and Bayesian optimization with a learned
compositional model of DM utility and outcomes. Within this framework, we
propose preference exploration strategies specifically designed for this task,
and demonstrate their performance via extensive simulation studies.
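The alternation the abstract describes can be illustrated with a toy sketch. This is not the authors' implementation (the paper uses Gaussian-process models of outcomes and utility with tailored preference-exploration acquisitions); here the outcome function, the hidden linear DM utility, and the perceptron-style preference fit are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: designs x in [0,1]^2 produce two outcomes; the DM's true
# utility over outcome vectors is hidden from the optimizer.
def outcomes(x):
    return np.array([np.sin(3 * x[0]), np.cos(3 * x[1])])

true_w = np.array([0.7, 0.3])            # hidden DM utility weights
def dm_prefers(y1, y2):                  # DM answers a pairwise query
    return true_w @ y1 > true_w @ y2

# Preference exploration: estimate utility weights from pairwise answers
# (a crude stand-in for the paper's Bayesian utility model).
def fit_utility(pairs, answers):
    # Fit a linear score so that, for each answered pair, the preferred
    # outcome vector outscores the other (perceptron-style updates).
    w = np.ones(2) / 2
    for _ in range(50):
        for (y1, y2), first_wins in zip(pairs, answers):
            diff = (y1 - y2) if first_wins else (y2 - y1)
            if w @ diff <= 0:
                w += 0.1 * diff
    return w / np.linalg.norm(w)

# Stage 1: ask the DM pairwise comparison queries over outcome vectors.
pairs, answers = [], []
for _ in range(20):
    y1, y2 = outcomes(rng.random(2)), outcomes(rng.random(2))
    pairs.append((y1, y2))
    answers.append(dm_prefers(y1, y2))
w_hat = fit_utility(pairs, answers)

# Stage 2: optimize the learned composite objective utility(outcomes(x)).
candidates = rng.random((500, 2))
scores = np.array([w_hat @ outcomes(x) for x in candidates])
best = candidates[scores.argmax()]
print(best, true_w @ outcomes(best))
```

In the toy example the design chosen under the estimated utility also scores highly under the DM's hidden utility, which is the point of alternating preference queries with optimization.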
Related papers
- Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.
We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
- Preference-Guided Diffusion for Multi-Objective Offline Optimization [64.08326521234228]
We propose a preference-guided diffusion model for offline multi-objective optimization.
Our guidance is a preference model trained to predict the probability that one design dominates another.
Our results highlight the effectiveness of classifier-guided diffusion models in generating diverse and high-quality solutions.
arXiv Detail & Related papers (2025-03-21T16:49:38Z)
- Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment [74.25832963097658]
Multi-Objective Alignment (MOA) aims to align responses with multiple human preference objectives.
We find that DPO-based MOA approaches suffer from widespread preference conflicts in the data.
arXiv Detail & Related papers (2025-02-20T08:27:00Z)
- Batched Bayesian optimization with correlated candidate uncertainties [44.38372821900645]
We propose qPO (multipoint Probability of Optimality), an acquisition strategy for discrete optimization motivated by pure exploitation.
We apply our method to the model-guided exploration of large chemical libraries and provide empirical evidence that it performs better than or on par with state-of-the-art methods in batched Bayesian optimization.
arXiv Detail & Related papers (2024-10-08T20:13:12Z)
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC.
We increase the consistency and informativeness of the pairwise preference signals through targeted modifications.
We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z)
- Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z)
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z)
- Enhanced Bayesian Optimization via Preferential Modeling of Abstract Properties [49.351577714596544]
We propose a human-AI collaborative Bayesian framework to incorporate expert preferences about unmeasured abstract properties into surrogate modeling.
We provide an efficient strategy that can also handle any incorrect/misleading expert bias in preferential judgments.
arXiv Detail & Related papers (2024-02-27T09:23:13Z)
- Data-Efficient Interactive Multi-Objective Optimization Using ParEGO [6.042269506496206]
Multi-objective optimization seeks to identify a set of non-dominated solutions that provide optimal trade-offs among competing objectives.
In practical applications, decision-makers (DMs) will select a single solution that aligns with their preferences to be implemented.
We propose two novel algorithms that efficiently locate the most preferred region of the Pareto front in expensive-to-evaluate problems.
arXiv Detail & Related papers (2024-01-12T15:55:51Z)
- Multi-Objective Bayesian Optimization with Active Preference Learning [18.066263838953223]
We propose a Bayesian optimization (BO) approach to identifying the most preferred solution in a multi-objective optimization (MOO) problem.
To minimize the interaction cost with the decision maker (DM), we also propose an active learning strategy for the preference estimation.
arXiv Detail & Related papers (2023-11-22T15:24:36Z)
- Differentiable Multi-Target Causal Bayesian Experimental Design [43.76697029708785]
We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting.
Existing methods rely on greedy approximations to construct a batch of experiments.
We propose a conceptually simple end-to-end gradient-based optimization procedure to acquire a set of optimal intervention target-state pairs.
arXiv Detail & Related papers (2023-02-21T11:32:59Z)
- Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
We propose a novel approach for developing a polynomial chaos expansion (PCE) surrogate model for the design utility function.
This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
arXiv Detail & Related papers (2020-08-10T09:42:59Z)
- Preferential Batch Bayesian Optimization [16.141259199997005]
Preferential batch Bayesian optimization (PBBO) is a new framework that allows finding the optimum of a latent function of interest.
We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization.
arXiv Detail & Related papers (2020-03-25T14:59:15Z)
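A recurring ingredient across the papers above is learning a utility from pairwise comparisons. A minimal sketch of this ingredient is a Bradley-Terry maximum-likelihood fit of linear utility weights by gradient ascent; the simulated data, the linear utility, and the plain gradient-ascent fit are illustrative assumptions, standing in for the Bayesian preference models these papers actually use.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated pairwise-comparison data: pairs of outcome vectors, with the
# first item preferred with Bradley-Terry (logistic) probability under a
# hidden linear utility.
true_w = np.array([1.5, -0.5, 1.0])
Y1 = rng.normal(size=(200, 3))
Y2 = rng.normal(size=(200, 3))
p_first = 1 / (1 + np.exp(-(Y1 - Y2) @ true_w))
first_wins = rng.random(200) < p_first

# Maximum-likelihood fit of the utility weights by gradient ascent on the
# Bradley-Terry log-likelihood.
w = np.zeros(3)
d = Y1 - Y2
for _ in range(500):
    p = 1 / (1 + np.exp(-d @ w))         # model P(first preferred)
    grad = d.T @ (first_wins - p) / len(d)
    w += 0.5 * grad

print(w)
```

With a few hundred comparisons the fitted direction recovers the hidden utility closely, which is why pairwise queries are such a data-efficient way to elicit preferences.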
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.