Related papers: Consecutive Preferential Bayesian Optimization

URL: http://arxiv.org/abs/2511.05163v1
Date: Fri, 07 Nov 2025 11:30:36 GMT
Title: Consecutive Preferential Bayesian Optimization
Authors: Aras Erarslan, Carlos Sevilla Salcedo, Ville Tanskanen, Anni Nisov, Eero Päiväkumpu, Heikki Aisala, Kaisu Honkapää, Arto Klami, Petrus Mikkola
Abstract summary: We generalize preference-based optimization to account for production and evaluation costs. We empirically demonstrate a notable increase in accuracy in setups with high production costs or with indifference feedback.
Score: 5.048216954459151
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Preferential Bayesian optimization allows optimization of objectives that are either expensive or difficult to measure directly, by relying on a minimal number of comparative evaluations done by a human expert. Generating candidate solutions for evaluation is also often expensive, but this cost is ignored by existing methods. We generalize preference-based optimization to explicitly account for production and evaluation costs with Consecutive Preferential Bayesian Optimization, reducing production cost by constraining comparisons to involve previously generated candidates. We also account for the perceptual ambiguity of the oracle providing the feedback by incorporating a Just-Noticeable Difference threshold into a probabilistic preference model to capture indifference to small utility differences. We adapt an information-theoretic acquisition strategy to this setting, selecting new configurations that are most informative about the unknown optimum under a preference model accounting for the perceptual ambiguity. We empirically demonstrate a notable increase in accuracy in setups with high production costs or with indifference feedback.
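The abstract mentions a Just-Noticeable Difference threshold inside a probabilistic preference model. The block below is a minimal sketch of one common way to build such a three-outcome model (a probit likelihood with an indifference band). It is not taken from the paper: the function names, the Gaussian noise assumption, and the parameters `jnd` and `noise` are all hypothetical and chosen only for illustration.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probs(u_a: float, u_b: float, jnd: float = 0.5, noise: float = 1.0):
    """Return (P(a preferred), P(b preferred), P(indifferent)).

    Assumption (not from the paper): the oracle perceives the utility
    difference u_a - u_b plus Gaussian noise with standard deviation
    `noise`, and reports a preference only when the perceived difference
    exceeds the threshold `jnd` in absolute value.
    """
    if jnd < 0 or noise <= 0:
        raise ValueError("jnd must be non-negative and noise must be positive")
    diff = u_a - u_b
    p_a = std_normal_cdf((diff - jnd) / noise)   # perceived difference > jnd
    p_b = std_normal_cdf((-diff - jnd) / noise)  # perceived difference < -jnd
    p_tie = max(1.0 - p_a - p_b, 0.0)            # remaining mass; clipped against rounding
    return p_a, p_b, p_tie


if __name__ == "__main__":
    for label, probs in [
        ("clear gap", preference_probs(1.5, 0.0)),
        ("small gap", preference_probs(0.1, 0.0)),
    ]:
        print(label, [round(p, 3) for p in probs])
```

Under these assumptions, a small utility gap puts more probability on the indifferent outcome, and setting `jnd=0` recovers an ordinary two-outcome probit preference model.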

Related papers

Autocorrelated Optimize-via-Estimate: Predict-then-Optimize versus Finite-sample Optimal [2.0228793142608588]
Models that directly optimize for out-of-sample performance in the finite-sample regime have emerged as a promising alternative to traditional estimate-then-optimize approaches. We compare their performance in the context of autocorrelated uncertainties, specifically, under a Vector Autoregressive Moving Average VARMA(p,q) process.
arXiv Detail & Related papers (2026-02-02T09:49:51Z)
Cost-Optimal Active AI Model Evaluation [71.2069549142394]
Development of generative AI systems requires continual evaluation, data acquisition, and annotation. We develop novel, cost-aware methods for actively balancing the use of a cheap, but often inaccurate, weak rater. We derive a family of cost-optimal policies for allocating a given annotation budget between weak and strong raters.
arXiv Detail & Related papers (2025-06-09T17:14:41Z)
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts [51.74394601039711]
Preference alignment methods are increasingly critical for steering large language models to generate outputs consistent with human values. We propose a novel distribution-aware optimization framework that improves preference alignment despite such shifts.
arXiv Detail & Related papers (2025-04-08T09:14:38Z)
Dissecting the Impact of Model Misspecification in Data-driven Optimization [20.35205476800932]
Data-driven optimization aims to translate a machine learning model into decision-making by optimizing decisions on estimated costs. A more recent approach uses estimation-optimization integration that minimizes decision error instead of estimation error. We show how the integrated approach offers a "universal double benefit" on the top two dominating terms of regret when the underlying model is misspecified.
arXiv Detail & Related papers (2025-03-01T21:31:54Z)
Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models [79.84205827056907]
We present Self-Steering Optimization ($SSO$), an algorithm that autonomously generates high-quality preference data. $SSO$ employs a specialized optimization objective to build a data generator from the policy model itself, which is used to produce accurate and on-policy data. Our evaluation shows that $SSO$ consistently outperforms baselines in human preference alignment and reward optimization.
arXiv Detail & Related papers (2024-10-22T16:04:03Z)
$i$REPO: $i$mplicit Reward Pairwise Difference based Empirical Preference Optimization [12.266207199002604]
Large Language Models (LLMs) can sometimes produce outputs that deviate from human expectations. We propose a novel framework named $i$REPO, which utilizes implicit Reward pairwise difference regression for Empirical Preference Optimization. We show that $i$REPO effectively achieves self-alignment using soft-label, self-generated responses and the logit of empirical AI annotators.
arXiv Detail & Related papers (2024-05-24T05:42:11Z)
Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime [59.27851754647913]
Predictive optimization is the precise modeling of many real-world applications, including energy cost-aware scheduling and budget allocation on advertising. We develop a modular framework to benchmark 11 existing PtO/PnO methods on 8 problems, including a new industrial dataset for advertising. Our study shows that PnO approaches are better than PtO on 7 out of 8 benchmarks, but there is no silver bullet found for the specific design choices of PnO.
arXiv Detail & Related papers (2023-11-13T13:19:34Z)
Rate-Optimal Policy Optimization for Linear Markov Decision Processes [65.5958446762678]
We obtain rate-optimal $\widetilde{O}(\sqrt{K})$ regret where $K$ denotes the number of episodes. Our work is the first to establish the optimal (w.r.t. $K$) rate of convergence in the setting with bandit feedback. No algorithm with an optimal rate guarantee is currently known.
arXiv Detail & Related papers (2023-08-28T15:16:09Z)
SnAKe: Bayesian Optimization with Pathwise Exploration [9.807656882149319]
We consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations. This paper investigates the problem and introduces 'Sequential Bayesian Optimization via Adaptive Connecting Samples' (SnAKe). It provides a solution by considering future queries and preemptively building optimization paths that minimize input costs.
arXiv Detail & Related papers (2022-01-31T19:42:56Z)
Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs [28.254408148839644]
We propose a non-myopic acquisition function that generalizes classical expected improvement to the setting of heterogeneous evaluation costs. Our acquisition function outperforms existing methods in a variety of synthetic and real problems.
arXiv Detail & Related papers (2021-11-12T02:18:26Z)
Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions [74.00030431081751]
We formalize the notion of user-specific cost functions and introduce a new method for identifying actionable recourses for users. Our method satisfies up to 25.89 percentage points more users compared to strong baseline methods.
arXiv Detail & Related papers (2021-11-01T19:49:35Z)
Fast Rates for Contextual Linear Optimization [52.39202699484225]
We show that a naive plug-in approach achieves regret convergence rates that are significantly faster than methods that directly optimize downstream decision performance. Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.
arXiv Detail & Related papers (2020-11-05T18:43:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.