Efficient Personalization of Generative Models via Optimal Experimental Design
- URL: http://arxiv.org/abs/2512.19057v1
- Date: Mon, 22 Dec 2025 05:47:25 GMT
- Title: Efficient Personalization of Generative Models via Optimal Experimental Design
- Authors: Guy Schacht, Ziyad Sheebaelhamd, Riccardo De Santi, Mojmír Mutný, Andreas Krause
- Abstract summary: We formulate the problem of preference query selection as the one that maximizes the information about the underlying latent preference model. We show that this problem has a convex optimization formulation, and introduce a statistically and computationally efficient algorithm ED-PBRL. We empirically demonstrate the proposed framework by personalizing a text-to-image generative model to user-specific styles, showing that it requires fewer preference queries compared to random query selection.
- Score: 31.83801602641749
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Preference learning from human feedback has the ability to align generative models with the needs of end-users. Human feedback is costly and time-consuming to obtain, which creates demand for data-efficient query selection methods. This work presents a novel approach that leverages optimal experimental design to ask humans the most informative preference queries, from which we can elucidate the latent reward function modeling user preferences efficiently. We formulate the problem of preference query selection as the one that maximizes the information about the underlying latent preference model. We show that this problem has a convex optimization formulation, and introduce a statistically and computationally efficient algorithm ED-PBRL that is supported by theoretical guarantees and can efficiently construct structured queries such as images or text. We empirically demonstrate the proposed framework by personalizing a text-to-image generative model to user-specific styles, showing that it requires fewer preference queries compared to random query selection.
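The abstract describes query selection as a convex information-maximization problem. As a rough illustration of that general idea only, and not the paper's ED-PBRL algorithm, the sketch below computes a classical D-optimal design over candidate pairwise queries. It assumes a linear reward in known item features, treats each query as the feature difference of the two items, and ignores the feedback noise model; the names `items`, `Z`, `d_optimal_design`, and `log_det_information` are hypothetical and chosen for this example.

```python
import itertools
import numpy as np


def information_matrix(Z, w):
    """Weighted information matrix sum_k w_k z_k z_k^T."""
    return Z.T @ (w[:, None] * Z)


def log_det_information(Z, w):
    """D-optimality objective: log det of the information matrix."""
    return np.linalg.slogdet(information_matrix(Z, w))[1]


def d_optimal_design(Z, n_iters=500, tol=1e-9):
    """Frank-Wolfe with exact line search for max_w log det(sum_k w_k z_k z_k^T).

    Z holds one candidate query per row; w stays on the probability simplex.
    Assumes the rows of Z span the feature space, so the uniform start is
    nonsingular.
    """
    n, d = Z.shape
    w = np.full(n, 1.0 / n)  # uniform design as the starting point
    for _ in range(n_iters):
        M_inv = np.linalg.inv(information_matrix(Z, w))
        # Gradient of log det with respect to w_k is z_k^T M^{-1} z_k.
        g = np.einsum("kd,de,ke->k", Z, M_inv, Z)
        k = int(np.argmax(g))
        if g[k] <= d + tol:  # Kiefer-Wolfowitz optimality condition
            break
        # Closed-form optimal step toward the vertex e_k.
        step = (g[k] / d - 1.0) / (g[k] - 1.0)
        w = (1.0 - step) * w
        w[k] += step
    return w


# Toy data (an assumption for illustration): 12 items with 4 features each.
rng = np.random.default_rng(0)
items = rng.normal(size=(12, 4))
pairs = list(itertools.combinations(range(len(items)), 2))
Z = np.array([items[i] - items[j] for i, j in pairs])

w_opt = d_optimal_design(Z)
w_uniform = np.full(len(pairs), 1.0 / len(pairs))

# The pairs carrying the most design weight are the ones to ask about first.
top = np.argsort(w_opt)[::-1][:5]
print([pairs[k] for k in top])
print(log_det_information(Z, w_opt), log_det_information(Z, w_uniform))
```

The uniform weights play the role of random query selection here: the design returned by the sketch concentrates weight on a small number of pairs whose feature differences cover the parameter space well.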
Related papers
- Autocorrelated Optimize-via-Estimate: Predict-then-Optimize versus Finite-sample Optimal [2.0228793142608588]
Models that directly optimize for out-of-sample performance in the finite-sample regime have emerged as a promising alternative to traditional estimate-then-optimize approaches. We compare their performance in the context of autocorrelated uncertainties, specifically, under a Vector Autoregressive Moving Average VARMA(p,q) process.
arXiv Detail & Related papers (2026-02-02T09:49:51Z)
- Inference-Time Personalized Alignment with a Few User Preference Queries [24.28598841525897]
We study the problem of aligning a generative model's response with a user's preferences. We propose UserAlign, which elicits the user's preferences with a few queries as pairwise response comparisons.
arXiv Detail & Related papers (2025-11-04T20:07:03Z)
- Personalized Recommendations via Active Utility-based Pairwise Sampling [1.704905100460915]
We propose a utility-based framework that learns preferences from simple and intuitive pairwise comparisons. A central contribution of our work is a novel utility-based active sampling strategy for preference elicitation.
arXiv Detail & Related papers (2025-08-12T19:09:33Z)
- Comparison-based Active Preference Learning for Multi-dimensional Personalization [7.349038301460469]
Large language models (LLMs) have shown remarkable success, but aligning them with human preferences remains a core challenge. Recent studies have explored multi-dimensional personalization, which aims to enable models to generate responses personalized to explicit preferences. We propose Active Multi-dimensional Preference Learning (AMPLe), designed to capture implicit user preferences from interactively collected comparative feedback.
arXiv Detail & Related papers (2024-11-01T11:49:33Z)
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization.
We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons.
Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
arXiv Detail & Related papers (2022-02-28T12:26:47Z)
- Top-N Recommendation with Counterfactual User Preference Simulation [26.597102553608348]
Top-N recommendation, which aims to learn user ranking-based preference, has long been a fundamental problem in a wide range of applications.
In this paper, we propose to reformulate the recommendation task within the causal inference framework to handle the data scarcity problem.
arXiv Detail & Related papers (2021-09-02T14:28:46Z)
- Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits [55.59198568303196]
This work presents LineCoSpar, a human-in-the-loop preference-based framework to learn user preferences in high dimensions.
In simulations and human trials, we empirically verify that LineCoSpar is a sample-efficient approach for high-dimensional preference optimization.
This result has implications for exoskeleton gait synthesis, an active field with applications to clinical use and patient rehabilitation.
arXiv Detail & Related papers (2020-03-13T22:02:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.