Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences?
- URL: http://arxiv.org/abs/2505.13257v2
- Date: Mon, 29 Sep 2025 16:23:06 GMT
- Title: Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences?
- Authors: Zilu Tang, Afra Feyza Akyürek, Ekin Akyürek, Derry Wijaya
- Abstract summary: A popular trend is adding a prefix to the current user's conversation to steer the preference distribution. Most methods passively model personal preferences with prior example preference pairs. We ask whether models benefit from actively inferring preference descriptions. We then test how effective finetuned 1-8B size models are at inferring and aligning to personal preferences.
- Score: 16.12440288407791
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A prominent issue in aligning language models (LMs) to personalized preferences is underspecification -- the lack of information from users about their preferences. A popular way to inject such specification is to add a prefix (e.g., prior relevant conversations) to the current user's conversation to steer the preference distribution. Most methods passively model personal preferences with prior example preference pairs. We ask whether models benefit from actively inferring preference descriptions, and address this question by creating a synthetic personalized alignment dataset based on famous people with known public preferences. We then test how effective finetuned 1-8B size models are at inferring and aligning to personal preferences. Results show that higher-quality active prefixes lead to better generalization, more contextually faithful models, and fewer systematic biases across different protected attributes. All our results suggest active alignment can offer a more controllable and efficient path for personalized alignment.
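The abstract's core contrast, passive conditioning on raw example pairs versus active conditioning on an explicitly inferred description, can be illustrated with a minimal sketch. All names below are hypothetical, and the inference step is a stub standing in for an LM call; this illustrates the setup, not the paper's implementation.

```python
# Minimal sketch: passive vs. active preference prefixes (hypothetical names).

def passive_prefix(preference_pairs):
    """Passive conditioning: prepend prior (chosen, rejected) examples verbatim."""
    lines = []
    for chosen, rejected in preference_pairs:
        lines.append(f"Previously preferred: {chosen}")
        lines.append(f"Previously rejected: {rejected}")
    return "\n".join(lines)

def active_prefix(preference_pairs, infer_description):
    """Active conditioning: first infer a natural-language preference
    description from the pairs, then condition only on that description."""
    description = infer_description(preference_pairs)  # in practice, an LM call
    return f"User preference profile: {description}"

def toy_infer(pairs):
    # Stand-in for the active inference step a finetuned model would perform.
    return "prefers concise, technical answers over verbose ones"

pairs = [("Short, sourced answer.", "Long, rambling answer.")]
print(passive_prefix(pairs))
print(active_prefix(pairs, toy_infer))
```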
Related papers
- Can LLMs Discern the Traits Influencing Your Preferences? Evaluating Personality-Driven Preference Alignment in LLMs [10.04942779683801]
We study personality as a principled "latent" signal behind preference statements. We find that conditioning on personality-aligned preferences substantially improves personalized question answering. We propose a framework that enables an LLM to automatically retrieve personality-aligned preferences and incorporate them during answer generation.
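A minimal sketch of the retrieve-then-generate pattern the summary describes, assuming a Big Five trait vector and a dot-product match score; the trait scheme, preference store, and numbers are invented for illustration and are not the paper's actual method.

```python
# Retrieve personality-aligned preferences, then prepend them to the prompt.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Each stored preference carries an assumed trait vector (openness,
# conscientiousness, extraversion, agreeableness, neuroticism).
preference_store = [
    ("Prefers step-by-step explanations", [0.2, 0.9, 0.1, 0.3, 0.0]),
    ("Prefers witty, informal tone",      [0.7, 0.1, 0.8, 0.4, 0.1]),
    ("Prefers citations for every claim", [0.3, 0.8, 0.0, 0.2, 0.2]),
]

def retrieve_aligned(user_traits, k=2):
    # Rank stored preferences by alignment with the user's trait profile.
    scored = sorted(preference_store, key=lambda p: dot(user_traits, p[1]), reverse=True)
    return [text for text, _ in scored[:k]]

user_traits = [0.2, 0.9, 0.1, 0.4, 0.1]  # conscientious, introverted user
prompt = "Known preferences: " + "; ".join(retrieve_aligned(user_traits)) + "\nQ: ..."
print(prompt)
```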
arXiv Detail & Related papers (2026-02-06T20:37:02Z)
- Towards Understanding Valuable Preference Data for Large Language Model Alignment [85.38864561060088]
Large language model (LLM) alignment is typically achieved through learning from human preference comparisons. We assess data quality through individual influence on validation data using our newly proposed truncated influence function (TIF). We then combine these influence estimates to offset their diverse error sources, resulting in a simple yet effective data selection rule.
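A hedged sketch of influence-based selection with truncation. The first-order gradient-dot-product influence and the fixed clipping threshold are generic stand-ins; the paper's truncated influence function (TIF) is more involved.

```python
# Score training examples by (truncated) influence on validation loss,
# then keep the positively influential ones.

def influence(train_grad, val_grad):
    # First-order approximation: alignment of a training example's gradient
    # with the validation gradient (Hessian term omitted for simplicity).
    return sum(g * v for g, v in zip(train_grad, val_grad))

def truncated_influence(train_grad, val_grad, clip=1.0):
    # Truncation caps extreme scores so a few outliers cannot dominate ranking.
    raw = influence(train_grad, val_grad)
    return max(-clip, min(clip, raw))

val_grad = [0.5, -0.2, 0.1]
train_grads = {
    "pair_a": [0.4, -0.1, 0.2],
    "pair_b": [9.0, 0.0, 0.0],   # outlier gradient, clipped by truncation
    "pair_c": [-0.3, 0.2, 0.0],
}

scores = {k: truncated_influence(g, val_grad) for k, g in train_grads.items()}
selected = [k for k, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True) if s > 0]
print(scores, selected)
```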
arXiv Detail & Related papers (2025-10-15T06:57:55Z)
- PrefPalette: Personalized Preference Modeling with Latent Attributes [59.58648056175468]
PrefPalette is a framework that decomposes preferences into attribute dimensions. It tailors its preference prediction to distinct social community values. PrefPalette outperforms GPT-4o by 46.6% in average prediction accuracy.
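A minimal sketch of attribute-decomposed preference prediction with community-specific weights. The attribute names, weights, and scores below are invented for illustration; PrefPalette learns these rather than hard-coding them.

```python
# Decompose preference prediction into interpretable attribute dimensions,
# combined with community-specific weights.

ATTRIBUTES = ["formality", "empathy", "directness"]

community_weights = {
    "r/askscience":    {"formality": 0.7, "empathy": 0.1, "directness": 0.2},
    "r/relationships": {"formality": 0.1, "empathy": 0.7, "directness": 0.2},
}

def predict_preference(attrs_a, attrs_b, community):
    # Weighted sum of per-attribute scores under the community's values.
    w = community_weights[community]
    score = lambda s: sum(w[a] * s[a] for a in ATTRIBUTES)
    return "A" if score(attrs_a) > score(attrs_b) else "B"

a = {"formality": 0.9, "empathy": 0.2, "directness": 0.6}
b = {"formality": 0.3, "empathy": 0.9, "directness": 0.4}
print(predict_preference(a, b, "r/askscience"))     # A: formal community
print(predict_preference(a, b, "r/relationships"))  # B: empathetic community
```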
arXiv Detail & Related papers (2025-07-17T21:21:54Z)
- NextQuill: Causal Preference Modeling for Enhancing LLM Personalization [82.15961484963256]
We introduce NextQuill, a novel personalization framework grounded in causal preference modeling. Building on this insight, NextQuill introduces two complementary alignment strategies. Experiments across multiple personalization benchmarks demonstrate that NextQuill significantly improves personalization quality.
arXiv Detail & Related papers (2025-06-03T02:08:55Z)
- Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment [21.677859755364334]
Persona-judge is a novel discriminative paradigm that enables training-free personalized alignment with unseen preferences. We show that Persona-judge offers a scalable and computationally efficient solution to personalized alignment.
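One plausible reading of token-level self-judgment, sketched below: a draft distribution proposes each token and a persona-conditioned judge accepts or overrides it. The toy distributions and the acceptance ratio are assumptions, not Persona-judge's actual rule.

```python
def judge_generate(draft_probs, judge_probs, accept_ratio=0.5):
    """Token-level self-judgment: accept the draft token if the persona-
    conditioned judge assigns it at least accept_ratio of the draft's
    probability; otherwise take the judge's own top token."""
    out = []
    for step_draft, step_judge in zip(draft_probs, judge_probs):
        token = max(step_draft, key=step_draft.get)  # draft's greedy proposal
        if step_judge.get(token, 0.0) >= accept_ratio * step_draft[token]:
            out.append(token)                                 # judge accepts
        else:
            out.append(max(step_judge, key=step_judge.get))   # judge overrides
    return out

# Toy per-step distributions; the judge reflects a formal persona.
draft = [{"Hi": 0.6, "Greetings": 0.4}, {"buddy": 0.7, "colleague": 0.3}]
judge = [{"Hi": 0.5, "Greetings": 0.5}, {"buddy": 0.1, "colleague": 0.9}]
print(judge_generate(draft, judge))  # ['Hi', 'colleague']
```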
arXiv Detail & Related papers (2025-04-17T05:50:13Z)
- From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment [41.96246165999026]
Large language models (LLMs) have traditionally been aligned through one-size-fits-all approaches. This paper introduces a comprehensive framework for scalable personalized alignment of LLMs.
arXiv Detail & Related papers (2025-03-19T17:41:46Z)
- Personalized Preference Fine-tuning of Diffusion Models [75.22218338096316]
We introduce PPD, a multi-reward optimization objective that aligns diffusion models with personalized preferences. With PPD, a diffusion model learns the individual preferences of a population of users in a few-shot way. Our approach achieves an average win rate of 76% over Stable Cascade, generating images that more accurately reflect specific user preferences.
arXiv Detail & Related papers (2025-01-11T22:38:41Z)
- Comparison-based Active Preference Learning for Multi-dimensional Personalization [7.349038301460469]
Large language models (LLMs) have shown remarkable success, but aligning them with human preferences remains a core challenge. Recent studies have explored multi-dimensional personalization, which aims to enable models to generate responses personalized to explicit preferences. We propose Active Multi-dimensional Preference Learning (AMPLe), designed to capture implicit user preferences from interactively collected comparative feedback.
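A minimal sketch of active preference learning from pairwise comparisons, assuming a linear utility over preference dimensions, an uncertainty-based query rule, and a Bradley-Terry update; AMPLe's actual acquisition strategy and model may differ.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def utility(w, item):
    # Linear utility over preference dimensions.
    return sum(wi * xi for wi, xi in zip(w, item))

def most_uncertain_pair(w, pairs):
    # Query the pair whose outcome probability is closest to 0.5.
    return min(pairs, key=lambda ab: abs(sigmoid(utility(w, ab[0]) - utility(w, ab[1])) - 0.5))

def update(w, a, b, a_won, lr=0.5):
    # Logistic-loss (Bradley-Terry) gradient step on the observed comparison.
    p = sigmoid(utility(w, a) - utility(w, b))
    grad = (1.0 if a_won else 0.0) - p
    return [wi + lr * grad * (ai - bi) for wi, ai, bi in zip(w, a, b)]

w = [0.0, 0.0]                      # dims: (conciseness, humor)
candidates = [([1.0, 0.0], [0.0, 1.0]), ([0.9, 0.1], [0.8, 0.2])]
a, b = most_uncertain_pair(w, candidates)
w = update(w, a, b, a_won=True)     # the user picked the concise option
print(w)                            # weight shifts toward conciseness
```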
arXiv Detail & Related papers (2024-11-01T11:49:33Z)
- ComPO: Community Preferences for Language Model Personalization [122.54846260663922]
ComPO is a method to personalize preference optimization in language models.
We collect and release ComPRed, a question answering dataset with community-level preferences from Reddit.
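A hedged sketch of community-conditioned preference optimization: a standard DPO loss where log-probabilities would be computed with the source community prepended to the prompt, so the same question can have different winners in different communities. The log-probability values are toy inputs, and the prompt format is an assumption.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Standard DPO: -log sigmoid(beta * (policy margin - reference margin)).
    margin = (logp_chosen - logp_rejected) - (ref_chosen - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

def conditioned_prompt(community, question):
    # Personalization enters by conditioning on the community context.
    return f"[community: {community}] {question}"

prompt = conditioned_prompt("r/askscience", "Why is the sky blue?")
# A real run would score chosen/rejected responses under the policy and
# reference models for `prompt`; toy log-probs are used here instead.
print(prompt, dpo_loss(-2.0, -5.0, -3.0, -4.0))
```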
arXiv Detail & Related papers (2024-10-21T14:02:40Z)
- PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories [3.0102456679931944]
This paper introduces PREDICT, a method designed to enhance the precision and adaptability of inferring preferences.
We evaluate PREDICT on two distinct environments: a gridworld setting and a new text-domain environment.
arXiv Detail & Related papers (2024-10-08T18:16:41Z)
- Unsupervised Human Preference Learning [7.959043497459107]
Large language models demonstrate impressive reasoning abilities but struggle to provide personalized content.
Existing methods, such as in-context learning and parameter-efficient fine-tuning, fall short in capturing the complexity of human preferences.
We propose a novel approach utilizing small parameter models as preference agents to generate natural language rules that guide a larger, pre-trained model.
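A minimal sketch of this preference-agent pattern: a small model distills user history into natural-language rules, and only those rules reach the large model's prompt. Both models are stubbed out here with canned logic.

```python
def small_agent_rules(user_history):
    # Stand-in for a small finetuned preference agent; a real agent would
    # generate rules from raw history rather than match fixed phrases.
    rules = []
    if any("too long" in fb for fb in user_history):
        rules.append("Keep answers under 100 words.")
    if any("jargon" in fb for fb in user_history):
        rules.append("Avoid unexplained jargon.")
    return rules

def large_model_prompt(rules, question):
    # The large, pre-trained model only ever sees the distilled rules.
    guidance = "\n".join(f"- {r}" for r in rules)
    return f"Follow these user-specific rules:\n{guidance}\n\nQ: {question}"

history = ["that answer was too long", "too much jargon in the last reply"]
print(large_model_prompt(small_agent_rules(history), "Explain transformers."))
```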
arXiv Detail & Related papers (2024-09-30T17:51:01Z)
- DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement [53.86523017756224]
We present DegustaBot, an algorithm for visual preference learning that solves household multi-object rearrangement tasks according to personal preference.
We collect a large dataset of naturalistic personal preferences in a simulated table-setting task.
We find that 50% of our model's predictions are likely to be found acceptable by at least 20% of people.
arXiv Detail & Related papers (2024-07-11T21:28:02Z)
- Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization [105.3612692153615]
We propose a new axis based on eliciting preferences jointly over instruction-response pairs. Joint preferences over instruction and response pairs can significantly enhance the alignment of large language models.
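A minimal sketch of the joint elicitation axis: rather than comparing two responses to one instruction, annotate which whole (instruction, response) pair is better. The word-overlap scorer is a toy stand-in for a learned joint scorer.

```python
import math

def pair_score(instruction, response):
    # Toy joint quality score: lexical overlap between instruction and
    # response; a real system would use a model reading both together.
    return len(set(instruction.lower().split()) & set(response.lower().split()))

def joint_preference(pair_a, pair_b):
    # Bradley-Terry probability that pair A is preferred over pair B.
    sa, sb = pair_score(*pair_a), pair_score(*pair_b)
    return 1.0 / (1.0 + math.exp(-(sa - sb)))

a = ("Summarize the meeting notes", "The meeting notes cover three decisions...")
b = ("Write a haiku", "Here is a long essay about haikus in general...")
print(joint_preference(a, b))  # > 0.5: pair A is the better-aligned pair
```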
arXiv Detail & Related papers (2024-03-31T02:05:40Z)
- Dissecting Human and LLM Preferences [80.55271307662365]
We find that humans are less sensitive to errors, favor responses that support their stances, and show clear dislike when models admit their limits.
In contrast, advanced LLMs like GPT-4-Turbo emphasize correctness, clarity, and harmlessness more.
We show that preference-based evaluation can be intentionally manipulated.
arXiv Detail & Related papers (2024-02-17T14:34:31Z)
- Personalized Language Modeling from Personalized Human Feedback [45.16986573937782]
Personalized large language models (LLMs) are designed to tailor responses to individual user preferences. We propose Personalized-RLHF (P-RLHF), an efficient framework that utilizes a lightweight user model to capture individual user preferences. We show that personalized LLMs trained using P-RLHF generate responses that are more closely aligned with individual user preferences.
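A minimal sketch of conditioning rewards on a lightweight user model: a per-user vector combined with shared response features, so the same response scores differently across users. The dimensions and the combination rule are illustrative assumptions, not P-RLHF's exact design.

```python
# Per-user embeddings over assumed response features (brevity, detail).
user_embeddings = {
    "alice": [1.0, 0.0],   # prefers brevity
    "bob":   [0.0, 1.0],   # prefers detail
}

def personalized_reward(user_id, response_features):
    # The lightweight user model reweights shared features per user.
    u = user_embeddings[user_id]
    return sum(ui * fi for ui, fi in zip(u, response_features))

short_answer = [0.9, 0.1]   # (brevity, detail) feature scores
long_answer  = [0.1, 0.9]
print(personalized_reward("alice", short_answer) > personalized_reward("alice", long_answer))  # True
print(personalized_reward("bob", short_answer) > personalized_reward("bob", long_answer))      # False
```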
arXiv Detail & Related papers (2024-02-06T04:18:58Z) - Personalized Soups: Personalized Large Language Model Alignment via
Post-hoc Parameter Merging [148.77027765872006]
We study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem.
LLMs are aligned to multiple preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.
We show that we can achieve personalized alignment by decomposing preferences into multiple dimensions.
arXiv Detail & Related papers (2023-10-17T20:22:13Z)
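The post-hoc merging idea from the last entry admits a compact sketch: policies finetuned for single preference dimensions are combined by weighted parameter averaging, with per-user weights choosing the mix. Flat lists stand in for full weight tensors here.

```python
def merge(policies, user_weights):
    # Weighted average of parameter vectors, one policy per preference dim.
    n = len(next(iter(policies.values())))
    merged = [0.0] * n
    for dim, params in policies.items():
        w = user_weights.get(dim, 0.0)
        merged = [m + w * p for m, p in zip(merged, params)]
    return merged

policies = {
    "concise":  [1.0, 0.0, 0.2],   # policy finetuned for conciseness
    "friendly": [0.0, 1.0, 0.4],   # policy finetuned for friendliness
}
# A user who mostly wants concise answers with a little warmth:
print(merge(policies, {"concise": 0.8, "friendly": 0.2}))  # [0.8, 0.2, 0.24]
```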