Aligning LLMs by Predicting Preferences from User Writing Samples
- URL: http://arxiv.org/abs/2505.23815v1
- Date: Tue, 27 May 2025 20:20:20 GMT
- Title: Aligning LLMs by Predicting Preferences from User Writing Samples
- Authors: Stéphane Aroca-Ouellette, Natalie Mackraz, Barry-John Theobald, Katherine Metcalf
- Abstract summary: This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER by 33%.
- Score: 2.357769830358414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accommodating human preferences is essential for creating aligned LLM agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs acting as writing agents to infer a description of user preferences. Agent alignment then comes from conditioning on the inferred preference description. However, existing methods often produce generic preference descriptions that fail to capture the unique and individualized nature of human preferences. This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. PROSE incorporates two key elements: (1) iterative refinement of inferred preferences, and (2) verification of inferred preferences across multiple user writing samples. We evaluate PROSE with several LLMs (i.e., Qwen2.5 7B and 72B Instruct, GPT-mini, and GPT-4o) on a summarization and an email writing task. We find that PROSE more accurately infers nuanced human preferences, improving the quality of the writing agent's generations over CIPHER (a state-of-the-art method for inferring preferences) by 33%. Lastly, we demonstrate that in-context learning (ICL) and PROSE are complementary methods, and combining them provides up to a 9% improvement over ICL alone.
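The abstract names PROSE's two components but not how they fit together. Below is a minimal sketch of how iterative refinement and cross-sample verification might compose, assuming only a generic `llm(prompt) -> str` callable; the prompt wording, the `infer_preferences` helper, and the majority-vote threshold are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of a PROSE-style loop. `llm` is any prompt-in, text-out
# callable; all prompt wording below is hypothetical.

def infer_preferences(llm, writing_samples, max_iters=5):
    # Seed a preference description from the first writing sample.
    description = llm(
        "Describe the writing preferences evident in this sample:\n"
        + writing_samples[0]
    )

    # (1) Iterative refinement: generate with the current description,
    # compare against a real user sample, and revise the description.
    for sample in writing_samples[1:max_iters]:
        draft = llm(
            f"Write in the user's style.\nPreferences: {description}\n"
            f"Cover the same content as: {sample}"
        )
        description = llm(
            "Revise the preference description so future drafts better "
            f"match the user sample.\nPreferences: {description}\n"
            f"Draft: {draft}\nUser sample: {sample}"
        )

    # (2) Verification: keep a preference only if most samples support it
    # (the majority-vote threshold is an assumption, not from the paper).
    verified = []
    for pref in description.splitlines():
        votes = sum(
            "yes" in llm(
                f"Does this sample exhibit the preference '{pref}'? "
                f"Answer yes or no.\n{s}"
            ).lower()
            for s in writing_samples
        )
        if pref.strip() and votes > len(writing_samples) // 2:
            verified.append(pref)
    return "\n".join(verified)
```

The returned description would then be used to condition the writing agent's prompt; combining it with ICL, as the abstract suggests, would additionally place raw user samples in context.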
Related papers
- Debiasing Online Preference Learning via Preference Feature Preservation [64.55924745257951]
Recent preference learning frameworks simplify human preferences with binary pairwise comparisons and scalar rewards. This can bias large language models' responses toward the majority-preferred features, an effect that is exacerbated over successive iterations of online preference learning. We propose Preference Feature Preservation to maintain the distribution of human preference features and utilize such rich signals throughout the online preference learning process.
arXiv Detail & Related papers (2025-06-06T13:19:07Z)
- HyPerAlign: Interpretable Personalized LLM Alignment via Hypothesis Generation [24.67727411391369]
HyPerAlign is an interpretable and sample-efficient hypothesis-driven personalization approach for large language models. We conduct experiments on two different personalization tasks, namely authorship attribution and deliberative alignment. Results demonstrate the superiority of hypothesis-driven personalization compared to preference-based fine-tuning methods.
arXiv Detail & Related papers (2025-04-29T18:01:46Z)
- FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization, which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
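The summary describes the adaptation-time interface: a few labeled preferences from one user induce a personalized reward function. FSPO itself meta-trains the LLM to do this; the prompt-based stand-in below only illustrates that interface, with the prompt format and the 1-10 scoring rule as assumptions.

```python
# Illustrative stand-in for a personalized reward interface in the spirit
# of FSPO. The real method meta-trains the LLM; this only shows the
# few-shot usage pattern. Prompt text and scoring scale are assumptions.

def personalized_reward(llm, user_pairs, prompt, candidate):
    # user_pairs: list of (chosen, rejected) responses labeled by one user.
    shots = "\n\n".join(
        f"Preferred: {chosen}\nRejected: {rejected}"
        for chosen, rejected in user_pairs
    )
    reply = llm(
        "Here are response pairs labeled by one user:\n" + shots +
        f"\n\nPrompt: {prompt}\nCandidate response: {candidate}\n"
        "On a scale of 1-10, how well does the candidate match this "
        "user's taste? Reply with a number only."
    )
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # non-numeric reply; treat as uninformative
```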
arXiv Detail & Related papers (2025-02-26T17:08:46Z)
- ULMRec: User-centric Large Language Model for Sequential Recommendation [16.494996929730927]
We propose ULMRec, a framework that integrates user personalized preferences into Large Language Models. Extensive experiments on two public datasets demonstrate that ULMRec significantly outperforms existing methods.
arXiv Detail & Related papers (2024-12-07T05:37:00Z)
- MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time [50.41806216615488]
Large Language Models (LLMs) acquire extensive knowledge and remarkable abilities from vast text corpora.
To make LLMs more usable, aligning them with human preferences is essential.
We propose an effective method, MetaAlign, which aims to help LLMs dynamically align with various explicit or implicit preferences specified at inference time.
arXiv Detail & Related papers (2024-10-18T05:31:13Z)
- PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories [3.0102456679931944]
This paper introduces PREDICT, a method designed to enhance the precision and adaptability of inferring preferences.
We evaluate PREDICT on two distinct environments: a gridworld setting and a new text-domain environment.
arXiv Detail & Related papers (2024-10-08T18:16:41Z)
- Orchestrating LLMs with Different Personalizations [28.344891363780576]
This paper presents a novel approach to aligning large language models (LLMs) with individual human preferences.
Given stated preferences along multiple dimensions, such as helpfulness, conciseness, or humor, the goal is to create, without re-training, an LLM that best adheres to this specification.
Starting from specialized expert LLMs, each trained for one particular preference dimension, we propose a black-box method that merges their outputs on a per-token level.
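The summary commits only to merging the experts' outputs per token. One minimal reading, assuming each black-box expert exposes a next-token probability distribution over a shared vocabulary, is a weighted mixture like the sketch below; the mixing rule and greedy decoding are assumptions, not the paper's exact scheme.

```python
import numpy as np

def merge_next_token(experts, token_ids, weights):
    # Each expert maps a token-id context to a next-token probability
    # vector over a shared vocabulary (black-box access suffices).
    probs = np.stack([expert(token_ids) for expert in experts])  # (k, V)
    mixed = np.average(probs, axis=0, weights=weights)           # convex mix
    mixed /= mixed.sum()                                         # renormalize
    return int(np.argmax(mixed))                                 # greedy pick

def generate(experts, weights, prompt_ids, max_new_tokens=64, eos_id=None):
    # Decode one token at a time, re-merging the experts at every step so
    # the user's preference weights steer the whole generation.
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        nxt = merge_next_token(experts, ids, weights)
        ids.append(nxt)
        if nxt == eos_id:
            break
    return ids
```

Under this reading, raising the weight on, say, a conciseness expert shifts every decoding step toward that dimension without retraining any model.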
arXiv Detail & Related papers (2024-07-04T22:55:02Z)
- Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment [72.99676237703099]
We propose a new framework that boosts the alignment of large language models with human preferences. Our key idea is leveraging the human prior knowledge within the small (seed) data. We introduce a noise-aware preference learning algorithm to mitigate the risk of low-quality generated preference data.
arXiv Detail & Related papers (2024-06-06T18:01:02Z)
- Aligning LLM Agents by Learning Latent Preference from User Edits [23.235995078727658]
We study interactive learning of language agents based on user edits made to the agent's output.
We propose a learning framework, PRELUDE, that infers a description of the user's latent preference based on historic edit data.
We introduce two interactive environments, summarization and email writing, and use a GPT-4 simulated user for evaluation.
arXiv Detail & Related papers (2024-04-23T17:57:47Z)
- Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization [105.3612692153615]
We propose a new axis based on eliciting preferences jointly over instruction-response pairs. Joint preferences over instruction and response pairs can significantly enhance the alignment of large language models.
arXiv Detail & Related papers (2024-03-31T02:05:40Z)
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts.
RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z)