Related papers: Aligning LLMs with Individual Preferences via Interaction

Aligning LLMs with Individual Preferences via Interaction

URL: http://arxiv.org/abs/2410.03642v1
Date: Fri, 4 Oct 2024 17:48:29 GMT
Title: Aligning LLMs with Individual Preferences via Interaction
Authors: Shujin Wu, May Fung, Cheng Qian, Jeonghwan Kim, Dilek Hakkani-Tur, Heng Ji,
Abstract summary: We train large language models (LLMs) that can ''interact to align'' We develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. For evaluation, we establish the ALOE benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure the customized alignment performance during conversations.
Score: 51.72200436159636
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As large language models (LLMs) demonstrate increasingly advanced capabilities, aligning their behaviors with human values and preferences becomes crucial for their wide adoption. While previous research focuses on general alignment to principles such as helpfulness, harmlessness, and honesty, the need to account for individual and diverse preferences has been largely overlooked, potentially undermining customized human experiences. To address this gap, we train LLMs that can ''interact to align'', essentially cultivating the meta-skill of LLMs to implicitly infer the unspoken personalized preferences of the current user through multi-turn conversations, and then dynamically align their following behaviors and responses to these inferred preferences. Our approach involves establishing a diverse pool of 3,310 distinct user personas by initially creating seed examples, which are then expanded through iterative self-generation and filtering. Guided by distinct user personas, we leverage multi-LLM collaboration to develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. Finally, we apply supervised fine-tuning and reinforcement learning to enhance LLMs using this dataset. For evaluation, we establish the ALOE (ALign With CustOmized PrEferences) benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure the customized alignment performance during conversations. Experimental results demonstrate the effectiveness of our method in enabling dynamic, personalized alignment via interaction.

Related papers

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions [50.70965714314064]
Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions.<n>This work proposes RealPref, a benchmark for evaluating realistic preference-following in personalized user-LLM interactions.
arXiv Detail & Related papers (2026-03-04T15:42:43Z)
Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym.<n>Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process.<n>We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions [39.554239954719876]
CUPID is a benchmark of 756 human-curated interaction session histories.<n>We evaluate 10 open and proprietary Large Language Models (LLMs)<n>Our work highlights the need to advance LLM capabilities for more contextually personalized interactions.
arXiv Detail & Related papers (2025-08-03T09:04:48Z)
A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations [112.81207927088117]
PersonaConvBench is a benchmark for evaluating personalized reasoning and generation in multi-turn conversations with large language models (LLMs)<n>We benchmark several commercial and open-source LLMs under a unified prompting setup and observe that incorporating personalized history yields substantial performance improvements.
arXiv Detail & Related papers (2025-05-20T09:13:22Z)
HyPerAlign: Hypotheses-driven Personalized Alignment [24.67727411391369]
We propose a hypotheses-driven personalization approach (HyPerAlign) for large language models (LLMs) For deliberative alignment, the helpfulness of LLM models is improved by up to $70%$ on average. For authorship attribution, results indicate consistently high win-rates (commonly $>90%$) against state-of-the-art preference fine-tuning approaches.
arXiv Detail & Related papers (2025-04-29T18:01:46Z)
Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization [68.79814761867314]
We propose Difference-aware Personalization Learning (DPL) to enhance Large Language Models (LLMs) personalization. DPL strategically selects representative users for comparison and establishes a structured standard to extract task-relevant differences. Experiments on real-world datasets demonstrate that DPL significantly enhances LLM personalization.
arXiv Detail & Related papers (2025-03-04T09:53:26Z)
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization, which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
arXiv Detail & Related papers (2025-02-26T17:08:46Z)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
LLM-assisted Explicit and Implicit Multi-interest Learning Framework for Sequential Recommendation [50.98046887582194]
We propose an explicit and implicit multi-interest learning framework to model user interests on two levels: behavior and semantics. The proposed EIMF framework effectively and efficiently combines small models with LLM to improve the accuracy of multi-interest modeling.
arXiv Detail & Related papers (2024-11-14T13:00:23Z)
Active Preference-based Learning for Multi-dimensional Personalization [7.349038301460469]
Large language models (LLMs) have shown remarkable versatility across tasks, but aligning them with individual human preferences remains challenging. We propose an active preference learning framework that uses binary feedback to estimate user preferences across multiple objectives. We validate our approach through theoretical analysis and experiments on language generation tasks, demonstrating its feedback efficiency and effectiveness in personalizing model responses.
arXiv Detail & Related papers (2024-11-01T11:49:33Z)
MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time [50.41806216615488]
Large Language Models (LLMs) acquire extensive knowledge and remarkable abilities from extensive text corpora. To make LLMs more usable, aligning them with human preferences is essential. We propose an effective method, textbf MetaAlign, which aims to help LLMs dynamically align with various explicit or implicit preferences specified at inference time.
arXiv Detail & Related papers (2024-10-18T05:31:13Z)
PersonalLLM: Tailoring LLMs to Individual Preferences [11.717169516971856]
We present a public benchmark, PersonalLLM, focusing on adapting LLMs to provide maximal benefits for a particular user. We curate open-ended prompts paired with many high-quality answers over which users would be expected to display heterogeneous latent preferences. Our dataset and generated personalities offer an innovative testbed for developing personalization algorithms.
arXiv Detail & Related papers (2024-09-30T13:55:42Z)
Personality Alignment of Large Language Models [26.071445846818914]
Current methods for aligning large language models (LLMs) typically aim to reflect general human values and behaviors. We introduce the concept of Personality Alignment. This approach tailors LLMs' responses and decisions to match the specific preferences of individual users or closely related groups.
arXiv Detail & Related papers (2024-08-21T17:09:00Z)
Few-shot Personalization of LLMs with Mis-aligned Responses [40.0349773257245]
This paper proposes a new approach for a few-shot personalization of large language models (LLMs) Our key idea is to learn a set of personalized prompts for each user by progressively improving the prompts using LLMs. During an iterative process of prompt improvement, we incorporate the contexts of mis-aligned responses by LLMs.
arXiv Detail & Related papers (2024-06-26T18:29:12Z)
Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search [9.243535345193711]
Our method uses large language models to guide a single human worker in generating personalized dialogues. LAPS can collect large-scale, human-written, multi-session, and multi-domain conversations. Our results show that responses generated explicitly using extracted preferences better match user's actual preferences.
arXiv Detail & Related papers (2024-05-06T13:53:03Z)
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment [103.12563033438715]
Alignment in artificial intelligence pursues consistency between model responses and human preferences as well as values. Existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives.
arXiv Detail & Related papers (2024-02-29T12:12:30Z)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts. RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z)
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging [148.77027765872006]
We study Reinforcement Learning from Personalized Human Feedback (RLPHF) problem. LLMs are aligned to multiple preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem. We show that we can achieve personalized alignment by decomposing preferences into multiple dimensions.
arXiv Detail & Related papers (2023-10-17T20:22:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.