Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations
- URL: http://arxiv.org/abs/2503.08035v1
- Date: Tue, 11 Mar 2025 04:32:54 GMT
- Title: Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations
- Authors: Ishani Mondal, Jack W. Stokes, Sujay Kumar Jauhar, Longqi Yang, Mengting Wan, Xiaofeng Xu, Xia Song, Jennifer Neville
- Abstract summary: Group Preference Alignment identifies context-specific variations in conversational preferences across user groups. The framework significantly improves alignment of model outputs with user preferences and outperforms baseline methods.
- Score: 36.29709573877113
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: LLMs often fail to meet the specialized needs of distinct user groups due to their one-size-fits-all training paradigm \cite{lucy-etal-2024-one}, and there is limited research on which personalization aspects each group expects. To address these limitations, we propose a group-aware personalization framework, Group Preference Alignment (GPA), that identifies context-specific variations in conversational preferences across user groups and then steers LLMs to address those preferences. Our approach consists of two steps: (1) Group-Aware Preference Extraction, where maximally divergent user-group preferences are extracted from real-world conversation logs and distilled into interpretable rubrics, and (2) Tailored Response Generation, which leverages these rubrics through two methods: a) Context-Tuned Inference (GPA-CT), which dynamically adjusts responses via context-dependent prompt instructions, and b) Rubric-Finetuning Inference (GPA-FT), which uses the rubrics to generate contrastive synthetic data for personalization of group-specific models via alignment. Experiments demonstrate that our framework significantly improves alignment of outputs with user preferences and outperforms baseline methods, while maintaining robust performance on standard benchmarks.
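The abstract describes GPA-CT as steering the model with rubric-derived, context-dependent prompt instructions. A minimal sketch of how such rubric-conditioned prompting could look, assuming a hypothetical rubric store and prompt-composition helper (the paper's actual rubric format and extraction pipeline are not shown here):

```python
# Minimal sketch of rubric-conditioned inference in the spirit of GPA-CT.
# The rubric contents and build_steered_prompt are illustrative assumptions,
# not the paper's actual implementation.

# Hypothetical rubrics: interpretable, group- and context-specific preferences
# distilled from conversation logs (step 1 of the framework).
RUBRICS = {
    ("developers", "coding"): [
        "Lead with a runnable code snippet, then explain it briefly.",
        "Prefer concise, technical language over analogies.",
    ],
    ("educators", "coding"): [
        "Explain concepts step by step before showing any code.",
        "Define jargon on first use.",
    ],
}

def build_steered_prompt(group: str, context: str, user_query: str) -> list[dict]:
    """Compose context-dependent system instructions from the group's rubric."""
    rubric = RUBRICS.get((group, context), [])
    system = "Follow these response preferences:\n" + "\n".join(
        f"- {item}" for item in rubric
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]

if __name__ == "__main__":
    messages = build_steered_prompt(
        "developers", "coding", "How do I reverse a linked list?"
    )
    for m in messages:
        print(f"[{m['role']}]\n{m['content']}\n")
```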
Related papers
- Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment [74.25832963097658]
Multi-Objective Alignment (MOA) aims to align responses with multiple human preference objectives.
We find that DPO-based MOA approaches suffer from widespread preference conflicts in the data.
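The entry refers to DPO-based multi-objective alignment. For reference, a minimal sketch of the standard DPO objective that such approaches apply per preference objective (simplified; each log-probability stands in for a sum over response tokens):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss on a batch of (chosen, rejected) response pairs.

    Each argument is a tensor of summed token log-probabilities per response.
    Conflicts arise in MOA when different objectives label opposite responses
    as 'chosen' for the same prompt.
    """
    chosen_rewards = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_rewards = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities.
torch.manual_seed(0)
lp = torch.randn(4)
print(dpo_loss(lp, lp - 1.0, lp - 0.1, lp - 0.9).item())
```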
arXiv Detail & Related papers (2025-02-20T08:27:00Z) - A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts. With these synthetic prompts, we apply two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS). Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements. High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
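A minimal sketch of the rejection-sampling (RS) curation step mentioned above, with a stubbed reward scorer standing in for the paper's actual judge; keeping the best- and worst-scored samples yields the high-contrast pairs the entry discusses:

```python
import random

def score(response: str) -> float:
    """Stub reward model; in practice a learned judge scores each response."""
    return len(set(response.split())) + random.random()

def rs_preference_pair(prompt: str, sampler, n: int = 8):
    """Rejection sampling: draw n candidates, keep best/worst as chosen/rejected."""
    candidates = [sampler(prompt) for _ in range(n)]
    ranked = sorted(candidates, key=score, reverse=True)
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}

# Toy sampler standing in for LLM generation.
random.seed(0)
CANNED = ["short answer", "a much longer and more detailed answer",
          "an answer", "a detailed, careful, correct answer"]
pair = rs_preference_pair("Explain recursion.", lambda p: random.choice(CANNED))
print(pair)
```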
arXiv Detail & Related papers (2024-12-18T15:38:39Z) - Beyond the Binary: Capturing Diverse Preferences With Reward Regularization [15.518838657050173]
We argue that this reliance on binary choices does not capture the broader, aggregate preferences of the target user in real-world tasks. We introduce a simple yet effective method that augments existing binary preference datasets with synthetic preference judgments to estimate potential user disagreement.
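One way to read the proposed augmentation is as replacing hard binary labels with estimated agreement rates when training the reward model. A hedged sketch of a soft-label Bradley-Terry loss under that assumption (the paper's exact regularizer may differ):

```python
import torch
import torch.nn.functional as F

def soft_bt_loss(reward_chosen, reward_rejected, agreement):
    """Bradley-Terry loss with soft targets.

    `agreement` is the estimated probability that users prefer the 'chosen'
    response (e.g., from synthetic preference judgments) rather than a hard 1.0.
    """
    logits = reward_chosen - reward_rejected
    return F.binary_cross_entropy_with_logits(logits, agreement)

rc = torch.tensor([2.0, 0.5])
rr = torch.tensor([1.0, 0.4])
agree = torch.tensor([0.9, 0.55])  # near-ties get soft targets, not 1.0
print(soft_bt_loss(rc, rr, agree).item())
```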
arXiv Detail & Related papers (2024-12-05T02:35:46Z) - Unleashing the Power of Large Language Models for Group POI Recommendations [39.49785677738477]
Group Point-of-Interest (POI) recommendations aim to predict the next POI that satisfies the diverse preferences of a group of users.
Existing methods for group POI recommendations rely on single ID-based features from check-in data.
We propose a framework that unleashes the power of Large Language Models (LLMs) for context-aware group POI recommendations.
arXiv Detail & Related papers (2024-11-20T16:02:14Z) - GPRec: Bi-level User Modeling for Deep Recommenders [45.38687843911628]
GPRec explicitly categorizes users into groups in a learnable manner and aligns them with corresponding group embeddings.
On the individual level, GPRec identifies personal preferences from ID-like features and refines the obtained individual representations to be independent of group ones.
Rigorous testing of GPRec on three public datasets has demonstrated significant improvements in recommendation quality.
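A minimal sketch of the bi-level idea as described (a soft, learnable group assignment plus a penalty pushing individual representations to be independent of group ones); the module structure is an assumption for illustration, not GPRec's actual architecture:

```python
import torch
import torch.nn as nn

class BiLevelUserModel(nn.Module):
    """Toy bi-level user modeling: group-level + group-independent individual."""

    def __init__(self, n_users, n_groups, dim):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.group_emb = nn.Embedding(n_groups, dim)
        self.assign = nn.Linear(dim, n_groups)  # learnable group assignment

    def forward(self, user_ids):
        u = self.user_emb(user_ids)
        weights = torch.softmax(self.assign(u), dim=-1)   # soft grouping
        g = weights @ self.group_emb.weight               # group representation
        # Independence penalty: discourage overlap between individual and group.
        indep = torch.cosine_similarity(u, g, dim=-1).abs().mean()
        return u, g, indep

model = BiLevelUserModel(n_users=100, n_groups=4, dim=16)
u, g, indep = model(torch.tensor([0, 1, 2]))
print(u.shape, g.shape, indep.item())
```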
arXiv Detail & Related papers (2024-10-28T04:49:05Z) - Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization [105.3612692153615]
We propose a new axis based on eliciting preferences jointly over instruction-response pairs. Joint preferences over instruction and response pairs can significantly enhance the alignment of large language models.
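A sketch of what a preference elicited jointly over instruction-response pairs could look like as data, contrasted with the usual same-prompt comparison; the field names and flattening scheme are illustrative assumptions:

```python
# Conventional preference: two responses to the SAME instruction.
conditional_pref = {
    "instruction": "Summarize this article.",
    "chosen": "A faithful two-sentence summary...",
    "rejected": "An off-topic ramble...",
}

# Joint preference (the proposed axis): whole (instruction, response) pairs
# are compared, so the two instructions may differ.
joint_pref = {
    "chosen": {
        "instruction": "Summarize this article.",
        "response": "A faithful two-sentence summary...",
    },
    "rejected": {
        "instruction": "Write a limerick about the article.",
        "response": "A limerick that misstates the facts...",
    },
}

def to_dpo_example(joint):
    """Flatten a joint preference into (sequence_w, sequence_l) so a DPO-style
    objective can score full instruction+response sequences."""
    fmt = lambda p: p["instruction"] + "\n" + p["response"]
    return fmt(joint["chosen"]), fmt(joint["rejected"])

print(to_dpo_example(joint_pref)[0])
```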
arXiv Detail & Related papers (2024-03-31T02:05:40Z) - Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts.
RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
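RPO contrasts responses drawn from identical and related prompts. A hedged sketch of that core idea, down-weighting contrastive pairs whose prompts are only loosely related (the embedding inputs and the exponential weighting scheme are assumptions, not RPO's exact formulation):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rpo_pair_weights(prompt_embs_w, prompt_embs_l, temperature=0.5):
    """Weight (chosen, rejected) pairs by prompt similarity: pairs from
    identical prompts get weight ~1; pairs from loosely related prompts
    contribute less to the contrastive loss."""
    sims = np.array([cosine(a, b) for a, b in zip(prompt_embs_w, prompt_embs_l)])
    return np.exp((sims - 1.0) / temperature)  # exactly 1.0 for identical prompts

rng = np.random.default_rng(0)
e = rng.normal(size=(3, 8))  # toy prompt embeddings for the 'chosen' side
others = np.vstack([e[0], e[1] + 0.1, rng.normal(size=8)])
print(rpo_pair_weights(e, others))  # identical, near, and unrelated prompts
```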
arXiv Detail & Related papers (2024-02-12T22:47:57Z) - Group Preference Optimization: Few-Shot Alignment of Large Language Models [28.464834028110538]
Group Preference Optimization (GPO) steers language models toward the preferences of individual groups in a few-shot manner.
We empirically validate the efficacy of GPO through rigorous evaluations using large language models with varied sizes.
Our results demonstrate that GPO not only aligns models more accurately but also requires fewer group-specific preferences and fewer training and inference compute resources.
arXiv Detail & Related papers (2023-10-17T18:41:57Z) - Overcoming Data Sparsity in Group Recommendation [52.00998276970403]
Group recommender systems should be able to accurately learn not only users' personal preferences but also the preference aggregation strategy.
In this paper, we take the Bipartite Graph Embedding Model (BGEM), the self-attention mechanism, and Graph Convolutional Networks (GCNs) as basic building blocks to learn group and user representations in a unified way.
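Of the building blocks named above, the GCN layer is the easiest to illustrate. A minimal sketch of one graph-convolution step over a toy interaction graph (just the standard layer, not the paper's full model; the graph and features are made up):

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ h @ w, 0.0)

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1, 0],                     # toy 4-node graph
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
h = rng.normal(size=(4, 8))                       # node features
w = rng.normal(size=(8, 4))                       # layer weights
print(gcn_layer(adj, h, w).shape)                 # (4, 4)
```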
arXiv Detail & Related papers (2020-10-02T07:11:19Z) - GroupIM: A Mutual Information Maximization Framework for Neural Group Recommendation [24.677145454396822]
We study the problem of making item recommendations to ephemeral groups, which comprise users with limited or no historical activities together.
Existing studies target persistent groups with substantial activity history, while ephemeral groups lack historical interactions.
We propose data-driven regularization strategies to exploit both the preference covariance among users in the same group and the contextual relevance of users' individual preferences to each group.
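GroupIM's name points to mutual-information maximization between user and group representations. A minimal InfoNCE-style sketch of that idea (a standard MI lower bound, used here as a hedged illustration rather than the paper's exact estimator):

```python
import torch
import torch.nn.functional as F

def user_group_infonce(user_reps, group_reps, temperature=0.1):
    """InfoNCE: each user's representation should score its own group's
    representation above the other groups in the batch."""
    u = F.normalize(user_reps, dim=-1)
    g = F.normalize(group_reps, dim=-1)
    logits = u @ g.T / temperature               # (n_users, n_groups)
    targets = torch.arange(len(u))               # user i belongs to group i
    return F.cross_entropy(logits, targets)

torch.manual_seed(0)
g = torch.randn(5, 16)
u = g + 0.1 * torch.randn(5, 16)                 # members resemble their group
print(user_group_infonce(u, g).item())
```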
arXiv Detail & Related papers (2020-06-05T23:18:19Z)