Personalized Language Modeling from Personalized Human Feedback
- URL: http://arxiv.org/abs/2402.05133v2
- Date: Sun, 7 Jul 2024 19:31:21 GMT
- Title: Personalized Language Modeling from Personalized Human Feedback
- Authors: Xinyu Li, Zachary C. Lipton, Liu Leqi
- Abstract summary: Reinforcement Learning from Human Feedback (RLHF) is commonly used to fine-tune large language models to better align with human preferences.
However, when the user preferences encoded in human feedback are diverse, the premise underlying standard RLHF algorithms can be problematic.
In this work, we aim to address this problem by developing methods for building personalized language models.
- Score: 49.344833339240566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning from Human Feedback (RLHF) is commonly used to fine-tune large language models to better align with human preferences. However, the underlying premise of algorithms developed under this framework can be problematic when user preferences encoded in human feedback are diverse. In this work, we aim to address this problem by developing methods for building personalized language models. We first formally introduce the task of learning from personalized human feedback and explain why vanilla RLHF can be ineffective in this context. We then propose a general Personalized-RLHF (P-RLHF) framework, including a user model that maps user information to user representations and can flexibly encode our assumptions on user preferences. We develop new learning objectives to perform personalized Direct Preference Optimization that jointly learns a user model and a personalized language model. We demonstrate the efficacy of our proposed method through (1) a synthetic task where we fine-tune a GPT-J 6B model to align with users with conflicting preferences on generation length; and (2) an instruction following task where we fine-tune a Tulu-7B model to generate responses for users with diverse preferences on the style of responses. In both cases, our learned models can generate personalized responses that are better aligned with the preferences of individual users.
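The framework above pairs a user model with a preference-optimization objective. As a rough illustration, the sketch below (PyTorch) shows one way such a setup could be wired: a learned user embedding acts as a soft prompt for the policy, and a DPO-style loss is computed from user-conditioned sequence log-probabilities. The module names, shapes, and the soft-prompt choice are illustrative assumptions, not the paper's implementation.
```python
# Minimal sketch (not the authors' exact objective) of personalized DPO:
# a user model maps user information to a soft-prompt embedding that would
# condition the policy, and a DPO-style loss is computed per preference pair.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UserModel(nn.Module):
    """Maps user information (here, just an ID) to a user representation."""
    def __init__(self, num_users: int, hidden_dim: int, prompt_len: int = 4):
        super().__init__()
        self.prompt_len = prompt_len
        self.embed = nn.Embedding(num_users, hidden_dim * prompt_len)

    def forward(self, user_ids: torch.Tensor) -> torch.Tensor:
        # (batch, prompt_len, hidden_dim) soft prompt prepended to the LM input
        out = self.embed(user_ids)
        return out.view(user_ids.shape[0], self.prompt_len, -1)

def personalized_dpo_loss(logp_chosen, logp_rejected,
                          ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss; personalization enters through the user-conditioned
    policy that produced the logp_* sequence log-probabilities."""
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with dummy sequence log-probabilities.
user_model = UserModel(num_users=100, hidden_dim=16)
soft_prompt = user_model(torch.tensor([3, 7]))  # would be fed to the LM
loss = personalized_dpo_loss(torch.tensor([-12.0, -9.5]),
                             torch.tensor([-14.0, -10.0]),
                             torch.tensor([-12.5, -9.8]),
                             torch.tensor([-13.5, -9.9]))
print(soft_prompt.shape, loss.item())
```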
Related papers
- ComPO: Community Preferences for Language Model Personalization [122.54846260663922]
ComPO is a method to personalize preference optimization in language models.
We collect and release ComPRed, a question answering dataset with community-level preferences from Reddit.
arXiv Detail & Related papers (2024-10-21T14:02:40Z) - Personalized Adaptation via In-Context Preference Learning [20.042909385219716]
Preference Pretrained Transformer (PPT) is a novel approach for adaptive personalization using online user feedback.
Our results suggest the potential of in-context learning for scalable and efficient personalization in large language models.
arXiv Detail & Related papers (2024-10-17T20:06:02Z) - Unsupervised Human Preference Learning [7.959043497459107]
- Unsupervised Human Preference Learning [7.959043497459107]
Large language models demonstrate impressive reasoning abilities but struggle to provide personalized content.
Existing methods, such as in-context learning and parameter-efficient fine-tuning, fall short in capturing the complexity of human preferences.
We propose a novel approach utilizing small parameter models as preference agents to generate natural language rules that guide a larger, pre-trained model.
arXiv Detail & Related papers (2024-09-30T17:51:01Z) - Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning [12.742158403867002]
- Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning [12.742158403867002]
Reinforcement Learning from Human Feedback is a powerful paradigm for aligning foundation models to human values and preferences.
Current RLHF techniques cannot account for the naturally occurring differences in individual human preferences across a diverse population.
We develop a class of multimodal RLHF methods to address the need for pluralistic alignment.
arXiv Detail & Related papers (2024-08-19T15:18:30Z) - Aligning Large Language Models from Self-Reference AI Feedback with one General Principle [61.105703857868775]
- Aligning Large Language Models from Self-Reference AI Feedback with one General Principle [61.105703857868775]
We propose a self-reference-based AI feedback framework that enables a 13B Llama2-Chat to provide high-quality feedback.
Specifically, we allow the AI to first respond to the user's instructions, then generate criticism of other answers based on its own response as a reference.
Finally, we determine which answer better fits human preferences according to the criticism.
arXiv Detail & Related papers (2024-06-17T03:51:46Z) - Personalized Soups: Personalized Large Language Model Alignment via
Post-hoc Parameter Merging [148.77027765872006]
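The loop described above (answer, critique the candidates against your own answer, then judge) can be sketched as plain prompt plumbing; `llm` is a placeholder callable and the prompt wording is an assumption.
```python
# Sketch of the respond -> critique -> judge loop described above.
# `llm` is any text-generation callable; prompt wording is an assumption.
def self_reference_preference(llm, instruction, answer_a, answer_b):
    own_answer = llm(f"Instruction: {instruction}\nAnswer:")
    critique = llm(
        "Using your own answer as a reference, criticize the two candidates.\n"
        f"Reference answer: {own_answer}\n"
        f"Candidate A: {answer_a}\nCandidate B: {answer_b}\nCritique:")
    verdict = llm(
        f"Critique: {critique}\n"
        "Based on this critique, which candidate better fits the instruction? "
        "Reply with A or B:")
    return verdict.strip()

llm = lambda p: "A"  # toy stand-in so the sketch runs
print(self_reference_preference(llm, "Explain DPO briefly.", "ans A", "ans B"))
```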
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging [148.77027765872006]
We study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem.
LLMs are aligned to multiple preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.
We show that we can achieve personalized alignment by decomposing preferences into multiple dimensions.
arXiv Detail & Related papers (2023-10-17T20:22:13Z) - Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
- Chain of Hindsight Aligns Language Models with Feedback [62.68665658130472]
We propose a novel technique, Chain of Hindsight, that is easy to optimize and can learn from any form of feedback, regardless of its polarity.
We convert all types of feedback into sequences of sentences, which are then used to fine-tune the model.
By doing so, the model is trained to generate outputs based on feedback, while learning to identify and correct negative attributes or errors.
arXiv Detail & Related papers (2023-02-06T10:28:16Z) - Robust Preference Learning for Storytelling via Contrastive
Reinforcement Learning [53.92465205531759]
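The conversion step above, turning paired feedback into ordinary training text, could look roughly like the following; the exact phrasing Chain of Hindsight uses differs, so treat the template as a placeholder.
```python
# Rough illustration of turning comparison feedback into plain training text
# so the model sees both outcomes with hindsight labels. Template is a
# placeholder, not the exact wording used by Chain of Hindsight.
def to_hindsight_sequence(prompt, good_response, bad_response):
    return (f"{prompt}\n"
            f"A helpful answer: {good_response}\n"
            f"An unhelpful answer: {bad_response}")

example = to_hindsight_sequence(
    "Explain what a soft prompt is.",
    "A soft prompt is a set of learned embeddings prepended to the input.",
    "It's just some extra text.")
print(example)  # sequences like this are then used for ordinary fine-tuning
```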
- Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z) - Learning Implicit User Profiles for Personalized Retrieval-Based Chatbot [29.053654530024083]
- Learning Implicit User Profiles for Personalized Retrieval-Based Chatbot [29.053654530024083]
IMPChat aims to learn an implicit user profile by modeling the user's personalized language style and personalized preferences separately.
To learn a user's personalized language style, we elaborately build language models from shallow to deep using the user's historical responses.
We match each response candidate with the personalized language style and personalized preference, respectively, and fuse the two matching signals to determine the final ranking score.
arXiv Detail & Related papers (2021-08-18T02:07:28Z)