Personalized LLM Decoding via Contrasting Personal Preference
- URL: http://arxiv.org/abs/2506.12109v1
- Date: Fri, 13 Jun 2025 09:12:44 GMT
- Title: Personalized LLM Decoding via Contrasting Personal Preference
- Authors: Hyungjune Bu, Chanjoo Jung, Minjae Kang, Jaehyung Kim
- Abstract summary: We propose CoPe, a novel decoding-time approach applied after performing parameter-efficient fine-tuning (PEFT) on user-specific data. Our core idea is to leverage reward-guided decoding specifically for personalization by maximizing each user's implicit reward signal. Our empirical results demonstrate that CoPe achieves strong performance, improving personalization by an average of 10.57% in ROUGE-L.
- Score: 8.469329222500726
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) are progressively deployed in various real-world applications, personalization of LLMs has become increasingly important. While various approaches to LLM personalization such as prompt-based and training-based methods have been actively explored, the development of effective decoding-time algorithms remains largely overlooked, despite their demonstrated potential. In this paper, we propose CoPe (Contrasting Personal Preference), a novel decoding-time approach applied after performing parameter-efficient fine-tuning (PEFT) on user-specific data. Our core idea is to leverage reward-guided decoding specifically for personalization by maximizing each user's implicit reward signal. We evaluate CoPe across five open-ended personalized text generation tasks. Our empirical results demonstrate that CoPe achieves strong performance, improving personalization by an average of 10.57% in ROUGE-L, without relying on external reward models or additional training procedures.
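For intuition, the sketch below shows one way such reward-guided, contrastive personalized decoding could be realized: score the next token with the PEFT-tuned user model while adding the token-level log-probability gap between the user model and its frozen base model as an implicit reward. The model name, adapter path, contrast weight `alpha`, and greedy token selection are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of contrastive, reward-guided decoding for personalization,
# assuming the contrast is taken between a PEFT-tuned user model and its frozen base.
# Model name, adapter path, `alpha`, and greedy decoding are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.2-1B-Instruct"   # hypothetical base model
ADAPTER = "path/to/user_lora_adapter"       # hypothetical per-user PEFT adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
user_model = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16), ADAPTER
)
base_model.eval()
user_model.eval()

@torch.no_grad()
def personalized_decode(prompt: str, max_new_tokens: int = 64, alpha: float = 1.0) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        user_logp = user_model(ids).logits[:, -1, :].log_softmax(-1)
        base_logp = base_model(ids).logits[:, -1, :].log_softmax(-1)
        # Token-level implicit reward: how much more the user-adapted model
        # prefers each token than the base model does (a DPO-style log-ratio).
        implicit_reward = user_logp - base_logp
        # Reward-guided score: personalized likelihood plus weighted implicit reward.
        scores = user_logp + alpha * implicit_reward
        next_id = scores.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```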
Related papers
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align large language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
- Personalized Language Models via Privacy-Preserving Evolutionary Model Merging [57.161917758405465]
Personalization in large language models (LLMs) seeks to tailor models to individual user or user group preferences. We propose Privacy-Preserving Model Merging via Evolutionary Algorithms (PriME). PriME employs gradient-free methods to directly optimize task-specific metrics while preserving user privacy.
arXiv Detail & Related papers (2025-03-23T09:46:07Z)
- Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization [68.79814761867314]
We propose Difference-aware Personalization Learning (DPL) to enhance personalization of large language models (LLMs). DPL strategically selects representative users for comparison and establishes a structured standard to extract task-relevant differences. Experiments on real-world datasets demonstrate that DPL significantly enhances LLM personalization.
arXiv Detail & Related papers (2025-03-04T09:53:26Z)
- FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users [111.56469697145519]
We propose Few-Shot Preference Optimization (FSPO), which reframes reward modeling as a meta-learning problem. Under this framework, an LLM learns to quickly adapt to a user via a few labeled preferences from that user, constructing a personalized reward function for them. We generate over 1M synthetic personalized preferences using publicly available LLMs. We evaluate FSPO on personalized open-ended generation for up to 1,500 synthetic users across three domains: movie reviews, pedagogical adaptation based on educational background, and general question answering, along with a controlled human study.
arXiv Detail & Related papers (2025-02-26T17:08:46Z)
- Personality-Guided Code Generation Using Large Language Models [14.665759212676488]
We conduct an empirical study on personality-guided code generation using large language models (LLMs). Our results show that personality guidance significantly enhances code generation accuracy, with improved pass rates in 23 out of 28 LLM-dataset combinations.
arXiv Detail & Related papers (2024-10-16T16:42:55Z)
- PAD: Personalized Alignment of LLMs at Decoding-Time [10.347782385286582]
This paper presents a novel framework designed to align LLM outputs with diverse personalized preferences during the inference phase. The Personalized Alignment at Decoding-time (PAD) framework decouples the text generation process from personalized preferences. PAD not only outperforms existing training-based alignment methods in terms of aligning with diverse preferences but also shows significant generalizability to preferences unseen during training.
arXiv Detail & Related papers (2024-10-05T08:00:55Z)
- Aligning LLMs with Individual Preferences via Interaction [51.72200436159636]
We train large language models (LLMs) that can "interact to align". We develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. For evaluation, we establish the ALOE benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure customized alignment performance during conversations.
arXiv Detail & Related papers (2024-10-04T17:48:29Z)
- Guided Profile Generation Improves Personalization with LLMs [3.2685922749445617]
In modern commercial systems, including recommendation, ranking, and e-commerce platforms, there is a trend towards incorporating personalization context as input into large language models (LLMs).
We propose Guided Profile Generation (GPG), a general method designed to generate personal profiles in natural language.
Our experimental results show that GPG improves LLMs' personalization ability across different tasks; for example, it increases accuracy in predicting personal preference by 37% compared to directly feeding the LLMs raw personal context.
arXiv Detail & Related papers (2024-09-19T21:29:56Z)
- Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models [21.115495457454365]
This paper studies an approach to RAG that involves learning user-dependent LLM parameters through parameter-efficient fine-tuning (PEFT). Our results demonstrate that, on average, RAG- and PEFT-based personalization methods yield 14.92% and 1.07% improvements over non-personalized LLMs, respectively.
arXiv Detail & Related papers (2024-09-14T19:18:26Z)
- Few-shot Personalization of LLMs with Mis-aligned Responses [40.0349773257245]
This paper proposes a new approach for few-shot personalization of large language models (LLMs). Our key idea is to learn a set of personalized prompts for each user by progressively improving the prompts using LLMs. During the iterative process of prompt improvement, we incorporate the contexts of mis-aligned responses by LLMs.
arXiv Detail & Related papers (2024-06-26T18:29:12Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
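As a companion illustration for the SPIN entry above, the snippet below sketches a self-play objective in which the current model is trained to prefer human-written responses over responses sampled from its previous iterate. The logistic, DPO-style loss form and the variable names are assumptions for illustration, not a definitive reproduction of the paper's training recipe.

```python
# A hedged sketch of a SPIN-style self-play objective for one iteration.
# Assumption: a DPO-like logistic loss over log-probability ratios between the
# current policy and the frozen previous iterate ("opponent"); names are illustrative.
import torch
import torch.nn.functional as F

def self_play_loss(
    policy_logp_real: torch.Tensor,    # log pi_theta(y_human | x), summed over tokens
    policy_logp_synth: torch.Tensor,   # log pi_theta(y_synth | x), y_synth sampled from the previous iterate
    opponent_logp_real: torch.Tensor,  # log pi_t(y_human | x) under the frozen previous iterate
    opponent_logp_synth: torch.Tensor, # log pi_t(y_synth | x)
    beta: float = 0.1,
) -> torch.Tensor:
    # Push the (policy - opponent) log-ratio to be larger on human-written
    # responses than on the model's own generations from the previous round.
    margin = (policy_logp_real - opponent_logp_real) - (policy_logp_synth - opponent_logp_synth)
    return -F.logsigmoid(beta * margin).mean()
```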