Styles + Persona-plug = Customized LLMs
- URL: http://arxiv.org/abs/2601.06362v1
- Date: Sat, 10 Jan 2026 00:14:43 GMT
- Authors: Yutong Song, Jiang Wu, Shaofan Yuan, Chengze Shen, Jian Wang, Amir Rahmani, Nikil Dutt, Yu Wang
- Abstract summary: We formulate personalization as a distributional residual and propose PsPLUG, a lightweight soft-prompt plug-in trained with style-conditioned preference contrasts. On the LaMP benchmark, our framework improves persona alignment, maintains stylistic fidelity, and outperforms retrieval-based and soft-prompt baselines.
- Score: 9.655863963736921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We identify a previously overlooked challenge in personalized text generation: personalization methods are increasingly applied under explicit style instructions, yet their behavior under such constraints remains poorly understood. To balance implicit personalization and explicit style, we formulate personalization as a distributional residual and propose PsPLUG, a lightweight soft-prompt plug-in trained with style-conditioned preference contrasts. On the LaMP benchmark, our framework improves persona alignment, maintains stylistic fidelity, and outperforms retrieval-based and soft-prompt baselines with minimal computation. These results show that residual modeling provides a simple and principled foundation for controllable, style-aware LLM personalization.
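The abstract does not spell out what "distributional residual" means formally. One possible reading, offered here only as an illustrative assumption and not as the PsPLUG formulation, is an additive shift in logit space on top of a style-conditioned base distribution. The function names, the `scale` parameter, and all numbers below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def personalized_distribution(style_logits, persona_residual, scale=1.0):
    """Add a persona residual to style-conditioned logits and renormalize.

    Assumption: the residual is additive in logit space. The source abstract
    does not confirm this; it is only one way to read "residual".
    """
    if len(style_logits) != len(persona_residual):
        raise ValueError("logit and residual vectors must have equal length")
    combined = [l + scale * r for l, r in zip(style_logits, persona_residual)]
    return softmax(combined)


# Toy three-token vocabulary; every number here is made up for illustration.
style_logits = [2.0, 1.0, 0.0]      # base model under an explicit style instruction
persona_residual = [0.0, 1.5, 0.0]  # hypothetical user-specific shift

base = softmax(style_logits)
personalized = personalized_distribution(style_logits, persona_residual)
```

Under this reading, setting `scale=0.0` recovers the style-only distribution, which is why a residual view keeps the explicit style instruction as the default behavior and treats the persona as a correction on top of it.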
Related papers
- Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process. We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
- One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment [55.86333374784959]
We argue that addressing these constraints requires a paradigm shift from fitting data to learn user preferences to learning the process of preference adaptation. We propose Meta Reward Modeling (MRM), which reformulates personalized reward modeling as a meta-learning problem. We show that MRM enhances few-shot personalization, improves user robustness, and consistently outperforms baselines.
arXiv Detail & Related papers (2026-01-26T17:55:52Z)
- When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs [13.695058536403108]
We show that when personalized large language models (LLMs) face factual queries, the model generates answers aligned with a user's prior history rather than the objective truth. We propose Factuality-Preserving Personalized Steering (FPPS), a lightweight inference-time approach that mitigates personalization-induced factual distortions.
arXiv Detail & Related papers (2026-01-16T05:20:10Z)
- Reflective Personalization Optimization: A Post-hoc Rewriting Framework for Black-Box Large Language Models [16.152962349146275]
We propose Reflective Personalization Optimization (RPO), a framework that redefines the personalization paradigm by decoupling content generation from alignment. RPO operates in two distinct stages: first, a base model generates a high-quality, generic response; then, an external reflection module explicitly rewrites this output to align with the user's preferences. Comprehensive experiments on the LaMP benchmark demonstrate that RPO, by decoupling content generation from personalization, significantly outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-11-07T14:48:49Z)
- Iterative Critique-Refine Framework for Enhancing LLM Personalization [67.77803308645511]
We present PerFine, a unified, training-free critique-refine framework for personalized text generation. In each iteration, an LLM generator produces a draft conditioned on a retrieved profile, and a critic LLM, also conditioned on the same profile, provides structured feedback on tone, vocabulary, sentence structure, and topicality. Across Yelp, Goodreads, and Amazon datasets, PerFine consistently improves personalization over PGraphRAG.
arXiv Detail & Related papers (2025-10-28T14:36:22Z)
- POPI: Personalizing LLMs via Optimized Natural Language Preference Inference [42.25870704040321]
POPI is a general framework that introduces a preference inference model to distill heterogeneous user signals into concise natural language summaries. These summaries act as transparent, compact, and transferable personalization representations that condition a shared generation model to produce personalized responses. Extensive experiments across four personalization benchmarks demonstrate that POPI consistently improves personalization accuracy while reducing context overhead by a large margin.
arXiv Detail & Related papers (2025-10-17T23:07:57Z)
- StyleAdaptedLM: Enhancing Instruction Following Models with Efficient Stylistic Transfer [4.077787659104315]
StyleAdaptedLM is a framework that efficiently transfers stylistic traits to instruction-following models. LoRA adapters are first trained on a base model with diverse unstructured stylistic corpora, then merged with a separate instruction-following model.
arXiv Detail & Related papers (2025-07-24T10:57:32Z)
- Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization [68.79814761867314]
We propose Difference-aware Personalization Learning (DPL) to enhance Large Language Model (LLM) personalization. DPL strategically selects representative users for comparison and establishes a structured standard to extract task-relevant differences. Experiments on real-world datasets demonstrate that DPL significantly enhances LLM personalization.
arXiv Detail & Related papers (2025-03-04T09:53:26Z)
- Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning [74.56097953187994]
We present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks. TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations. TICL achieves win rates of up to 91.5% against the previous state-of-the-art and outperforms competitive tuning-free baselines for personalized alignment tasks.
arXiv Detail & Related papers (2025-02-13T05:20:21Z)
- PAD: Personalized Alignment of LLMs at Decoding-Time [10.347782385286582]
This paper presents a novel framework designed to align LLM outputs with diverse personalized preferences during the inference phase. The Personalized Alignment at Decoding-time (PAD) framework decouples the text generation process from personalized preferences. PAD not only outperforms existing training-based alignment methods in terms of aligning with diverse preferences but also shows significant generalizability to preferences unseen during training.
arXiv Detail & Related papers (2024-10-05T08:00:55Z)
- Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in one single inference step.
Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.