Related papers: Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering

URL: http://arxiv.org/abs/2505.04260v2
Date: Tue, 13 May 2025 21:19:59 GMT
Title: Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering
Authors: Jessica Y. Bo, Tianyu Xu, Ishan Chatterjee, Katrina Passarella-Ward, Achin Kulshrestha, D Shin
Abstract summary: We leverage activation steering to guide large language models to align with user preferences during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user. Results demonstrate the effectiveness of preference-based steering for aligning real-world conversations with hidden user preferences.
Score: 4.3537491807568465
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is essential for enhancing user satisfaction and retention. However, untrained lay users have poor prompt specification abilities and often struggle with conveying their latent preferences to AI assistants. To address this, we leverage activation steering to guide LLMs to align with interpretable preference dimensions during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user via a linear strength factor. We embed steering into three different interactive chatbot interfaces and conduct a within-subjects user study (n=14) to investigate how end users prefer to personalize their conversations. The results demonstrate the effectiveness of preference-based steering for aligning real-world conversations with hidden user preferences, and highlight further insights on how diverse values around control, usability, and transparency lead users to prefer different interfaces.
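
Illustration: the abstract describes steering that the user controls through a linear strength factor. The sketch below shows one common way such steering is formulated, not necessarily the paper's own implementation. It assumes the steering vector is the difference between mean activations for two contrasting sets of prompts, and that steering adds this vector, scaled by the strength, to a hidden state. The function names (mean_vector, steering_vector, steer) and the toy numbers are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(pos: List[Vector], neg: List[Vector]) -> Vector:
    """Assumed construction: mean(pos activations) - mean(neg activations)."""
    mean_pos = mean_vector(pos)
    mean_neg = mean_vector(neg)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def steer(hidden: Vector, vector: Vector, strength: float) -> Vector:
    """Shift a hidden state along the steering vector, scaled linearly."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy activations for prompts that do / do not express a preference.
pos_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
neg_acts = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]

direction = steering_vector(pos_acts, neg_acts)
hidden_state = [0.5, 0.5, 0.5]

unsteered = steer(hidden_state, direction, 0.0)  # strength 0 leaves it unchanged
steered = steer(hidden_state, direction, 2.0)    # larger strength, larger shift
print(direction, unsteered, steered)
```

In a real model the hidden state would be a layer's activations captured during inference; here plain lists stand in for them. A strength of zero reproduces the unsteered state, and a negative strength pushes in the opposite direction along the same preference dimension.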

Related papers

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions [50.70965714314064]
Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions. This work proposes RealPref, a benchmark for evaluating realistic preference-following in personalized user-LLM interactions.
arXiv Detail & Related papers (2026-03-04T15:42:43Z)
SteerX: Disentangled Steering for LLM Personalization [75.89038195784701]
Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications. A critical factor in building such assistants is personalizing LLMs, as user preferences and needs vary widely. We propose SteerX, a method that isolates preference-driven components from preference-agnostic components.
arXiv Detail & Related papers (2025-10-25T11:26:20Z)
Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance [35.15965694815852]
Open-domain dialogue systems aim to generate natural and engaging conversations. Existing large language models (LLMs) fall short in proactively understanding the user's chatting preferences. We propose a User-oriented Proactive (UPC) to enhance the user-oriented proactivity.
arXiv Detail & Related papers (2025-05-18T09:59:22Z)
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale [51.9706400130481]
Large Language Models (LLMs) have emerged as personalized assistants for users across a wide range of tasks. PERSONAMEM features curated user profiles with over 180 simulated user-LLM interaction histories. We evaluate LLM chatbots' ability to identify the most suitable response according to the current state of the user's profile.
arXiv Detail & Related papers (2025-04-19T08:16:10Z)
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward [11.495697919066341]
Policy agents must be able to personalize their behavior to suit a user's preferences, personality, and attributes. Current training methods like Reinforcement Learning from Human Feedback (RLHF) prioritize helpfulness and safety but fall short in fostering truly empathetic, adaptive, and personalized interactions. We propose to incorporate an intrinsic motivation to improve the conversational agent's model of the user as an additional reward alongside multi-turn RLHF.
arXiv Detail & Related papers (2025-04-04T06:35:02Z)
Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization [68.79814761867314]
We propose Difference-aware Personalization Learning (DPL) to enhance Large Language Models (LLMs) personalization. DPL strategically selects representative users for comparison and establishes a structured standard to extract task-relevant differences. Experiments on real-world datasets demonstrate that DPL significantly enhances LLM personalization.
arXiv Detail & Related papers (2025-03-04T09:53:26Z)
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering [39.79275025010785]
UQABench is a benchmark designed to evaluate the effectiveness of user embeddings in prompting large language models for personalization. We conduct extensive experiments on various state-of-the-art methods for modeling user embeddings.
arXiv Detail & Related papers (2025-02-26T14:34:00Z)
A Survey of Personalized Large Language Models: Progress and Future Directions [86.45576419251302]
Large Language Models (LLMs) excel in handling general knowledge tasks, yet struggle with user-specific personalization. Personalized Large Language Models (PLLMs) tackle these challenges by leveraging individual user data. PLLMs can significantly enhance user satisfaction and have broad applications in conversational agents, systems, emotion recognition, medical assistants, and more.
arXiv Detail & Related papers (2025-02-17T07:58:31Z)
Unveiling User Preferences: A Knowledge Graph and LLM-Driven Approach for Conversational Recommendation [55.5687800992432]
We propose a plug-and-play framework that synergizes Large Language Models (LLMs) and Knowledge Graphs (KGs) to unveil user preferences. This enables the LLM to transform KG entities into concise natural language descriptions, allowing them to comprehend domain-specific knowledge.
arXiv Detail & Related papers (2024-11-16T11:47:21Z)
Optimizing Data Delivery: Insights from User Preferences on Visuals, Tables, and Text [59.68239795065175]
We conduct a user study where users are shown a question and asked what they would prefer to see. We use the data to establish that a user's personal traits do influence the data outputs that they prefer.
arXiv Detail & Related papers (2024-11-12T00:24:31Z)
Aligning LLMs with Individual Preferences via Interaction [51.72200436159636]
We train large language models (LLMs) that can "interact to align". We develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. For evaluation, we establish the ALOE benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure the customized alignment performance during conversations.
arXiv Detail & Related papers (2024-10-04T17:48:29Z)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts [95.09994361995389]
Relative Preference Optimization (RPO) is designed to discern between more and less preferred responses derived from both identical and related prompts. RPO has demonstrated a superior ability to align large language models with user preferences and to improve their adaptability during the training process.
arXiv Detail & Related papers (2024-02-12T22:47:57Z)
Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning [36.88126051792774]
Personalization in large language models (LLMs) is increasingly important. One PEFT Per User (OPPU) employs personalized parameter-efficient fine-tuning (PEFT) modules to store user-specific behavior patterns and preferences. OPPU significantly outperforms existing prompt-based methods across seven diverse tasks in the LaMP benchmark.
arXiv Detail & Related papers (2024-02-06T21:03:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.