Related papers: Optimizing Interactive Systems via Data-Driven Objectives

Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process. We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors [1.8352113484137629]
A long-standing challenge in developing accurate recommendation models is simulating user behavior, mainly due to the complex nature of user interactions. We propose an approach to extracting robust user representations using frozen Large Language Models (LLMs) and simulating cost-effective, resource-efficient user agents powered by fine-tuned Small Language Models (SLMs). Our experiments provide compelling empirical evidence of the efficacy of our methods, demonstrating that user agents developed using our approach have the potential to bridge the gap between offline metrics and real-world performance of recommender systems.
arXiv Detail & Related papers (2025-08-18T22:14:57Z)
System Prompt Optimization with Meta-Learning [60.04718679054704]
We introduce the novel problem of bilevel system prompt optimization, whose objective is to design system prompts that are robust to diverse user prompts. We propose a meta-learning framework, which meta-learns the system prompt by optimizing it over various user prompts across multiple datasets. We conduct experiments on 14 unseen datasets spanning 5 different domains, on which we show that our approach produces system prompts that generalize effectively to diverse user prompts.
arXiv Detail & Related papers (2025-05-14T16:46:15Z)
Multi-agents based User Values Mining for Recommendation [52.26100802380767]
We propose a zero-shot multi-LLM collaborative framework for effective and accurate user value extraction. We apply text summarization techniques to condense item content while preserving essential meaning. To mitigate hallucinations, we introduce two specialized agent roles: evaluators and supervisors.
arXiv Detail & Related papers (2025-05-02T04:01:31Z)
Search-Based Interaction For Conversation Recommendation via Generative Reward Model Based Simulated User [117.82681846559909]
Conversational recommendation systems (CRSs) use multi-turn interaction to capture user preferences and provide personalized recommendations. We propose a generative reward model based simulated user, named GRSU, for automatic interaction with CRSs.
arXiv Detail & Related papers (2025-04-29T06:37:30Z)
Simulating Before Planning: Constructing Intrinsic User World Model for User-Tailored Dialogue Policy Planning [31.785493263807684]
We present the User-Tailored Dialogue Policy Planning (UDP) framework, which incorporates an Intrinsic User World Model to model user traits and feedback. UDP operates in three stages: (1) User Persona Portraying, using a diffusion model to dynamically infer user profiles; (2) User Feedback Anticipating, leveraging a Brownian Bridge-inspired anticipator to predict user reactions; and (3) User-Tailored Policy Planning, integrating these insights to optimize response strategies.
arXiv Detail & Related papers (2025-04-18T11:48:55Z)
Uncertain Multi-Objective Recommendation via Orthogonal Meta-Learning Enhanced Bayesian Optimization [30.031396809114625]
We introduce a novel framework that categorizes RS autonomy into five distinct levels, ranging from basic rule-based accuracy-driven systems to behavior-aware, uncertain multi-objective RSs. We propose an approach that dynamically identifies and optimizes multiple objectives based on individual user preferences, fostering more ethical and intelligent user-centric recommendations.
arXiv Detail & Related papers (2025-02-18T08:10:09Z)
Retrieval Augmentation via User Interest Clustering [57.63883506013693]
Industrial recommender systems are sensitive to the patterns of user-item engagement. We propose a novel approach that efficiently constructs user interest and facilitates low computational cost inference. Our approach has been deployed in multiple products at Meta, facilitating short-form video related recommendation.
arXiv Detail & Related papers (2024-08-07T16:35:10Z)
Deep Pareto Reinforcement Learning for Multi-Objective Recommender Systems [60.91599969408029]
Optimizing multiple objectives simultaneously is an important task for recommendation platforms. Existing multi-objective recommender systems do not systematically consider such dynamic relationships.
arXiv Detail & Related papers (2024-07-04T02:19:49Z)
Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems [2.788542465279969]
This paper introduces DAUS, a Domain-Aware User Simulator. We fine-tune DAUS on real examples of task-oriented dialogues. Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment.
arXiv Detail & Related papers (2024-02-20T20:57:47Z)
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents [110.25679611755962]
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions. We introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries. We empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals.
arXiv Detail & Related papers (2024-02-14T14:36:30Z)
Integrating Human Expertise in Continuous Spaces: A Novel Interactive Bayesian Optimization Framework with Preference Expected Improvement [0.5148939336441986]
Interactive Machine Learning (IML) seeks to integrate human expertise into machine learning processes. We propose a novel framework based on Bayesian Optimization (BO). BO enables collaboration between machine learning algorithms and humans.
arXiv Detail & Related papers (2024-01-23T11:14:59Z)
Our Model Achieves Excellent Performance on MovieLens: What Does it Mean? [43.3971105361606]
We conduct a meticulous analysis of the MovieLens dataset. There are significant differences in user interactions at the different stages when a user interacts with the MovieLens platform. We discuss the discrepancy between the interaction generation mechanism that is employed by the MovieLens system and that of typical real-world recommendation scenarios.
arXiv Detail & Related papers (2023-07-19T13:44:32Z)
Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy [83.61404191470126]
We propose a new solution named I-Pro that can learn Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors. The experimental results demonstrate I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.
arXiv Detail & Related papers (2022-04-07T14:11:31Z)
What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation [3.1433893853959605]
Optimisation via reinforcement learning (RL) is susceptible to sample inefficiency and instability. We propose the usage of an intrinsic reward based on information gain to address this issue. Our algorithm, which we call FeudalGain, achieves state-of-the-art results in most environments of the PyDial framework.
arXiv Detail & Related papers (2021-09-15T07:21:26Z)
Empowering Active Learning to Jointly Optimize System and User Demands [70.66168547821019]
We propose a new active learning approach that jointly optimizes the active learning system (training efficiently) and the user (receiving useful instances). We study our approach in an educational application, which particularly benefits from this technique as the system needs to rapidly learn to predict the appropriateness of an exercise to a particular user. We evaluate multiple learning strategies and user types with data from real users and find that our joint approach better satisfies both objectives when alternative methods lead to many unsuitable exercises for end users.
arXiv Detail & Related papers (2020-05-09T16:02:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.