LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue
- URL: http://arxiv.org/abs/2503.04675v1
- Date: Thu, 06 Mar 2025 18:12:33 GMT
- Title: LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue
- Authors: Sangyeop Kim, Sohhyung Park, Jaewon Jung, Jinseok Kim, Sungzoon Cho
- Abstract summary: PRAISE is an interpretable framework for effective user satisfaction prediction. It operates through three key modules and achieves state-of-the-art performance on three benchmarks for the User Satisfaction Estimation task.
- Score: 5.070104802923903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding user satisfaction with conversational systems, known as User Satisfaction Estimation (USE), is essential for assessing dialogue quality and enhancing user experiences. However, existing methods for USE face challenges due to limited understanding of the underlying reasons for user dissatisfaction and the high cost of annotating user intentions. To address these challenges, we propose PRAISE (Plan and Retrieval Alignment for Interpretable Satisfaction Estimation), an interpretable framework for effective user satisfaction prediction. PRAISE operates through three key modules. The Strategy Planner develops strategies, which are natural language criteria for classifying user satisfaction. The Feature Retriever then incorporates knowledge about user satisfaction from Large Language Models (LLMs) and retrieves relevant features from utterances. Finally, the Score Analyzer evaluates strategy predictions and classifies user satisfaction. Experimental results demonstrate that PRAISE achieves state-of-the-art performance on three benchmarks for the USE task. Beyond its superior performance, PRAISE offers additional benefits. It enhances interpretability by providing instance-level explanations through effective alignment of utterances with strategies. Moreover, PRAISE operates more efficiently than existing approaches by eliminating the need for LLMs during the inference phase.
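The abstract specifies the three modules only at a high level. Below is a minimal Python sketch of how such a pipeline could fit together; the class layout, the hard-coded strategies, and the character-frequency encoder are illustrative assumptions, not the authors' implementation. Note that only lightweight similarity scoring runs at prediction time, mirroring the abstract's claim that no LLM is needed during inference.

```python
# Minimal sketch of a PRAISE-style pipeline (illustrative, not the authors' code).
from dataclasses import dataclass

def embed(text: str) -> list[float]:
    """Toy stand-in for a sentence encoder: character-frequency vector."""
    v = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            v[ord(ch) - 97] += 1.0
    return v

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb + 1e-9)

@dataclass
class Praise:
    # Strategy Planner output: natural-language satisfaction criteria.
    # In the paper these come from an LLM offline; hard-coded here.
    strategies: list[str]
    weights: list[float]   # per-strategy weights fit by the Score Analyzer

    def retrieve_features(self, utterance: str) -> list[float]:
        """Feature Retriever: relevance of each strategy to the utterance,
        approximated here by embedding similarity (no LLM call at inference)."""
        u = embed(utterance)
        return [cosine(u, embed(s)) for s in self.strategies]

    def predict(self, utterance: str) -> str:
        """Score Analyzer: weighted vote over strategy relevance scores."""
        feats = self.retrieve_features(utterance)
        score = sum(w * f for w, f in zip(self.weights, feats))
        return "satisfied" if score > 0 else "dissatisfied"

model = Praise(
    strategies=["user thanks the system", "user repeats the same request"],
    weights=[1.0, -1.0],
)
print(model.predict("thanks, that solved it"))   # -> satisfied
```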
Related papers
- UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering [39.79275025010785]
UQABench is a benchmark designed to evaluate the effectiveness of user embeddings in prompting large language models for personalization. We conduct extensive experiments on various state-of-the-art methods for modeling user embeddings.
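The summary leaves the mechanism implicit; one common way to prompt an LLM with a user embedding, and the kind of setup such a benchmark evaluates, is to project the embedding into the model's token space as soft-prompt tokens. A hedged PyTorch sketch, with all dimensions and the projection itself chosen purely for illustration:

```python
# Hedged sketch: prepend a projected user embedding as soft-prompt tokens.
import torch
import torch.nn as nn

class UserSoftPrompt(nn.Module):
    def __init__(self, user_dim: int = 64, model_dim: int = 768, n_tokens: int = 4):
        super().__init__()
        self.n_tokens = n_tokens
        self.proj = nn.Linear(user_dim, model_dim * n_tokens)

    def forward(self, user_emb: torch.Tensor, token_embs: torch.Tensor) -> torch.Tensor:
        # user_emb: (batch, user_dim); token_embs: (batch, seq, model_dim)
        soft = self.proj(user_emb).view(user_emb.size(0), self.n_tokens, -1)
        return torch.cat([soft, token_embs], dim=1)   # prepend virtual tokens

prompt = UserSoftPrompt()
out = prompt(torch.randn(2, 64), torch.randn(2, 10, 768))
print(out.shape)   # torch.Size([2, 14, 768])
```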
arXiv Detail & Related papers (2025-02-26T14:34:00Z) - PredictaBoard: Benchmarking LLM Score Predictability [50.47497036981544]
Large Language Models (LLMs) often fail unpredictably, which poses a significant challenge to ensuring their safe deployment. We present PredictaBoard, a novel collaborative benchmarking framework.
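As a rough illustration of score predictability, consider pairing the LLM with an "assessor" that predicts per-instance correctness, then measuring how safe the assessor-approved region is. The metric names and threshold below are assumptions for illustration, not PredictaBoard's exact protocol:

```python
# Hedged sketch: evaluate an assessor that predicts where the LLM is correct.
def selective_report(assessor_p, llm_correct, threshold=0.8):
    """assessor_p[i]: predicted P(LLM correct on item i); llm_correct[i]: 0/1."""
    accepted = [c for p, c in zip(assessor_p, llm_correct) if p >= threshold]
    coverage = len(accepted) / len(llm_correct)
    risk = 1 - (sum(accepted) / len(accepted)) if accepted else 0.0
    return {"coverage": coverage, "risk_in_accepted_region": risk}

# toy run: a well-calibrated assessor yields low risk at moderate coverage
print(selective_report([0.9, 0.95, 0.4, 0.85, 0.2], [1, 1, 0, 1, 1]))
```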
arXiv Detail & Related papers (2025-02-20T10:52:38Z) - HPSS: Heuristic Prompting Strategy Search for LLM Evaluators [81.09765876000208]
We propose a novel automatic prompting strategy optimization method called Heuristic Prompting Strategy Search (HPSS). Inspired by the genetic algorithm, HPSS conducts an iterative search to find well-behaved prompting strategies for evaluators. Extensive experiments across four evaluation tasks demonstrate the effectiveness of HPSS.
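A minimal sketch of a genetic-algorithm search over prompting strategies in the spirit the summary describes; the strategy factors, mutation scheme, and stubbed fitness function are illustrative stand-ins (a real run would score each strategy by the evaluator's agreement with human judgments on a dev set):

```python
# Hedged sketch: genetic search over discrete prompt-design factors.
import random

FACTORS = {                      # illustrative prompt-design factors
    "scale":    ["1-5", "1-10"],
    "cot":      ["none", "think step by step"],
    "criteria": ["fluency first", "relevance first"],
}

def fitness(strategy: dict) -> float:
    # stand-in for running the LLM evaluator with this strategy and
    # measuring agreement with human judgments (deterministic toy score)
    return random.Random(str(sorted(strategy.items()))).random()

def mutate(strategy: dict) -> dict:
    child = dict(strategy)
    k = random.choice(list(FACTORS))
    child[k] = random.choice(FACTORS[k])
    return child

population = [{k: random.choice(v) for k, v in FACTORS.items()} for _ in range(8)]
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    parents = population[:4]                                    # selection
    population = parents + [mutate(random.choice(parents)) for _ in range(4)]
print("best strategy:", max(population, key=fitness))
```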
arXiv Detail & Related papers (2025-02-18T16:46:47Z) - Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment [69.11529841118671]
We propose a new Deliberative Recommendation task, which incorporates explicit reasoning about user preferences as an additional alignment goal. We then introduce the Reasoning-powered Recommender framework for deliberative user preference alignment.
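A hedged sketch of the "deliberate, then recommend" flow this suggests: the model first writes out its view of the user's preferences, and that reasoning, not just the final pick, becomes an alignment target. The prompt format and canned `call_llm` stub are assumptions, not the authors' framework:

```python
# Hedged sketch: explicit preference reasoning before the recommendation.
def call_llm(prompt: str) -> str:                      # canned stub
    return ("PREFERENCES: enjoys slow-burn sci-fi; dislikes jump scares\n"
            "RECOMMENDATION: Arrival")

def deliberative_recommend(history: list[str]) -> dict:
    prompt = (
        "User history:\n" + "\n".join(history) +
        "\nFirst state the user's preferences, then recommend one item.\n"
        "Format:\nPREFERENCES: ...\nRECOMMENDATION: ..."
    )
    reply = call_llm(prompt)
    fields = dict(line.split(": ", 1) for line in reply.splitlines())
    # both fields are available for supervision/alignment, not just the item
    return {"reasoning": fields["PREFERENCES"], "item": fields["RECOMMENDATION"]}

print(deliberative_recommend(["watched Interstellar", "watched Solaris"]))
```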
arXiv Detail & Related papers (2025-02-04T07:17:54Z) - Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation [21.769969074938142]
EXP3RT is a novel LLM-based recommender designed to leverage rich preference information contained in user and item reviews.
It generates detailed step-by-step reasoning followed by a predicted rating.
Experiments show that EXP3RT outperforms existing methods on both rating prediction and candidate item reranking for top-k recommendation.
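A minimal sketch of the reasoning-then-rating pattern described above: feed user and item reviews to an LLM, ask for a step-by-step rationale plus a numeric rating, and rerank candidates by that rating. The prompt wording and canned `call_llm` stub are illustrative assumptions:

```python
# Hedged sketch: parse an LLM's rationale + rating, rerank candidates by rating.
import re

def call_llm(prompt: str) -> str:                      # canned stub
    rating = "4.5" if "lasts two days" in prompt else "2.0"
    return f"Reasoning: battery life matches the user's priority.\nRating: {rating}"

def predict_rating(user_reviews: list[str], item_reviews: list[str]) -> float:
    prompt = ("User reviews:\n" + "\n".join(user_reviews) +
              "\nItem reviews:\n" + "\n".join(item_reviews) +
              "\nReason step by step, then give 'Rating: <1-5>'.")
    reply = call_llm(prompt)
    return float(re.search(r"Rating:\s*([\d.]+)", reply).group(1))

candidates = {"phone-a": ["lasts two days"], "phone-b": ["dies fast"]}
user = ["I care most about battery life"]
ranked = sorted(candidates, key=lambda i: predict_rating(user, candidates[i]),
                reverse=True)
print(ranked)   # ['phone-a', 'phone-b']
```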
arXiv Detail & Related papers (2024-08-12T16:39:03Z) - CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems [60.27663010453209]
We leverage large language models (LLMs) to generate satisfaction-aware counterfactual dialogues.
We gather human annotations to ensure the reliability of the generated samples.
Our results shed light on the need for data augmentation approaches for user satisfaction estimation in TOD systems.
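A hedged sketch of satisfaction-aware counterfactual generation: ask an LLM to minimally rewrite a dialogue so its satisfaction label flips, then add the new pair to the training data (the paper additionally gathers human annotations to vet such samples). The prompt and canned `call_llm` stub are assumptions:

```python
# Hedged sketch: flip a dialogue's satisfaction label with a minimal rewrite.
def call_llm(prompt: str) -> str:                      # canned stub
    return "User: That booking is wrong again. This is really frustrating."

def counterfactual(dialogue: str, label: str) -> tuple[str, str]:
    target = "dissatisfied" if label == "satisfied" else "satisfied"
    prompt = (f"Dialogue:\n{dialogue}\n"
              f"Rewrite the final user turn, changing as little as possible, "
              f"so the user is clearly {target}.")
    return call_llm(prompt), target

seed = ("System: Your hotel is booked for Friday.\n"
        "User: Perfect, thanks so much!")
aug_text, aug_label = counterfactual(seed, "satisfied")
train_set = [(seed, "satisfied"), (aug_text, aug_label)]   # augmented pairs
print(train_set[1])
```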
arXiv Detail & Related papers (2024-03-27T23:45:31Z) - Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models [35.95405294377247]
Existing approaches based on featurized ML models or text embeddings fall short in extracting generalizable patterns.
We show that LLMs can extract interpretable signals of user satisfaction from their natural language utterances more effectively than embedding-based approaches.
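The contrast the summary draws can be made concrete: instead of classifying opaque embeddings, score utterances against natural-language signal patterns of the kind an LLM can propose from data, so every prediction carries its evidence. The pattern list below is purely illustrative, not the paper's learned rubric:

```python
# Hedged sketch: interpretable satisfaction signals instead of raw embeddings.
SIGNALS = [
    ("thank", +1, "user expresses gratitude"),
    ("again", -1, "user repeats a request"),
    ("wrong", -1, "user reports an error"),
]

def estimate(utterance: str):
    hits = [(w, s, why) for w, s, why in SIGNALS if w in utterance.lower()]
    score = sum(s for _, s, _ in hits)
    label = ("satisfied" if score > 0
             else "dissatisfied" if score < 0 else "unclear")
    return label, [why for _, _, why in hits]   # instance-level explanation

print(estimate("You got it wrong again."))
# ('dissatisfied', ['user repeats a request', 'user reports an error'])
```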
arXiv Detail & Related papers (2024-03-19T02:57:07Z) - DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation [53.62727171363384]
We introduce a novel reasoning principle: Dynamic Reflection with Divergent Thinking.
Our methodology centers on dynamic reflection, a process that emulates human learning through probing, critiquing, and reflecting.
We evaluate our approach on three datasets using six pre-trained LLMs.
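A minimal sketch of a probe-critique-reflect loop of the kind described; the prompts and the canned `call_llm` stub are illustrative, not the paper's exact procedure:

```python
# Hedged sketch: iterative self-critique for sequential recommendation.
def call_llm(prompt: str) -> str:                      # canned stub
    if "Critique" in prompt:
        return "The draft ignores that the user's recent items are all thrillers."
    return "Candidate recommendation: a thriller the user has not seen."

def recommend_with_reflection(history: list[str], rounds: int = 2) -> str:
    draft = call_llm("User sequence: " + ", ".join(history) +
                     "\nPropose a next-item recommendation with reasoning.")
    for _ in range(rounds):                            # reflect and revise
        critique = call_llm(f"Critique this recommendation draft:\n{draft}")
        draft = call_llm("User sequence: " + ", ".join(history) +
                         f"\nRevise the draft given this critique:\n{critique}")
    return draft

print(recommend_with_reflection(["Se7en", "Gone Girl", "Prisoners"]))
```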
arXiv Detail & Related papers (2023-12-18T16:41:22Z) - Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
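A hedged sketch of query-dependent prompt selection from offline data: fit a proxy reward model on logged (query, prompt, outcome) triples, then pick the highest-scoring prompt for each new query without calling the LLM at selection time. The nearest-neighbour reward and toy features are stand-ins for the learned reward model:

```python
# Hedged sketch: offline proxy reward for per-query prompt selection.
def features(query: str, prompt: str) -> tuple:
    return (len(query.split()), "step by step" in prompt, "digit" in query)

# offline demonstration data: did this prompt solve this query?
LOGS = [
    ("add the digits 23 and 45", "Think step by step.", 1),
    ("add the digits 23 and 45", "Answer directly.", 0),
    ("what day is it", "Answer directly.", 1),
]

def reward(query: str, prompt: str) -> float:
    """1-nearest-neighbour proxy reward learned from the offline logs."""
    f = features(query, prompt)
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    _, outcome = min((dist(f, features(q, p)), o) for q, p, o in LOGS)
    return float(outcome)

PROMPTS = ["Think step by step.", "Answer directly."]
query = "add the digits 17 and 64"
print(max(PROMPTS, key=lambda p: reward(query, p)))   # query-dependent choice
```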
arXiv Detail & Related papers (2023-09-13T01:12:52Z) - Towards User Guided Actionable Recourse [5.669106489320257]
Actionable Recourse (AR) describes recommendations of cost-efficient changes to a user's actionable features to help them obtain favorable outcomes.
We propose a gradient-based approach to identify User Preferred Actionable Recourse (UP-AR).
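A minimal sketch of gradient-based, preference-weighted recourse in this spirit: nudge a user's actionable features toward a favourable decision while damping changes to features the user prefers not to move. The linear model and cost weights are illustrative assumptions:

```python
# Hedged sketch: preference-weighted gradient steps toward a favourable outcome.
import math

W, B = [1.5, 2.0], -3.0                    # toy credit-approval model
PREF_COST = [1.0, 4.0]                     # user: feature 1 is hard to change

def score(x):
    return 1 / (1 + math.exp(-(sum(w * v for w, v in zip(W, x)) + B)))

def recourse(x, lr=0.05, steps=200):
    x = list(x)
    for _ in range(steps):
        if score(x) >= 0.5:                # favourable outcome reached
            break
        p = score(x)
        for i in range(len(x)):
            # gradient of the log-odds wrt x[i] is W[i]; damp by preference cost
            x[i] += lr * W[i] * (1 - p) / PREF_COST[i]
    return x

x0 = [0.2, 0.3]
print("before:", round(score(x0), 3),
      "after:", [round(v, 2) for v in recourse(x0)])
```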
arXiv Detail & Related papers (2023-09-05T18:06:09Z) - Reward Shaping for User Satisfaction in a REINFORCE Recommender [24.65853598093849]
We propose a policy network and a satisfaction imputation network.
The imputation network learns which actions are satisfying to the user, while the policy network, built on top of REINFORCE, decides which items to recommend.
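A toy sketch of that two-network split: an imputation model supplies a satisfaction score as the reward, and a softmax policy is updated with the standard REINFORCE gradient. All numbers are illustrative:

```python
# Hedged sketch: REINFORCE policy trained on imputed satisfaction rewards.
import math, random

theta = [0.0, 0.0, 0.0]                    # policy logits over 3 items

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

def imputed_satisfaction(item: int) -> float:
    # stand-in for the satisfaction imputation network's output
    return [0.1, 0.9, 0.3][item]

for step in range(500):                    # REINFORCE loop
    probs = softmax(theta)
    a = random.choices(range(3), weights=probs)[0]   # sample a recommendation
    r = imputed_satisfaction(a)                       # imputed reward
    for i in range(3):   # grad of log pi(a) wrt theta[i] is 1[i=a] - probs[i]
        theta[i] += 0.1 * r * ((1.0 if i == a else 0.0) - probs[i])

print([round(p, 2) for p in softmax(theta)])   # mass typically shifts to item 1
```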
arXiv Detail & Related papers (2022-09-30T01:29:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.