Modeling User Satisfaction Dynamics in Dialogue via Hawkes Process
- URL: http://arxiv.org/abs/2305.12594v1
- Date: Sun, 21 May 2023 23:04:14 GMT
- Title: Modeling User Satisfaction Dynamics in Dialogue via Hawkes Process
- Authors: Fanghua Ye, Zhiyuan Hu, Emine Yilmaz
- Abstract summary: We propose a new estimator, ASAP, that treats user satisfaction across turns as an event sequence and employs a Hawkes process to model the dynamics in this sequence.
Experimental results on four benchmark dialogue datasets demonstrate that ASAP can substantially outperform state-of-the-art baseline estimators.
- Score: 17.477718698071424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialogue systems have received increasing attention while automatically
evaluating their performance remains challenging. User satisfaction estimation
(USE) has been proposed as an alternative. It assumes that the performance of a
dialogue system can be measured by user satisfaction and uses an estimator to
simulate users. The effectiveness of USE depends heavily on the estimator.
Existing estimators independently predict user satisfaction at each turn and
ignore satisfaction dynamics across turns within a dialogue. In order to fully
simulate users, it is crucial to take satisfaction dynamics into account. To
fill this gap, we propose a new estimator ASAP (sAtisfaction eStimation via
HAwkes Process) that treats user satisfaction across turns as an event sequence
and employs a Hawkes process to effectively model the dynamics in this
sequence. Experimental results on four benchmark dialogue datasets demonstrate
that ASAP can substantially outperform state-of-the-art baseline estimators.
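For readers unfamiliar with Hawkes processes, the following is a minimal sketch of the classical univariate Hawkes conditional intensity with an exponential kernel, lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)). It is a textbook illustration only, not the ASAP architecture described in the paper. The function name `hawkes_intensity`, the parameter values, and the example event times are all assumptions made for illustration.

```python
import math


def hawkes_intensity(t, event_times, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a univariate Hawkes process (exponential kernel).

    mu    : baseline intensity (assumed value)
    alpha : jump in intensity contributed by each past event (assumed value)
    beta  : exponential decay rate of that contribution (assumed value)
    """
    # Each event strictly before t adds an excitation that decays over time.
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t
    )
    return mu + excitation


# Hypothetical positions (e.g. turn indices) where a satisfaction event occurred.
events = [1.0, 2.0]

for t in [0.5, 1.5, 2.5, 6.0]:
    print(f"t={t}: intensity={hawkes_intensity(t, events):.4f}")
```

Before any event the intensity equals the baseline `mu`; each event raises it, and the effect fades at rate `beta`. This self-exciting behavior is what lets earlier events in a sequence influence the likelihood of later ones.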
Related papers
- Simulating User Agents for Embodied Conversational-AI [9.402740034754455]
We build a large language model (LLM)-based user agent that can simulate user behavior during interactions with an embodied agent.
We evaluate our user agent's ability to generate human-like behaviors by comparing its simulated dialogues with the TEACh dataset.
arXiv Detail & Related papers (2024-10-31T00:56:08Z)
- CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems [60.27663010453209]
We leverage large language models (LLMs) to generate satisfaction-aware counterfactual dialogues.
We gather human annotations to ensure the reliability of the generated samples.
Our results shed light on the need for data augmentation approaches for user satisfaction estimation in TOD systems.
arXiv Detail & Related papers (2024-03-27T23:45:31Z)
- Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems [2.788542465279969]
This paper introduces DAUS, a Domain-Aware User Simulator.
We fine-tune DAUS on real examples of task-oriented dialogues.
Results on two relevant benchmarks showcase significant improvements in terms of user goal fulfillment.
arXiv Detail & Related papers (2024-02-20T20:57:47Z)
- Enhancing Large Language Model Induced Task-Oriented Dialogue Systems Through Look-Forward Motivated Goals [76.69419538047813]
The ProToD approach anticipates future dialogue actions and incorporates a goal-oriented reward signal to enhance ToD systems.
We present a novel evaluation method that assesses ToD systems based on goal-driven dialogue simulations.
Empirical experiments conducted on the MultiWoZ 2.1 dataset demonstrate that our model can achieve superior performance using only 10% of the data.
arXiv Detail & Related papers (2023-09-16T10:56:00Z)
- Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO) that combines an LLM with a smaller task-oriented dialogue (TOD) model.
This approach uses the LLM as an annotation-free user simulator to assess dialogue responses, and pairs it with smaller fine-tuned end-to-end TOD models.
Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z)
- Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems.
We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues.
Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z)
- User Satisfaction Estimation with Sequential Dialogue Act Modeling in Goal-oriented Conversational Systems [65.88679683468143]
We propose a novel framework, namely USDA, to incorporate the sequential dynamics of dialogue acts for predicting user satisfaction.
USDA incorporates the sequential transitions of both content and act features in the dialogue to predict the user satisfaction.
Experimental results on four benchmark goal-oriented dialogue datasets show that the proposed method substantially and consistently outperforms existing methods on USE.
arXiv Detail & Related papers (2022-02-07T02:50:07Z)
- Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA only requires a handful of pre-collected experience data, and therefore does not involve human interaction with the target policy during the evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
arXiv Detail & Related papers (2021-02-20T03:29:20Z)
- Optimizing Interactive Systems via Data-Driven Objectives [70.3578528542663]
We propose an approach that infers the objective directly from observed user interactions.
These inferences can be made regardless of prior knowledge and across different types of user behavior.
We introduce Interactive System Optimizer (ISO), a novel algorithm that uses these inferred objectives for optimization.
arXiv Detail & Related papers (2020-06-19T20:49:14Z)
- Improving Interaction Quality Estimation with BiLSTMs and the Impact on Dialogue Policy Learning [0.6538911223040175]
We propose a novel reward based on user satisfaction estimation.
We show that it outperforms all previous estimators while learning temporal dependencies implicitly.
We show that applying this model results in higher estimated satisfaction, similar task success rates and a higher robustness to noise.
arXiv Detail & Related papers (2020-01-21T15:39:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.