Dialogue Response Prefetching Based on Semantic Similarity and Prediction Confidence of Language Model
- URL: http://arxiv.org/abs/2508.04403v1
- Date: Wed, 06 Aug 2025 12:45:09 GMT
- Title: Dialogue Response Prefetching Based on Semantic Similarity and Prediction Confidence of Language Model
- Authors: Kiyotada Mori, Seiya Kawano, Angel Fernando Garcia Contreras, Koichiro Yoshino
- Abstract summary: It is necessary to predict complete user utterances before the end of the user's speech, typically by language models, to prepare prefetched dialogue responses. We propose a prediction confidence model (PCM) that determines whether prefetching is possible or not.
- Score: 2.8521211969963898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prefetching of dialogue responses has been investigated to reduce user-perceived latency (UPL), which refers to the user's waiting time before receiving the system's response, in spoken dialogue systems. To reduce the UPL, it is necessary to predict complete user utterances before the end of the user's speech, typically by language models, to prepare prefetched dialogue responses. In this study, we proposed a prediction confidence model (PCM) that determines whether prefetching is possible or not by estimating the semantic similarity between the predicted complete user utterance and the complete user utterance. We evaluated our PCM based on the differences between the predicted complete user utterance and the complete user utterance.
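As a rough illustration of the decision the abstract describes, the Python sketch below scores how close a predicted complete utterance is to the actual one and prefetches only when the score clears a threshold. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `cosine_similarity` and `should_prefetch`, the bag-of-words cosine measure (a crude stand-in for semantic similarity), and the threshold of 0.8. Per the abstract, the PCM estimates this similarity before the complete utterance is available; the sketch computes it directly only to show what such a score measures.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two utterances.

    Hypothetical stand-in for semantic similarity; a real system would
    more likely compare sentence embeddings.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # at least one utterance is empty
    return dot / (norm_a * norm_b)


def should_prefetch(estimated_similarity: float, threshold: float = 0.8) -> bool:
    """Prefetch only when the estimated similarity reaches the threshold.

    The threshold value is an assumption, not a value from the paper.
    """
    return estimated_similarity >= threshold


if __name__ == "__main__":
    predicted = "what is the weather in kyoto tomorrow"
    actual = "what is the weather in kyoto today"
    score = cosine_similarity(predicted, actual)
    print(f"similarity={score:.3f} prefetch={should_prefetch(score)}")
```

A higher threshold trades fewer prefetched responses for a lower risk of preparing a response to the wrong utterance.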
Related papers
- Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance.
We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information.
Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
- CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems [60.27663010453209]
We leverage large language models (LLMs) to generate satisfaction-aware counterfactual dialogues.
We gather human annotations to ensure the reliability of the generated samples.
Our results shed light on the need for data augmentation approaches for user satisfaction estimation in TOD systems.
arXiv Detail & Related papers (2024-03-27T23:45:31Z)
- Personalized Predictive ASR for Latency Reduction in Voice Assistants [29.237198363254752]
We introduce predictive automatic speech recognition, where we predict the full utterance from a partially observed utterance, and prefetch the response based on the predicted utterance.
We evaluate our methods on an internal voice assistant dataset as well as the public SLURP dataset.
arXiv Detail & Related papers (2023-05-23T08:05:43Z)
- EM Pre-training for Multi-party Dialogue Response Generation [86.25289241604199]
In multi-party dialogues, the addressee of a response utterance should be specified before it is generated.
We propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels.
arXiv Detail & Related papers (2023-05-21T09:22:41Z)
- Proactive Detractor Detection Framework Based on Message-Wise Sentiment Analysis Over Customer Support Interactions [60.87845704495664]
We propose a framework relying solely on chat-based customer support interactions for predicting the recommendation decision of individual users.
For our case study, we analyzed a total number of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America.
Our results show that, with respective feature interpretability, it is possible to predict the likelihood of a user to recommend a product or service, based solely on the message-wise sentiment evolution of their CS conversations in a fully automated way.
arXiv Detail & Related papers (2022-11-08T00:43:36Z)
- Turn-Taking Prediction for Natural Conversational Speech [40.189938418201656]
A common conversational utterance often involves multiple queries with turn-taking.
Disfluencies include pausing to think, hesitations, word lengthening, filled pauses and repeated phrases.
We present a turn-taking predictor built on top of the end-to-end (E2E) speech recognizer.
arXiv Detail & Related papers (2022-08-29T01:09:23Z)
- User Response and Sentiment Prediction for Automatic Dialogue Evaluation [69.11124655437902]
We propose to use the sentiment of the next user utterance for turn or dialog level evaluation.
Experiments show our model outperforming existing automatic evaluation metrics on both written and spoken open-domain dialogue datasets.
arXiv Detail & Related papers (2021-11-16T22:19:17Z)
- Improved Goal Oriented Dialogue via Utterance Generation and Look Ahead [5.062869359266078]
Intent prediction can be improved by training a deep text-to-text neural model to generate successive user utterances from unlabeled dialogue data.
We present a novel look-ahead approach that uses user utterance generation to improve intent prediction in time.
arXiv Detail & Related papers (2021-10-24T11:12:48Z)
- Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems [24.560203199376478]
We propose a predictive approach named Predict-then-Decide (PTD) to tackle this Wait-or-Answer problem.
We conduct experiments on two real-life scenarios and three public datasets.
arXiv Detail & Related papers (2020-05-27T01:48:54Z)
- A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation [52.743311026230714]
Persona Exploration and Exploitation (PEE) is able to extend the predefined user persona description with semantically correlated content.
PEE consists of two main modules: persona exploration and persona exploitation.
Our approach outperforms state-of-the-art baselines in terms of both automatic and human evaluations.
arXiv Detail & Related papers (2020-02-06T08:24:33Z)