Enhancing conversational quality in language learning chatbots: An
evaluation of GPT4 for ASR error correction
- URL: http://arxiv.org/abs/2307.09744v1
- Date: Wed, 19 Jul 2023 04:25:21 GMT
- Title: Enhancing conversational quality in language learning chatbots: An
evaluation of GPT4 for ASR error correction
- Authors: Long Mai and Julie Carson-Berndsen
- Abstract summary: This paper explores the use of GPT4 for ASR error correction in conversational settings.
We find that transcriptions corrected by GPT4 lead to higher conversation quality, despite an increase in WER.
- Score: 20.465220855548292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of natural language processing (NLP) technologies into
educational applications has shown promising results, particularly in the
language learning domain. Recently, many spoken open-domain chatbots have been
used as speaking partners, helping language learners improve their language
skills. However, one of the significant challenges is the high word-error-rate
(WER) when recognizing non-native/non-fluent speech, which interrupts
conversation flow and leads to disappointment for learners. This paper explores
the use of GPT4 for ASR error correction in conversational settings. In
addition to WER, we propose to use semantic textual similarity (STS) and next
response sensibility (NRS) metrics to evaluate the impact of error correction
models on the quality of the conversation. We find that transcriptions
corrected by GPT4 lead to higher conversation quality, despite an increase in
WER. GPT4 also outperforms standard error correction methods without the need
for in-domain training data.
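The abstract's central trade-off (higher conversation quality despite increased WER) hinges on how WER is computed. As a point of reference, WER is the word-level Levenshtein distance between the reference transcript and the ASR hypothesis, normalized by reference length. A minimal sketch (not the paper's evaluation code; the paper may use a library implementation):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,       # deletion
                d[i][j - 1] + 1,       # insertion
                d[i - 1][j - 1] + cost # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)
```

A correction that paraphrases (e.g. replacing a disfluent phrase with a fluent but differently worded one) raises this count even when the meaning is preserved, which is why the paper supplements WER with semantic similarity (STS) and next-response-sensibility (NRS) metrics.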
Related papers
- Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition [110.8431434620642]
We introduce the generative speech transcription error correction (GenSEC) challenge.
This challenge comprises three post-ASR language modeling tasks: (i) post-ASR transcription correction, (ii) speaker tagging, and (iii) emotion recognition.
We discuss insights from baseline evaluations, as well as lessons learned for designing future evaluations.
arXiv Detail & Related papers (2024-09-15T16:32:49Z)
- Keyword-Aware ASR Error Augmentation for Robust Dialogue State Tracking [17.96115263146684]
We introduce a simple yet effective data augmentation method to improve robustness of Dialogue State Tracking model.
Our method generates sufficient error patterns on keywords, leading to improved accuracy in noised and low-accuracy ASR environments.
arXiv Detail & Related papers (2024-09-10T07:06:40Z)
- Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z)
- Generative error correction for code-switching speech recognition using large language models [49.06203730433107]
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence.
We propose to leverage large language models (LLMs) and lists of hypotheses generated by an ASR to address the CS problem.
arXiv Detail & Related papers (2023-10-17T14:49:48Z)
- ChatGPT-4 as a Tool for Reviewing Academic Books in Spanish [1.0052074659955383]
ChatGPT-4 is an artificial intelligence language model developed by OpenAI.
This study evaluates the potential of ChatGPT-4 as an editing tool for Spanish literary and academic books.
arXiv Detail & Related papers (2023-09-20T11:44:45Z)
- Does Correction Remain A Problem For Large Language Models? [63.24433996856764]
This paper investigates the role of correction in the context of large language models by conducting two experiments.
The first experiment focuses on correction as a standalone task, employing few-shot learning techniques with GPT-like models for error correction.
The second experiment explores the notion of correction as a preparatory task for other NLP tasks, examining whether large language models can tolerate and perform adequately on texts containing certain levels of noise or errors.
arXiv Detail & Related papers (2023-08-03T14:09:31Z)
- DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction [50.51901599433536]
DisfluencyFixer is a tool that performs speech-to-speech disfluency correction in English and Hindi.
Our proposed system removes disfluencies from input speech and returns fluent speech as output.
arXiv Detail & Related papers (2023-05-26T14:13:38Z)
- Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation [41.94480044074273]
ChatGPT is a large-scale language model based on the advanced GPT-3.5 architecture.
We design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT.
Our evaluation involves assessing ChatGPT's performance on five official test sets in three different languages, along with three document-level GEC test sets in English.
arXiv Detail & Related papers (2023-04-04T12:33:40Z)
- Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition [12.354292498112347]
We present further improvements over our previous work by using domain adversarial learning to train task models.
Our proposed technique leads to reductions in Word Error Rates (WER) in monolingual and code-switched test sets across three language pairs.
arXiv Detail & Related papers (2020-06-09T13:45:30Z)
- On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data.
Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.