Related papers: Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?

URL: http://arxiv.org/abs/2406.11065v2
Date: Sat, 28 Sep 2024 05:50:26 GMT
Title: Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?
Authors: Guan-Ting Lin, Hung-yi Lee
Abstract summary: Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue. This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis. We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
Score: 64.72966061510375
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue. While Large Language Models (LLMs) have revolutionized natural language processing, their ability to understand emphasis in dialogue remains unclear. This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis. We evaluate various LLMs, both open-source and commercial, to measure their performance in understanding emphasis. Additionally, we propose an automatic evaluation pipeline using GPT-4, which achieves a high correlation with human rating. Our findings reveal that although commercial LLMs generally perform better, there is still significant room for improvement in comprehending emphasized sentences.

Related papers

The Thin Line Between Comprehension and Persuasion in LLMs [0.0]
Large language models (LLMs) are excellent at maintaining high-level, convincing dialogues. We measure how this capability relates to their understanding of what is being talked about. We find that LLMs are capable of maintaining coherent, persuasive debates, often swaying the beliefs of participants and audiences alike.
arXiv Detail & Related papers (2025-07-02T17:46:56Z)
Evaluating Large language models on Understanding Korean indirect Speech acts [0.6757476692230009]
This study evaluates whether current LLMs can understand the intention of an utterance by considering the given conversational context. Proprietary models exhibited relatively higher performance compared to open-source models. Most LLMs, except for Claude3-Opus, demonstrated significantly lower performance in understanding indirect speech acts.
arXiv Detail & Related papers (2025-02-16T04:59:19Z)
Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data [33.85748258158527]
Empathetic dialogue is crucial for natural human-computer interaction. Large language models (LLMs) have revolutionized dialogue generation by harnessing their powerful capabilities. We propose a novel approach that circumvents the need for question-answering data.
arXiv Detail & Related papers (2025-01-19T04:10:53Z)
Frozen Large Language Models Can Perceive Paralinguistic Aspects of Speech [29.847183061204436]
Large language models (LLMs) can take into account users' emotions or speaking styles when providing their responses. In this work, we utilize an end-to-end system with a speech encoder. We find that this training framework allows the encoder to generate tokens that capture both semantic and paralinguistic information in speech.
arXiv Detail & Related papers (2024-10-02T01:32:47Z)
Context Does Matter: Implications for Crowdsourced Evaluation Labels in Task-Oriented Dialogue Systems [57.16442740983528]
Crowdsourced labels play a crucial role in evaluating task-oriented dialogue systems. Previous studies suggest using only a portion of the dialogue context in the annotation process. This study investigates the influence of dialogue context on annotation quality.
arXiv Detail & Related papers (2024-04-15T17:56:39Z)
Exploring the Factual Consistency in Dialogue Comprehension of Large Language Models [51.75805497456226]
This work focuses on the factual consistency issue with the help of the dialogue summarization task. Our evaluation shows that, on average, 26.8% of the summaries generated by LLMs contain factual inconsistency. To stimulate and enhance the dialogue comprehension ability of LLMs, we propose a fine-tuning paradigm with auto-constructed multi-task data.
arXiv Detail & Related papers (2023-11-13T09:32:12Z)
BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting. We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance. We find GPT-4 can generate human-style multi-turn dialogues with impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response. We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English. Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features. To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives. Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.