Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs
- URL: http://arxiv.org/abs/2601.09365v1
- Date: Wed, 14 Jan 2026 10:45:22 GMT
- Title: Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs
- Authors: Biswesh Mohapatra, Théo Charlot, Giovanni Duca, Mayank Palan, Laurent Romary, Justine Cassell,
- Abstract summary: Common ground plays a critical role in situated spoken dialogues, where interlocutors must maintain shared references to entities, events, and relations to sustain coherent interaction.<n>We evaluate a model's ability to establish and exploit common ground through relational references to entities within the shared context in a situational dialogue.
- Score: 2.730457204085116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common ground plays a critical role in situated spoken dialogues, where interlocutors must establish and maintain shared references to entities, events, and relations to sustain coherent interaction. For dialog systems, the ability to correctly ground conversational content in order to refer back to it later is particularly important. Prior studies have demonstrated that LLMs are capable of performing grounding acts such as requesting clarification or producing acknowledgments, yet relatively little work has investigated how common ground can be explicitly represented and stored for later use. Without such mechanisms, it remains unclear whether acknowledgment or clarification behaviors truly reflect a grounded understanding. In this work, we evaluate a model's ability to establish and exploit common ground through relational references to entities within the shared context in a situational dialogue. We test multiple methods for representing common ground in situated dialogues and further propose approaches to improve both the establishment of common ground and its subsequent use in the conversation.
Related papers
- Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs [21.001649494291712]
We study failures of grounding in the Ubuntu IRC dataset, where participants use text-only communication to resolve technical issues.<n>We find that disruptions in conversational flow often stem from a misalignment in common ground, driven by a divergence in beliefs and assumptions held by participants.
arXiv Detail & Related papers (2025-03-16T06:19:44Z) - Analysing Cross-Speaker Convergence in Face-to-Face Dialogue through the Lens of Automatically Detected Shared Linguistic Constructions [4.216085185442862]
This study applies a methodology for automatically detecting shared lemmatised constructions to a referential communication corpus.
Our analyses uncover the usage patterns of shared constructions in interaction and reveal that features such as their frequency and the amount of different constructions used for a referent are associated with the degree of object labelling convergence.
arXiv Detail & Related papers (2024-05-14T12:34:25Z) - Common Ground Tracking in Multimodal Dialogue [13.763043173931024]
We present a method for automatically identifying the current set of shared beliefs and questions under discussion'' (QUDs) of a group with a shared goal.
We annotate a dataset of multimodal interactions in a shared physical space with speech transcriptions, prosodic features, gestures, actions, and facets of collaboration.
We cascade into a set of formal closure rules derived from situated evidence and belief axioms and update operations.
arXiv Detail & Related papers (2024-03-26T00:25:01Z) - Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units [3.805394793605586]
We present the annotation of two dialog corpora employing Grounding Acts, Grounding Units, and a measure of their degree of grounding.
Our work aims to make conversations with machines better understood and more reliable in natural day-to-day collaborative dialogs.
arXiv Detail & Related papers (2024-03-25T10:39:18Z) - 'What are you referring to?' Evaluating the Ability of Multi-Modal
Dialogue Models to Process Clarificational Exchanges [65.03196674816772]
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee.
Addressees usually detect such ambiguities immediately and work with the speaker to repair it using meta-communicative, Clarification Exchanges (CE): a Clarification Request (CR) and a response.
Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models.
arXiv Detail & Related papers (2023-07-28T13:44:33Z) - MindDial: Belief Dynamics Tracking with Theory-of-Mind Modeling for Situated Neural Dialogue Generation [62.44907105496227]
MindDial is a novel conversational framework that can generate situated free-form responses with theory-of-mind modeling.
We introduce an explicit mind module that can track the speaker's belief and the speaker's prediction of the listener's belief.
Our framework is applied to both prompting and fine-tuning-based models, and is evaluated across scenarios involving both common ground alignment and negotiation.
arXiv Detail & Related papers (2023-06-27T07:24:32Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge [19.812562421377706]
We propose a theory-based evaluation method for investigating to what degree models pretrained on the VisDial dataset incrementally build representations that appropriately do scorekeeping.<n>Our conclusion is that the ability to make the distinction between shared and privately known statements along the dialogue is moderately present in the analysed models, but not always incrementally consistent.
arXiv Detail & Related papers (2022-04-14T13:52:11Z) - "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken
Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z) - Who Responded to Whom: The Joint Effects of Latent Topics and Discourse
in Conversation Structure [53.77234444565652]
We identify the responding relations in the conversation discourse, which link response utterances to their initiations.
We propose a model to learn latent topics and discourse in word distributions, and predict pairwise initiation-response links.
Experimental results on both English and Chinese conversations show that our model significantly outperforms the previous state of the arts.
arXiv Detail & Related papers (2021-04-17T17:46:00Z) - Dialogue-Based Relation Extraction [53.2896545819799]
We present the first human-annotated dialogue-based relation extraction (RE) dataset DialogRE.
We argue that speaker-related information plays a critical role in the proposed task, based on an analysis of similarities and differences between dialogue-based and traditional RE tasks.
Experimental results demonstrate that a speaker-aware extension on the best-performing model leads to gains in both the standard and conversational evaluation settings.
arXiv Detail & Related papers (2020-04-17T03:51:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.