SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
- URL: http://arxiv.org/abs/2411.12844v1
- Date: Tue, 19 Nov 2024 20:18:55 GMT
- Title: SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
- Authors: Stephanie M. Lukin, Claire Bonial, Matthew Marge, Taylor Hudson, Cory J. Hayes, Kimberly A. Pollard, Anthony Baker, Ashley N. Foots, Ron Artstein, Felix Gervits, Mitchell Abrams, Cassidy Henry, Lucia Donatelli, Anton Leuski, Susan G. Hill, David Traum, Clare R. Voss
- Abstract summary: We introduce the Situated Corpus Of Understanding Transactions (SCOUT).
It is a collection of human-robot dialogue in the task domain of collaborative exploration.
SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue.
- Score: 5.063252743855561
- Abstract: We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker's intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research in open questions of how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.
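The abstract describes a layered resource: utterances aligned with images and maps, plus AMR, Dialogue-AMR, and Transactional Unit annotations. As a rough illustration of how such a record might be consumed, here is a minimal Python sketch; the file layout and field names are hypothetical, not the released corpus schema.

```python
# Hypothetical layout for iterating a SCOUT-style corpus: one JSON file per
# dialogue, each utterance linked to aligned images, a map, and annotations.
# Field names are illustrative; consult the released corpus for the schema.
import json
from pathlib import Path

def load_dialogues(corpus_dir: str):
    """Yield one dialogue (a dict with an 'utterances' list) per JSON file."""
    for path in sorted(Path(corpus_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            yield json.load(f)

for dialogue in load_dialogues("scout_corpus/"):
    for utt in dialogue["utterances"]:
        print(utt["speaker"], utt["text"])
        print("  images:", utt.get("image_ids", []))   # aligned photos
        print("  map:", utt.get("map_id"))             # aligned map snapshot
        print("  dialogue-amr:", utt.get("dialogue_amr"))
```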
Related papers
- Human-Robot Dialogue Annotation for Multi-Modal Common Ground [4.665414514091581]
We describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners.
A particular challenge for establishing common ground arises in remote dialogue, where a human and robot are engaged in a joint navigation and exploration task in an unfamiliar environment, but where the robot cannot immediately share high-quality visual information due to communication constraints.
Within this paradigm, we capture the propositional semantics and illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation.
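To make the Dialogue-AMR idea concrete, the sketch below decodes an invented graph for an instruction like "Robot, move forward", where a speech-act frame wraps the content AMR to mark illocutionary force. The frame and role names are illustrative and may differ from the corpus conventions; the `penman` parsing library is assumed to be installed.

```python
# Illustrative Dialogue-AMR-style graph: a speech-act frame (here command-SA)
# wraps the content AMR to encode illocutionary force. Frame and role names
# are a sketch of the general shape, not the corpus specification.
import penman  # pip install penman

dialogue_amr = """
(c / command-SA
   :ARG0 (p / participant)
   :ARG2 (r / robot)
   :ARG1 (m / move-01
            :ARG1 r
            :direction (f / forward)))
"""

graph = penman.decode(dialogue_amr)
print(graph.top)                      # 'c': the speech-act node is the root
for source, role, target in graph.triples:
    print(source, role, target)
```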
arXiv Detail & Related papers (2024-11-19T19:33:54Z)
- J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling [43.87842102048749]
Spoken dialogue plays a crucial role in human-AI interactions, necessitating dialogue-oriented spoken language models (SLMs).
To ensure high-quality speech generation, the data must be spontaneous, like in-the-wild data, and must be acoustically clean, with noise removed.
This study addresses this gap by constructing and releasing a large-scale spoken dialogue corpus, named Japanese Corpus for Human-AI Talks (J-CHAT).
This paper presents a language-independent method for corpus construction and describes experiments on dialogue generation using SLMs trained on J-CHAT.
arXiv Detail & Related papers (2024-07-22T17:46:50Z)
- LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294]
We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction.
Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
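As a rough sketch of what goal-oriented, persona-based simulation can look like, the loop below alternates a persona-conditioned "human" generator with a chatbot generator. The `generate` stub stands in for any chat-LLM call, and the persona and goal strings are invented for illustration, not taken from the paper.

```python
# Sketch of persona-based human-chatbot simulation: two generators alternate,
# one conditioned on a persona and goal, the other acting as the chatbot.
def generate(system_prompt: str, history: list[dict]) -> str:
    # Placeholder: a real implementation would call a chat LLM here.
    return f"[reply conditioned on: {system_prompt[:40]}...]"

persona = "a cautious first-time online shopper on a strict budget"
goal = "get a laptop recommendation under $800 and ask about returns"
user_system = f"Role-play this persona: {persona}. Your goal: {goal}."
bot_system = "You are a helpful shopping assistant."

history: list[dict] = []
for _ in range(4):  # fixed turn budget for the sketch
    user_msg = generate(user_system, history)
    history.append({"role": "user", "content": user_msg})
    bot_msg = generate(bot_system, history)
    history.append({"role": "assistant", "content": bot_msg})
print(history)
```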
arXiv Detail & Related papers (2024-07-04T14:49:46Z)
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances, going beyond the sequential contextualization provided by pre-trained language models (PrLMs).
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially outperforms strong PrLM baselines on four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- DialogueBERT: A Self-Supervised Learning based Dialogue Pre-training Encoder [19.51263716065853]
We propose a novel contextual dialogue encoder (i.e. DialogueBERT) based on the popular pre-trained language model BERT.
Five self-supervised learning pre-training tasks are devised to learn the particularities of dialogue utterances.
DialogueBERT was pre-trained on 70 million real-world dialogues and then fine-tuned on three different downstream dialogue understanding tasks.
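The summary does not detail the five pretraining tasks; as a flavor of dialogue-specific self-supervision, the sketch below implements one common objective, whole-utterance masking, where the encoder must recover a masked turn. It is illustrative, not one of the paper's actual tasks.

```python
# One common dialogue self-supervised objective, sketched: replace a whole
# utterance with a mask token and train the encoder to recover (or select)
# it. Illustrative flavor only; the paper's five tasks may differ.
import random

MASK = "[UTT_MASK]"

def mask_one_utterance(dialogue: list[str]) -> tuple[list[str], int, str]:
    """Return (corrupted dialogue, masked index, reconstruction target)."""
    idx = random.randrange(len(dialogue))
    corrupted = dialogue[:idx] + [MASK] + dialogue[idx + 1:]
    return corrupted, idx, dialogue[idx]

dialogue = ["hi, i need help with my order",
            "sure, what is the order number?",
            "it's 1234, it never arrived"]
print(mask_one_utterance(dialogue))
```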
arXiv Detail & Related papers (2021-09-22T01:41:28Z)
- EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion in Task-Oriented Dialogue Systems [3.3010169113961325]
EmoWOZ is a large-scale manually emotion-annotated corpus of task-oriented dialogues.
It contains more than 11K dialogues with more than 83K emotion annotations of user utterances.
We propose a novel emotion labelling scheme, which is tailored to task-oriented dialogues.
arXiv Detail & Related papers (2021-09-10T15:00:01Z)
- Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model that first extracts the conversational structure of unstructured daily chats from different views to represent conversations.
Experiments on a large-scale dialogue summarization corpus demonstrate that our methods significantly outperform previous state-of-the-art models under both automatic evaluation and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)
- Structured Attention for Unsupervised Dialogue Structure Induction [110.12561786644122]
We propose to incorporate structured attention layers into a Variational Recurrent Neural Network (VRNN) model with discrete latent states to learn dialogue structure in an unsupervised fashion.
Compared to a vanilla VRNN, structured attention enables a model to focus on different parts of the source sentence embeddings while enforcing a structural inductive bias.
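A bare-bones sketch of the discrete-latent idea (without the structured attention layers themselves) follows: each turn embedding updates a recurrent dialogue context, a categorical latent state is sampled with Gumbel-softmax to keep training differentiable, and the induced state sequence is the dialogue structure. Dimensions and module choices are illustrative.

```python
# Minimal discrete-latent dialogue-state induction loop, VRNN-flavored.
# Structured attention over source embeddings is omitted; all modules and
# sizes here are illustrative stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_STATES, D = 10, 64  # number of latent dialogue states, hidden size

turn_encoder = nn.Linear(D, D)          # stand-in for an utterance encoder
posterior = nn.Linear(2 * D, N_STATES)  # q(z_t | turn_t, h_{t-1})
rnn_cell = nn.GRUCell(N_STATES + D, D)  # carries dialogue context

h = torch.zeros(1, D)
states = []
for turn_emb in torch.randn(5, 1, D):   # 5 turns of fake utterance embeddings
    enc = torch.tanh(turn_encoder(turn_emb))
    logits = posterior(torch.cat([enc, h], dim=-1))
    z = F.gumbel_softmax(logits, tau=1.0, hard=True)  # one-hot latent state
    h = rnn_cell(torch.cat([z, enc], dim=-1), h)
    states.append(int(z.argmax()))
print("induced state sequence:", states)
```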
arXiv Detail & Related papers (2020-09-17T23:07:03Z)
- Is this Dialogue Coherent? Learning from Dialogue Acts and Entities [82.44143808977209]
We create the Switchboard Coherence (SWBD-Coh) corpus, a dataset of human-human spoken dialogues annotated with turn coherence ratings.
Our statistical analysis of the corpus indicates how the perception of turn coherence is affected by patterns of entity distribution.
We find that models combining both DA and entity information yield the best performances both for response selection and turn coherence rating.
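To illustrate what combining DA and entity information can mean in practice, here is a toy classifier that one-hot encodes the dialogue-act transition and appends an entity-overlap score. The features, labels, and model are invented for illustration and are not the authors' setup.

```python
# Toy combination of dialogue-act (DA) and entity features for turn
# coherence rating. Feature design and classifier are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

DA_TAGS = ["statement", "question", "answer", "backchannel"]

def featurize(prev_da: str, cur_da: str, entity_overlap: float) -> np.ndarray:
    """One-hot the DA transition and append an entity-continuity score."""
    v = np.zeros(2 * len(DA_TAGS) + 1)
    v[DA_TAGS.index(prev_da)] = 1.0
    v[len(DA_TAGS) + DA_TAGS.index(cur_da)] = 1.0
    v[-1] = entity_overlap  # fraction of entities shared with prior turn
    return v

# Fabricated examples: (prev DA, current DA, entity overlap) -> coherent?
X = np.stack([featurize("question", "answer", 0.8),
              featurize("question", "backchannel", 0.0),
              featurize("statement", "statement", 0.5),
              featurize("answer", "question", 0.1)])
y = np.array([1, 0, 1, 0])
clf = LogisticRegression().fit(X, y)
print(clf.predict([featurize("question", "answer", 0.6)]))
```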
arXiv Detail & Related papers (2020-06-17T21:02:40Z)
- Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of diverse tasks, such as question answering, act classification, and dialogue summarization.
While dialogue corpora are abundantly available, labeled data, for specific learning tasks, can be highly scarce and expensive.
In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks.
arXiv Detail & Related papers (2020-02-27T04:36:52Z)