Turn-taking annotation for quantitative and qualitative analyses of conversation
- URL: http://arxiv.org/abs/2504.09980v1
- Date: Mon, 14 Apr 2025 08:45:04 GMT
- Title: Turn-taking annotation for quantitative and qualitative analyses of conversation
- Authors: Anneliese Kelterer, Barbara Schuppler
- Abstract summary: Turn-taking was annotated on two layers, Inter-Pausal Units (IPU) and points of potential completion (PCOMP; similar to transition relevance places). A detailed analysis of inter-rater agreement and common confusions shows that agreement for IPU annotation is near-perfect. The system can be applied to a variety of conversational data for linguistic studies and technological applications.
- Score: 5.425050980601873
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper has two goals. First, we present the turn-taking annotation layers created for 95 minutes of conversational speech of the Graz Corpus of Read and Spontaneous Speech (GRASS), available to the scientific community. Second, we describe the annotation system and the annotation process in more detail, so other researchers may use it for their own conversational data. The annotation system was developed with an interdisciplinary application in mind: it should be based on sequential criteria according to Conversation Analysis; it should be suitable for subsequent phonetic analysis, which is why time-aligned annotations were made in Praat; and it should be suitable for automatic classification, which required continuous annotation of speech and a label inventory that is not too large and yields high inter-rater agreement. Turn-taking was annotated on two layers, Inter-Pausal Units (IPU) and points of potential completion (PCOMP; similar to transition relevance places). We provide a detailed description of the annotation process and of segmentation and labelling criteria. A detailed analysis of inter-rater agreement and common confusions shows that agreement for IPU annotation is near-perfect, that agreement for PCOMP annotations is substantial, and that disagreements are often either partial or can be explained by a different analysis of a sequence which also has merit. The annotation system can be applied to a variety of conversational data for linguistic studies and technological applications, and we hope that the annotations, as well as the annotation system, will contribute to a stronger cross-fertilization between these disciplines.
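The abstract describes IPU agreement as near-perfect and PCOMP agreement as substantial, wording that matches the conventional interpretation bands for chance-corrected agreement coefficients such as Cohen's kappa. As a minimal sketch of how such a coefficient is computed for two annotators who labelled the same candidate points, assuming a purely illustrative label inventory (not the actual GRASS PCOMP labels, and not necessarily the paper's exact metric):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' label sequences."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: share of points that received identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical labels from two annotators at the same six candidate completion
# points; the label names are illustrative, not the GRASS PCOMP inventory.
annotator_1 = ["change", "hold", "change", "backchannel", "hold", "change"]
annotator_2 = ["change", "hold", "hold",   "backchannel", "hold", "change"]

print(f"Cohen's kappa: {cohen_kappa(annotator_1, annotator_2):.2f}")  # ~0.74
```

On the toy data above the coefficient comes out around 0.74, which the Landis and Koch scale would label substantial agreement.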
Related papers
- MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines [11.037522635949939]
We introduce MockConf, a student interpreting dataset that was collected from Mock Conferences run as part of the students' curriculum. This dataset contains 7 hours of recordings in 5 European languages, transcribed and aligned at the level of spans and words. We further implement and release InterAlign, a modern web-based annotation tool for parallel word and span annotations on long inputs, suitable for aligning simultaneous interpreting.
arXiv Detail & Related papers (2025-06-05T10:16:15Z) - Automated Essay Scoring Incorporating Annotations from Automated Feedback Systems [0.0]
We integrate two types of feedback-driven annotations: those that identify spelling and grammatical errors, and those that highlight argumentative components. To illustrate how this method could be applied in real-world scenarios, we employ two LLMs to generate annotations.
arXiv Detail & Related papers (2025-05-28T18:39:56Z) - Aspect-based Meeting Transcript Summarization: A Two-Stage Approach with Weak Supervision on Sentence Classification [91.13086984529706]
Aspect-based meeting transcript summarization aims to produce multiple summaries, each focusing on one aspect of the meeting.
Traditional summarization methods produce one summary mixing information of all aspects.
We propose a two-stage method for aspect-based meeting transcript summarization.
arXiv Detail & Related papers (2023-11-07T19:06:31Z) - How Good is Automatic Segmentation as a Multimodal Discourse Annotation Aid? [3.3861948721202233]
We assess the quality of different utterance segmentation techniques as an aid in annotating Collaborative Problem Solving.
We show that the oracle utterances have minimal correspondence to automatically segmented speech, and that automatically segmented speech using different segmentation methods is also inconsistent.
arXiv Detail & Related papers (2023-05-27T03:06:15Z) - SNaC: Coherence Error Detection for Narrative Summarization [73.48220043216087]
We introduce SNaC, a narrative coherence evaluation framework rooted in fine-grained annotations for long summaries.
We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie screenplay summaries.
Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowd annotators.
arXiv Detail & Related papers (2022-05-19T16:01:47Z) - Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents [0.5249805590164901]
We introduce a proof-of-concept system for annotating sentences "laterally".
The approach is based on the observation that sentences that are similar in meaning often have the same label in terms of a particular type system; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2021-12-21T19:27:21Z) - Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Voluntary studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z) - Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders [59.038157066874255]
We propose a novel framework called RankAE to perform chat summarization without employing manually labeled data.
RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously.
A denoising auto-encoder is designed to generate succinct but context-informative summaries based on the selected utterances.
arXiv Detail & Related papers (2020-12-14T07:31:17Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-08-03T13:05:42Z) - Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning [56.18809963342249]
We present a probabilistic approach that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations.
We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF 2019 data and the pairwise judgment annotations required for our method.
arXiv Detail & Related papers (2020-08-03T13:05:42Z)
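Returning to the lateral-annotation idea flagged above (the Sentence Embeddings and High-speed Similarity Search entry): below is a minimal sketch of propagating a label from the most similar already-annotated sentence. The encoder choice, sentences, and label set are illustrative assumptions and are not taken from the cited paper.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # example encoder, not necessarily the cited system's

# Toy labelled pool and one unlabelled sentence; sentences, labels and the
# type system are invented for illustration only.
labelled = [
    ("The lessor shall keep the premises in good repair.", "Obligation"),
    ("The tenant may terminate this agreement with 30 days' notice.", "Permission"),
]
new_sentence = "The landlord must maintain the property in a habitable condition."

model = SentenceTransformer("all-MiniLM-L6-v2")
pool_vecs = model.encode([s for s, _ in labelled], normalize_embeddings=True)
query_vec = model.encode([new_sentence], normalize_embeddings=True)[0]

# With normalized embeddings, cosine similarity is a plain dot product.
similarities = pool_vecs @ query_vec
best = int(np.argmax(similarities))
print(f"Suggested label: {labelled[best][1]} (similarity {similarities[best]:.2f})")
```

In a real annotation tool the similarity search would presumably run over a large pre-indexed pool of previously labelled sentences rather than a two-sentence list.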
This list is automatically generated from the titles and abstracts of the papers on this site.