Dude, where's my utterance? Evaluating the effects of automatic segmentation and transcription on CPS detection
- URL: http://arxiv.org/abs/2507.04454v1
- Date: Sun, 06 Jul 2025 16:25:18 GMT
- Title: Dude, where's my utterance? Evaluating the effects of automatic segmentation and transcription on CPS detection
- Authors: Videep Venkatesha, Mariah Bradford, Nathaniel Blanchard
- Abstract summary: Collaborative Problem-Solving markers capture key aspects of effective teamwork. An AI system that reliably detects these markers could help teachers identify when a group is struggling or demonstrating productive collaboration. We evaluate how CPS detection is impacted by automating two critical components: transcription and speech segmentation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaborative Problem-Solving (CPS) markers capture key aspects of effective teamwork, such as staying on task, avoiding interruptions, and generating constructive ideas. An AI system that reliably detects these markers could help teachers identify when a group is struggling or demonstrating productive collaboration. Such a system requires an automated pipeline composed of multiple components. In this work, we evaluate how CPS detection is impacted by automating two critical components: transcription and speech segmentation. On the public Weights Task Dataset (WTD), we find CPS detection performance with automated transcription and segmentation methods is comparable to human-segmented and manually transcribed data; however, we find that the automated segmentation methods reduce the number of utterances by 26.5%, impacting the granularity of the data. We discuss the implications for developing AI-driven tools that support collaborative learning in classrooms.
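The abstract notes that automatic segmentation reduced the utterance count by 26.5% relative to human segmentation. A minimal sketch of how this can happen, using hypothetical segment timestamps (not the authors' pipeline or data): automatic segmenters often merge utterances separated by pauses shorter than a silence threshold, collapsing several oracle utterances into one.

```python
def merge_segments(segments, max_pause=0.5):
    """Merge (start, end) speech segments whose inter-segment gap
    is shorter than max_pause seconds, as a pause-based automatic
    segmenter would."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < max_pause:
            # Gap is below the silence threshold: extend the previous segment.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

# Hypothetical oracle (human) segmentation: four utterances, two pairs of
# which are separated by pauses shorter than the 0.5 s threshold.
oracle = [(0.0, 1.2), (1.5, 2.8), (4.0, 5.0), (5.2, 6.1)]
auto = merge_segments(oracle)
print(len(oracle), len(auto))  # 4 oracle utterances collapse to 2
```

The threshold `max_pause` and the timestamps are illustrative assumptions; the point is only that coarser segmentation yields fewer, longer utterances, which changes the granularity at which CPS markers can be assigned.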
Related papers
- Leadership Assessment in Pediatric Intensive Care Unit Team Training [12.775569777482566]
This paper addresses the task of assessing PICU team's leadership skills by developing an automated analysis framework based on egocentric vision. We identify key behavioral cues, including fixation object, eye contact, and conversation patterns, as essential indicators of leadership assessment.
arXiv Detail & Related papers (2025-05-30T09:19:33Z)
- Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents [65.36060818857109]
We present a novel framework for extracting and evaluating dialog workflows from historical interactions. Our extraction process consists of two key stages: (1) a retrieval step to select relevant conversations based on key procedural elements, and (2) a structured workflow generation process using question-answer-based chain-of-thought (QA-CoT) prompting.
arXiv Detail & Related papers (2025-02-24T16:55:15Z)
- Transparent and Coherent Procedural Mistake Detection [27.40806437649092]
Procedural mistake detection (PMD) is a challenging problem of classifying whether a human user has successfully executed a task (specified by a procedural text). We extend PMD to require generating visual self-dialog rationales to inform decisions. Given the impressive, mature image-understanding capabilities observed in recent vision-and-language models (VLMs), we curate a suitable benchmark dataset for PMD based on individual frames. As our reformulation enables unprecedented transparency, we leverage a natural language inference (NLI) model to formulate two automated metrics for the coherence of generated rationales.
arXiv Detail & Related papers (2024-12-16T16:13:55Z)
- Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) and out-of-distribution (OOD) classes. Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes. We propose a simple yet effective method, termed the Collaborative Feature-Logits Detector (CFL-Detector).
arXiv Detail & Related papers (2024-11-20T02:57:35Z)
- Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation [50.60733773088296]
We conduct a comprehensive human evaluation of the results of several shared tasks from the last International Workshop on Spoken Language Translation (IWSLT 2023).
We propose an effective evaluation strategy based on automatic resegmentation and direct assessment with segment context.
Our analysis revealed that: 1) the proposed evaluation strategy is robust and its scores are well correlated with other types of human judgements; 2) automatic metrics are usually, but not always, well correlated with direct assessment scores; and 3) COMET is a slightly stronger automatic metric than chrF.
arXiv Detail & Related papers (2024-06-06T09:18:42Z)
- A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images [0.0]
We propose a method that alleviates the need for manual labeling of laboratory rats.
The final system is capable of instance segmentation, keypoint detection, and body part segmentation even when the objects are heavily occluded.
arXiv Detail & Related papers (2024-05-07T20:11:07Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- How Good is Automatic Segmentation as a Multimodal Discourse Annotation Aid? [3.3861948721202233]
We assess the quality of different utterance segmentation techniques as an aid in annotating Collaborative Problem Solving.
We show that the oracle utterances have minimal correspondence to automatically segmented speech, and that automatically segmented speech using different segmentation methods is also inconsistent.
arXiv Detail & Related papers (2023-05-27T03:06:15Z)
- Adaptive Multi-scale Online Likelihood Network for AI-assisted Interactive Segmentation [3.3909100561725127]
Existing interactive segmentation methods leverage automatic segmentation and user interactions for label refinement.
We propose an adaptive multi-scale online likelihood network (MONet) that adaptively learns in a data-efficient online setting.
Our approach achieved a 5.86% higher Dice score with a 24.67% lower perceived NASA-TLX workload score than the state-of-the-art.
arXiv Detail & Related papers (2023-03-23T22:20:56Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
- Transfer Learning for Autonomous Chatter Detection in Machining [0.9281671380673306]
Large-amplitude chatter vibrations are one of the most important phenomena in machining processes.
Three challenges can be identified in applying machine learning for chatter detection at large in industry.
These three challenges can be grouped under the umbrella of transfer learning.
arXiv Detail & Related papers (2022-04-11T20:46:06Z)
- TraSeTR: Track-to-Segment Transformer with Contrastive Query for Instance-level Instrument Segmentation in Robotic Surgery [60.439434751619736]
We propose TraSeTR, a Track-to-Segment Transformer that exploits tracking cues to assist surgical instrument segmentation.
TraSeTR jointly reasons about the instrument type, location, and identity with instance-level predictions.
The effectiveness of our method is demonstrated with state-of-the-art instrument type segmentation results on three public datasets.
arXiv Detail & Related papers (2022-02-17T05:52:18Z)
- Dual Adversarial Auto-Encoders for Clustering [152.84443014554745]
We propose Dual Adversarial Auto-encoder (Dual-AAE) for unsupervised clustering.
By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of Auto-encoders.
Experiments on four benchmarks show that Dual-AAE achieves superior performance over state-of-the-art clustering methods.
arXiv Detail & Related papers (2020-08-23T13:16:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.