Exploring Human-AI Complementarity in CPS Diagnosis Using Unimodal and Multimodal BERT Models
- URL: http://arxiv.org/abs/2507.14579v1
- Date: Sat, 19 Jul 2025 11:47:08 GMT
- Title: Exploring Human-AI Complementarity in CPS Diagnosis Using Unimodal and Multimodal BERT Models
- Authors: Kester Wong, Sahan Bulathwela, Mutlu Cukurova
- Abstract summary: This paper extends previous research by highlighting that the AudiBERT model improved the classification of classes that were sparse in the dataset. Similar significant class-wise improvements over the BERT model were not observed for classifications in the affective dimension. A correlation analysis highlighted that larger training data was significantly associated with higher recall performance for both the AudiBERT and BERT models.
- Score: 5.1126582076480505
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Detecting collaborative problem solving (CPS) indicators from dialogue using machine learning techniques is a significant challenge for the field of AI in Education. Recent studies have explored the use of Bidirectional Encoder Representations from Transformers (BERT) models on transcription data to reliably detect meaningful CPS indicators. A notable advancement involved the multimodal BERT variant, AudiBERT, which integrates speech and acoustic-prosodic audio features to enhance CPS diagnosis. Although initial results demonstrated multimodal improvements, the statistical significance of these enhancements remained unclear, and there was insufficient guidance on leveraging human-AI complementarity for CPS diagnosis tasks. This workshop paper extends the previous research by highlighting that the AudiBERT model not only improved the classification of classes that were sparse in the dataset, but it also had statistically significant class-wise improvements over the BERT model for classifications in the social-cognitive dimension. However, similar significant class-wise improvements over the BERT model were not observed for classifications in the affective dimension. A correlation analysis highlighted that larger training data was significantly associated with higher recall performance for both the AudiBERT and BERT models. Additionally, the precision of the BERT model was significantly associated with high inter-rater agreement among human coders. When employing the BERT model to diagnose indicators within these subskills that were well-detected by the AudiBERT model, the performance across all indicators was inconsistent. We conclude the paper by outlining a structured approach towards achieving human-AI complementarity for CPS diagnosis, highlighting the crucial inclusion of model explainability to support human agency and engagement in the reflective coding process.
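The correlation analysis mentioned in the abstract (larger training data associated with higher recall) can be sketched with a rank correlation. The paper does not report its raw numbers here, so the per-class training counts and recall values below are invented for illustration, and Spearman's rho is an assumed choice of statistic, implemented from scratch to stay dependency-free.

```python
def rank(values):
    """Return 1-based ranks, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))

# Invented data: number of training examples per class vs. class recall.
train_counts = [12, 45, 80, 150, 300, 520]
recalls = [0.21, 0.35, 0.48, 0.55, 0.71, 0.78]
rho = spearman(train_counts, recalls)
```

With this toy data recall rises monotonically with class size, so rho is 1.0; on real per-class results the correlation would be weaker but, per the paper, still significantly positive.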
Related papers
- Explainable Collaborative Problem Solving Diagnosis with BERT using SHAP and its Implications for Teacher Adoption [5.1126582076480505]
This study examines how different tokenised words in transcription data contributed to a BERT model's classification of CPS processes. The findings suggest that well-performing classifications did not equate to a reasonable explanation for the classification decisions. The analysis also identified a spurious word, which contributed positively to the classification but was not semantically meaningful to the class.
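SHAP attributes a classification to individual tokens by averaging contributions over feature coalitions; a much cruder leave-one-out occlusion over a toy scorer conveys the basic idea of token-level attribution. The `toy_score` function and its keyword weights below are entirely hypothetical stand-ins for a trained BERT classifier, not anything from the paper.

```python
def toy_score(tokens):
    # Hypothetical keyword-weight scorer standing in for a BERT classifier's
    # class logit; the words and weights are invented for illustration.
    weights = {"agree": 0.6, "share": 0.3, "idea": 0.2}
    return sum(weights.get(t, 0.0) for t in tokens)

def occlusion_attribution(tokens):
    """Attribute the score to each token by the drop caused by removing it."""
    base = toy_score(tokens)
    return {i: base - toy_score(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))}

tokens = ["i", "agree", "we", "share", "the", "idea"]
attr = occlusion_attribution(tokens)
```

A spurious token of the kind the study describes would show up here as a large attribution on a word with no semantic connection to the class.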
arXiv Detail & Related papers (2025-07-19T11:57:24Z)
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models [0.562479170374811]
Multimodal data and advanced models are argued to have the potential to detect complex CPS behaviours. We investigated the potential of multimodal data to improve model performance in diagnosing 78 secondary school students' CPS subskills and indicators.
arXiv Detail & Related papers (2025-04-21T13:25:55Z)
- CDS: Knowledge Component-Driven Data Synthesis Guided by Cognitive Diagnosis Theory [39.579188324839386]
Large Language Models (LLMs) have achieved significant advancements, but the increasing complexity of tasks and higher performance demands highlight the need for continuous improvement. Some approaches utilize synthetic data generated by advanced LLMs based on evaluation results to train models. In this paper, we introduce the Cognitive Diagnostic Synthesis (CDS) method, which incorporates a diagnostic process inspired by Cognitive Diagnosis Theory (CDT) to refine evaluation results and characterize model profiles at the knowledge component level.
arXiv Detail & Related papers (2025-01-13T20:13:59Z)
- Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners [10.088785685439134]
We propose D-BETA, a framework that pre-trains ECG and text data using a contrastive masked auto-encoder architecture. D-BETA uniquely combines the strengths of generative modelling with boosted discriminative capabilities to achieve robust cross-modal representations.
arXiv Detail & Related papers (2024-10-03T01:24:09Z)
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
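The contrastive objective described above can be illustrated with a generic supervised contrastive (InfoNCE-style) loss that pulls same-label examples together and pushes flipped-label counterfactuals apart. PairCFR's actual objective also includes a cross-entropy term and operates on learned BERT embeddings, so the 2-D vectors below are invented toy data.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def supervised_contrastive_loss(embs, labels, temp=0.1):
    """Average -log p(positive) where same-label pairs are positives."""
    total, count = 0.0, 0
    n = len(embs)
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = sum(math.exp(cosine(embs[i], embs[j]) / temp)
                    for j in range(n) if j != i)
        for j in pos:
            total -= math.log(math.exp(cosine(embs[i], embs[j]) / temp) / denom)
            count += 1
    return total / count

embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Labels aligned with the geometry give a low loss; a counterfactual pair
# sharing a label with its dissimilar partner gives a high loss.
clustered = supervised_contrastive_loss(embs, [0, 0, 1, 1])
mixed = supervised_contrastive_loss(embs, [0, 1, 0, 1])
```

The gap between `clustered` and `mixed` is the signal the contrastive term optimises: it rewards embeddings whose geometry matches the (flipped) labels.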
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Unmasking Dementia Detection by Masking Input Gradients: A JSM Approach to Model Interpretability and Precision [1.5501208213584152]
We introduce an interpretable, multimodal model for Alzheimer's disease (AD) classification over its multi-stage progression, incorporating Jacobian Saliency Map (JSM) as a modality-agnostic tool.
Our evaluation, including an ablation study, demonstrates the efficacy of JSM for model debugging and interpretation, while also significantly enhancing model accuracy.
arXiv Detail & Related papers (2024-02-25T06:53:35Z)
- Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning [5.438725298163702]
Contrastive Self-Supervised Learning (SSL) has shown promise in mitigating the issue of data scarcity. Our research aims to explore and evaluate a wide range of audio-based augmentations and uncover combinations that enhance SSL model performance in PCG classification.
arXiv Detail & Related papers (2023-12-01T11:06:00Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
Slice detection models (SDMs) automatically identify underperforming groups of datapoints.
This paper proposes a benchmark named "Discover, Explain, Improve (DEIM)" for classification NLP tasks.
Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
- Exploring linguistic feature and model combination for speech recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
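The neighbour-smoothing idea behind KNNS, borrowing label information from similar samples, can be sketched as simple averaging over the k nearest neighbours in feature space. The actual KNNS method in the paper is more involved, so treat this as a generic illustration with invented features and probability vectors.

```python
def knn_smooth(features, probs, k=2):
    """Replace each sample's probability vector with the average over itself
    and its k nearest neighbours (Euclidean distance in feature space).
    A generic sketch of KNN label smoothing, not the paper's exact KNNS."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

    n = len(features)
    num_classes = len(probs[0])
    smoothed = []
    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: dist(features[i], features[j]))[:k]
        group = [i] + nbrs
        smoothed.append([sum(probs[g][c] for g in group) / len(group)
                         for c in range(num_classes)])
    return smoothed

# Invented toy data: two nearby samples with conflicting predictions,
# plus one distant sample.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
probs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
smoothed = knn_smooth(features, probs, k=1)
```

The two nearby samples average each other's conflicting predictions toward uncertainty, which is the intended effect: noisy per-sample outputs are regularised by their neighbourhood.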
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection [17.99855227184379]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease (AD).
We compare and contrast the performance of two such approaches for AD detection on the recent ADReSS challenge dataset.
We observe that fine-tuned BERT models, given the relative importance of linguistics in cognitive impairment detection, outperform feature-based approaches on the AD detection task.
arXiv Detail & Related papers (2020-07-26T04:50:47Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem for the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.