Utilizing Natural Language Processing for Automated Assessment of
Classroom Discussion
- URL: http://arxiv.org/abs/2306.14918v1
- Date: Wed, 21 Jun 2023 16:45:24 GMT
- Title: Utilizing Natural Language Processing for Automated Assessment of
Classroom Discussion
- Authors: Nhat Tran, Benjamin Pierce, Diane Litman, Richard Correnti and Lindsay
Clare Matsumura
- Abstract summary: In this work, we experimented with various modern natural language processing (NLP) techniques to automatically generate rubric scores for individual dimensions of classroom text discussion quality.
Despite the limited amount of data, our work shows encouraging results in some of the rubrics while suggesting that there is room for improvement in the others.
- Score: 0.7087237546722617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rigorous and interactive class discussions that support students to engage in
high-level thinking and reasoning are essential to learning and are a central
component of most teaching interventions. However, formally assessing
discussion quality 'at scale' is expensive and infeasible for most researchers.
In this work, we experimented with various modern natural language processing
(NLP) techniques to automatically generate rubric scores for individual
dimensions of classroom text discussion quality. Specifically, we worked on a
dataset of 90 classroom discussion transcripts consisting of over 18000 turns
annotated with fine-grained Analyzing Teaching Moves (ATM) codes and focused on
four Instructional Quality Assessment (IQA) rubrics. Despite the limited amount
of data, our work shows encouraging results in some of the rubrics while
suggesting that there is room for improvement in the others. We also found that
certain NLP approaches work better for certain rubrics.
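As a rough illustration of the kind of pipeline the abstract describes, the sketch below fine-tunes a pretrained transformer to predict a single IQA rubric score from a discussion transcript. The backbone model, the 1-4 score range, and the toy data are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch: predicting one IQA rubric score (treated as 4-class
# classification) from a classroom discussion transcript. The model name,
# the 1-4 score range, and the toy data are assumptions for illustration.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-uncased"   # assumed backbone
NUM_SCORES = 4                     # assumed rubric scale of 1-4

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_SCORES)

class TranscriptDataset(torch.utils.data.Dataset):
    """Wraps (transcript_text, rubric_score) pairs for the Trainer."""
    def __init__(self, texts, scores):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             max_length=512, return_tensors="pt")
        self.labels = torch.tensor([s - 1 for s in scores])  # 1-4 -> 0-3
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        return {k: v[i] for k, v in self.enc.items()} | {"labels": self.labels[i]}

# Toy placeholder data; real transcripts are far longer and would be
# truncated or chunked turn by turn.
train_ds = TranscriptDataset(
    ["Teacher: Why does the author repeat that image? Student: ...",
     "Teacher: Open your books to page 12. Student: Okay."],
    [4, 1])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="iqa_rubric_model",
                           num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=train_ds)
trainer.train()
```

With only 90 transcripts, heavy regularization or simpler feature-based baselines may well be competitive with fine-tuning, which is consistent with the paper's observation that different NLP approaches suit different rubrics.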
Related papers
- Automated Assessment of Encouragement and Warmth in Classrooms Leveraging Multimodal Emotional Features and ChatGPT [7.273857543125784]
Our work explores a multimodal approach to automatically estimating encouragement and warmth in classrooms.
We employed facial and speech emotion recognition with sentiment analysis to extract interpretable features from video, audio, and transcript data.
We demonstrated our approach on the GTI dataset, comprising 367 16-minute video segments from 92 authentic lesson recordings.
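A hedged sketch of the general idea, assuming per-segment emotion and sentiment features have already been extracted: fuse them and fit a simple regressor for a warmth rating. The feature names and ratings below are invented and are not taken from the GTI dataset.

```python
# Rough sketch only: fuse precomputed per-segment emotion and sentiment
# features into a simple regressor for a warmth score. Feature names and
# the regression targets are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Each row: [mean_facial_joy, mean_facial_anger, speech_valence,
#            speech_arousal, transcript_sentiment]
X = np.array([[0.62, 0.05, 0.70, 0.40, 0.55],
              [0.21, 0.30, 0.35, 0.60, -0.10],
              [0.48, 0.10, 0.58, 0.45, 0.30],
              [0.15, 0.45, 0.20, 0.70, -0.40]])
y = np.array([3.5, 1.5, 3.0, 1.0])   # hypothetical warmth ratings per segment

model = RandomForestRegressor(n_estimators=100, random_state=0)
print(cross_val_score(model, X, y, cv=2, scoring="neg_mean_absolute_error"))
```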
arXiv Detail & Related papers (2024-04-01T16:58:09Z)
- Teamwork Dimensions Classification Using BERT [0.8566457170664924]
An automated natural language processing approach was developed to identify teamwork dimensions of students' online team chat.
Developments in the field of natural language processing and artificial intelligence have resulted in advanced deep transfer learning approaches.
This model will contribute towards an enhanced learning analytics tool for teamwork assessment and feedback.
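A minimal sketch of one way to approach this kind of classification, assuming mean-pooled BERT embeddings as features and a linear classifier on top; the teamwork-dimension labels and chat messages below are made up and may not match the paper's label set.

```python
# Illustrative sketch: BERT sentence embeddings as features plus a linear
# classifier for labeling chat messages with teamwork dimensions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Mean-pooled BERT embeddings for a list of chat messages."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)        # (B, T, 1)
    return ((out * mask).sum(1) / mask.sum(1)).numpy()

messages = ["I can draft the intro section tonight",
            "Great idea, let's go with your plan",
            "When is the final report due?"]
labels = ["coordination", "encouragement", "monitoring"]  # hypothetical dimensions

clf = LogisticRegression(max_iter=1000).fit(embed(messages), labels)
print(clf.predict(embed(["Nice work everyone, keep it up!"])))
```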
arXiv Detail & Related papers (2023-12-09T07:18:41Z)
- Learning and Evaluating Human Preferences for Conversational Head Generation [101.89332968344102]
We propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.
PS can serve as a quantitative evaluation without the need for human annotation.
arXiv Detail & Related papers (2023-07-20T07:04:16Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Deep Learning for Opinion Mining and Topic Classification of Course Reviews [0.0]
We collected and pre-processed a large number of course reviews publicly available online.
We applied machine learning techniques with the goal to gain insight into student sentiments and topics.
For sentiment polarity, the top model was RoBERTa with 95.5% accuracy and 84.7% F1-macro, while for topic classification, an SVM was the top with 79.8% accuracy and 80.6% F1-macro.
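As a small illustration of the topic-classification side, a TF-IDF plus linear-SVM pipeline is sketched below; the reviews and topic labels are invented, and the reported 79.8% accuracy comes from the paper's own, much larger dataset.

```python
# Minimal sketch of SVM topic classification for course reviews:
# TF-IDF features plus a linear SVM. Topics and reviews are hypothetical.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

reviews = ["The lectures were engaging and well paced",
           "Too many assignments and the grading felt unfair",
           "Course content was outdated compared to industry practice"]
topics = ["instructor", "workload", "content"]   # hypothetical topic labels

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(reviews, topics)
print(clf.predict(["The professor explained concepts clearly"]))
```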
arXiv Detail & Related papers (2023-04-06T21:48:29Z)
- The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines [63.86406909879314]
This paper describes the Conversational Short-phrases Speaker Diarization (CSSD) task.
It consists of training and testing datasets, evaluation metric and baselines.
In the metric aspect, we design the new conversational DER (CDER) evaluation metric, which calculates the SD accuracy at the utterance level.
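A toy illustration of scoring diarization at the utterance level, in the same spirit: each utterance counts equally regardless of its duration. This is not the paper's exact CDER definition, which also handles alignment between reference and hypothesis segments.

```python
# Toy utterance-level diarization error: an utterance is correct only if its
# predicted speaker matches the reference speaker, irrespective of duration.
def utterance_level_error(reference, hypothesis):
    """reference/hypothesis: lists of speaker labels, one per utterance."""
    assert len(reference) == len(hypothesis)
    wrong = sum(r != h for r, h in zip(reference, hypothesis))
    return wrong / len(reference)

ref = ["A", "B", "A", "B", "B"]   # ground-truth speaker per utterance
hyp = ["A", "B", "B", "B", "B"]   # system output
print(utterance_level_error(ref, hyp))   # 0.2: one short utterance missed
```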
arXiv Detail & Related papers (2022-08-17T03:26:23Z)
- Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling.
We take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed them as additional speaker-specific context to our network to score a particular response.
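A hedged sketch of the speaker-conditioning idea, not the authors' architecture: encodings of the candidate's other responses are pooled into a context vector and concatenated with the target response before scoring. Dimensions, pooling, and the scorer head are assumptions.

```python
# Sketch of speaker-conditioned scoring: pool the candidate's other responses
# into a context vector and concatenate it with the target response encoding.
import torch
import torch.nn as nn

class SpeakerConditionedScorer(nn.Module):
    def __init__(self, emb_dim=256):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(emb_dim * 2, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, target_emb, other_embs):
        # target_emb: (B, D) encoding of the response being scored
        # other_embs: (B, N, D) encodings of the same candidate's other responses
        speaker_ctx = other_embs.mean(dim=1)          # simple mean pooling
        fused = torch.cat([target_emb, speaker_ctx], dim=-1)
        return self.scorer(fused).squeeze(-1)         # predicted proficiency score

model = SpeakerConditionedScorer()
score = model(torch.randn(4, 256), torch.randn(4, 5, 256))
print(score.shape)   # torch.Size([4])
```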
arXiv Detail & Related papers (2021-08-30T07:00:28Z)
- Using Machine Learning and Natural Language Processing Techniques to Analyze and Support Moderation of Student Book Discussions [0.0]
The IMapBook project aims at improving the literacy and reading comprehension skills of elementary school-aged children by presenting them with interactive e-books and letting them take part in moderated book discussions.
This study aims to develop and illustrate a machine learning-based approach to message classification that could be used to automatically notify the discussion moderator of a possible need for an intervention and also to collect other useful information about the ongoing discussion.
arXiv Detail & Related papers (2020-11-23T20:33:09Z)
- Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents.
With the increased generative capacity of corpus-based conversational agents comes the need to classify and filter out malevolent responses.
Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
arXiv Detail & Related papers (2020-08-21T22:43:27Z)
- Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms [50.19997675066203]
We build an end-to-end neural framework that automatically detects questions from teachers' audio recordings.
By incorporating multi-task learning techniques, we are able to strengthen the understanding of semantic relations among different types of questions.
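An illustrative multi-task sketch, under the assumption of a shared utterance encoder with two heads (question detection and question-type classification) trained with a weighted sum of losses; the feature dimensions and label sets are invented rather than taken from the paper.

```python
# Illustrative multi-task setup: shared encoder, one head for whether a
# teacher utterance is a question, another for the question type.
import torch
import torch.nn as nn

class MultiTaskQuestionModel(nn.Module):
    def __init__(self, feat_dim=128, num_question_types=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())
        self.is_question_head = nn.Linear(256, 2)
        self.type_head = nn.Linear(256, num_question_types)

    def forward(self, x):
        h = self.encoder(x)
        return self.is_question_head(h), self.type_head(h)

model = MultiTaskQuestionModel()
ce = nn.CrossEntropyLoss()
x = torch.randn(8, 128)                   # e.g. acoustic/ASR-derived features
is_q = torch.randint(0, 2, (8,))          # question / not a question
q_type = torch.randint(0, 4, (8,))        # hypothetical question types

logits_q, logits_t = model(x)
loss = ce(logits_q, is_q) + 0.5 * ce(logits_t, q_type)   # weighted joint loss
loss.backward()
print(float(loss))
```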
arXiv Detail & Related papers (2020-05-16T02:17:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.