Assessment Modeling: Fundamental Pre-training Tasks for Interactive
Educational Systems
- URL: http://arxiv.org/abs/2002.05505v6
- Date: Mon, 28 Jun 2021 05:00:25 GMT
- Title: Assessment Modeling: Fundamental Pre-training Tasks for Interactive
Educational Systems
- Authors: Youngduck Choi, Youngnam Lee, Junghyun Cho, Jineon Baek, Dongmin Shin,
Hangyeol Yu, Yugeun Shim, Seewoo Lee, Jonghun Shin, Chan Bae, Byungsoo Kim,
Jaewe Heo
- Abstract summary: A common way of circumventing label-scarce problems is pre-training a model to learn representations of the contents of learning items.
We propose Assessment Modeling, a class of fundamental pre-training tasks for general interactive educational systems.
- Score: 3.269851859258154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Like many other domains in Artificial Intelligence (AI), there are specific
tasks in the field of AI in Education (AIEd) for which labels are scarce and
expensive, such as predicting exam score or review correctness. A common way of
circumventing label-scarce problems is pre-training a model to learn
representations of the contents of learning items. However, such methods fail
to utilize the full range of student interaction data available and do not
model student learning behavior. To this end, we propose Assessment Modeling, a
class of fundamental pre-training tasks for general interactive educational
systems. An assessment is a feature of student-system interactions which can
serve as a pedagogical evaluation. Examples include the correctness and
timeliness of a student's answer. Assessment Modeling is the prediction of
assessments conditioned on the surrounding context of interactions. Although it
is natural to pre-train on interactive features available in large amounts,
limiting the prediction targets to assessments focuses the tasks' relevance to
the label-scarce educational problems and reduces less-relevant noise. While
the effectiveness of different combinations of assessments is open for
exploration, we suggest Assessment Modeling as a first-order guiding principle
for selecting proper pre-training tasks for label-scarce educational problems.
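As a concrete illustration of the idea described above, the following is a minimal sketch of an Assessment Modeling-style pre-training objective: mask the assessment (here, answer correctness) of some interactions and predict it from the surrounding interaction context. It assumes a Transformer encoder over interaction sequences; the class names, dimensions, vocabulary layout, and masking scheme are illustrative assumptions, not the paper's reference implementation.

```python
# Hypothetical sketch of masked-assessment pre-training (not the paper's code).
import torch
import torch.nn as nn

class AssessmentEncoder(nn.Module):
    def __init__(self, num_items=10000, d_model=128, n_heads=4, n_layers=2, max_len=100):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d_model)
        # correctness token: 0 = incorrect, 1 = correct, 2 = [MASK]
        self.assess_emb = nn.Embedding(3, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # logit of P(correct) per position

    def forward(self, item_ids, assess_tokens):
        # item_ids, assess_tokens: (batch, seq_len)
        pos = torch.arange(item_ids.size(1), device=item_ids.device)
        x = self.item_emb(item_ids) + self.assess_emb(assess_tokens) + self.pos_emb(pos)
        h = self.encoder(x)              # contextualized interaction states
        return self.head(h).squeeze(-1)  # per-position logits

def mask_assessments(correctness, mask_prob=0.15):
    """Replace a random subset of correctness labels with the [MASK] token (id 2)."""
    mask = torch.rand_like(correctness, dtype=torch.float) < mask_prob
    tokens = correctness.clone()
    tokens[mask] = 2
    return tokens, mask

# Toy pre-training step on synthetic interaction logs.
model = AssessmentEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
item_ids = torch.randint(0, 10000, (8, 100))   # which exercise was attempted
correctness = torch.randint(0, 2, (8, 100))    # assessment: answer correctness
tokens, mask = mask_assessments(correctness)
logits = model(item_ids, tokens)
loss = nn.functional.binary_cross_entropy_with_logits(
    logits[mask], correctness[mask].float())   # loss only on masked assessments
loss.backward()
optimizer.step()
```

Under this sketch, the pre-trained encoder weights would then be fine-tuned on a label-scarce downstream task such as exam score prediction, which is the usage pattern the abstract motivates.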
Related papers
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling [31.696222064667243]
Action Quality Assessment (AQA) is a task that tries to answer how well an action is carried out.
Existing works on AQA assume that all the training data are visible for training at one time, but do not enable continual learning.
We propose a unified model to learn AQA tasks sequentially without forgetting.
arXiv Detail & Related papers (2023-09-29T10:06:28Z)
- MoMA: Momentum Contrastive Learning with Multi-head Attention-based Knowledge Distillation for Histopathology Image Analysis [5.396167537615578]
A lack of quality data is a common issue when it comes to a specific task in computational pathology.
We propose to exploit knowledge distillation, i.e., utilize the existing model to learn a new, target model.
We employ a student-teacher framework to learn a target model from a pre-trained, teacher model without direct access to source data.
arXiv Detail & Related papers (2023-08-31T08:54:59Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automatize the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA only requires a handful of pre-collected experience data, and therefore does not involve human interaction with the target policy during the evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
arXiv Detail & Related papers (2021-02-20T03:29:20Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- A framework for predicting, interpreting, and improving Learning Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student.
ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z)
- Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms [50.19997675066203]
We build an end-to-end neural framework that automatically detects questions from teachers' audio recordings.
By incorporating multi-task learning techniques, we are able to strengthen the understanding of semantic relations among different types of questions.
arXiv Detail & Related papers (2020-05-16T02:17:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.