MonaCoBERT: Monotonic attention based ConvBERT for Knowledge Tracing
- URL: http://arxiv.org/abs/2208.12615v2
- Date: Wed, 7 Sep 2022 09:46:21 GMT
- Authors: Unggi Lee, Yonghyun Park, Yujin Kim, Seongyune Choi, Hyeoncheol Kim
- Abstract summary: Knowledge tracing is a field of study that predicts the future performance of students based on prior performance datasets.
MonaCoBERT achieves the best performance on most benchmark datasets and has significant interpretability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge tracing (KT) is a field of study that predicts the future
performance of students based on prior performance datasets collected from
educational applications such as intelligent tutoring systems, learning
management systems, and online courses. Some previous studies on KT have
concentrated only on model interpretability, whereas others have focused on
enhancing performance. Models that address both interpretability and
performance have been insufficient. Moreover, performance-focused models have
not shown decisive gains over existing models. In this study, we
propose MonaCoBERT, which achieves the best performance on most benchmark
datasets and has significant interpretability. MonaCoBERT uses a BERT-based
architecture with monotonic convolutional multihead attention, which reflects
the forgetting behavior of students and increases the representation power of
the model. We further increase performance and interpretability using a
classical test-theory-based (CTT-based) embedding strategy that considers the
difficulty of each question. To determine why MonaCoBERT achieved the best
performance and interpret the results quantitatively, we conducted ablation
studies and additional analyses using Grad-CAM, UMAP, and various visualization
techniques. The analysis results demonstrate that the two attention components
complement one another and that the CTT-based embedding captures both global
and local difficulty information. We also show that our model captures the
relationships between concepts.
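The monotonic attention component can be read as a distance-based penalty on attention scores, so that older interactions contribute less, mimicking forgetting. Below is a minimal PyTorch sketch of that idea, assuming an AKT-style exponential decay exp(-theta * |i - j|) over position distance; the decay form, the single-head simplification, and all names are illustrative assumptions, not MonaCoBERT's exact formulation (which also combines convolution with attention as in ConvBERT).

```python
# Hedged sketch: monotonic (distance-decayed) self-attention.
# Assumes an exponential decay over position distance; MonaCoBERT's actual
# layer also mixes in ConvBERT-style convolutional heads, omitted here.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Learnable decay rate; softplus keeps it positive so attention to
        # distant (older) interactions always shrinks, mimicking forgetting.
        self.theta = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        # |i - j| distance between query and key positions.
        pos = torch.arange(x.size(1), device=x.device)
        dist = (pos[:, None] - pos[None, :]).abs().float()
        # Subtracting theta * |i - j| before softmax is equivalent to
        # multiplying the attention weights by exp(-theta * |i - j|).
        scores = scores - F.softplus(self.theta) * dist
        return F.softmax(scores, dim=-1) @ v

x = torch.randn(2, 10, 64)             # 2 students, 10 interactions each
out = MonotonicSelfAttention(64)(x)    # -> (2, 10, 64)
```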
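The CTT-based embedding strategy likewise lends itself to a small sketch: classical test theory estimates a question's difficulty from its observed correct-answer rate, which can be binned and embedded alongside the question and concept embeddings. The bin count, function names, and the way the embedding is added to the encoder input below are illustrative assumptions, not the paper's specification.

```python
# Hedged sketch: CTT-style difficulty embedding derived from response logs.
import torch
import torch.nn as nn

def ctt_difficulty_ids(q_ids: torch.Tensor, correct: torch.Tensor,
                       n_bins: int = 100) -> torch.Tensor:
    """Map each question to a difficulty bin from its observed correct rate."""
    n_q = int(q_ids.max()) + 1
    totals = torch.zeros(n_q).scatter_add_(0, q_ids, torch.ones_like(correct))
    rights = torch.zeros(n_q).scatter_add_(0, q_ids, correct)
    correct_rate = rights / totals.clamp(min=1)      # CTT difficulty proxy
    return (correct_rate * (n_bins - 1)).long()      # one bin id per question

q_ids = torch.tensor([0, 1, 0, 2, 1, 0])             # question id per interaction
correct = torch.tensor([1., 0., 1., 1., 1., 0.])     # response per interaction
diff_ids = ctt_difficulty_ids(q_ids, correct)        # (n_questions,)
diff_emb = nn.Embedding(100, 64)                     # difficulty bin -> vector
# One plausible use: add the difficulty vector to each interaction's
# question/concept embedding before the BERT-based encoder.
interaction_diff = diff_emb(diff_ids[q_ids])         # (n_interactions, 64)
```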
Related papers
- A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning [75.68193159293425]
In-context learning (ICL) allows transformer-based language models to learn a specific task with a few "task demonstrations" without updating their parameters.
We propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL.
We experimentally prove the wide applicability of DETAIL by showing our attribution scores obtained on white-box models are transferable to black-box models in improving model performance.
arXiv Detail & Related papers (2024-05-22T15:52:52Z)
- Enhancing Fairness and Performance in Machine Learning Models: A Multi-Task Learning Approach with Monte-Carlo Dropout and Pareto Optimality [1.5498930424110338]
This study introduces an approach to mitigate bias in machine learning by leveraging model uncertainty.
Our approach utilizes a multi-task learning (MTL) framework combined with Monte Carlo (MC) Dropout to assess and mitigate uncertainty in predictions related to protected labels.
arXiv Detail & Related papers (2024-04-12T04:17:50Z)
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Feeding What You Need by Understanding What You Learned [54.400455868448695]
Machine Reading Comprehension (MRC) tests the ability to understand a given text passage and answer questions based on it.
Existing research in MRC relies heavily on large models and corpora to improve performance as evaluated by metrics such as Exact Match.
We argue that a deep understanding of model capabilities and data properties can help us feed a model with appropriate training data.
arXiv Detail & Related papers (2022-03-05T14:15:59Z)
- Model Embedding Model-Based Reinforcement Learning [4.566180616886624]
Model-based reinforcement learning (MBRL) has shown its advantages in sample efficiency over model-free reinforcement learning (MFRL).
Despite the impressive results it achieves, it still faces a trade-off between the ease of data generation and model bias.
We propose a simple and elegant model-embedding model-based reinforcement learning (MEMB) algorithm within the framework of probabilistic reinforcement learning.
arXiv Detail & Related papers (2020-06-16T15:10:28Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.