Forgetting-aware Linear Bias for Attentive Knowledge Tracing
- URL: http://arxiv.org/abs/2309.14796v1
- Date: Tue, 26 Sep 2023 09:48:30 GMT
- Title: Forgetting-aware Linear Bias for Attentive Knowledge Tracing
- Authors: Yoonjin Im, Eunseong Choi, Heejin Kook, Jongwuk Lee
- Abstract summary: This paper proposes Forgetting-aware Linear Bias (FoLiBi) to reflect forgetting behavior as a linear bias.
FoLiBi plugged with several KT models yields a consistent improvement of up to 2.58% in AUC over state-of-the-art KT models on four benchmark datasets.
- Score: 7.87348193562399
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Knowledge Tracing (KT) aims to track proficiency based on a question-solving
history, allowing us to offer a streamlined curriculum. Recent studies actively
utilize attention-based mechanisms to capture the correlation between questions
and combine it with the learner's characteristics for responses. However, our
empirical study shows that existing attention-based KT models neglect the
learner's forgetting behavior, especially as the interaction history becomes
longer. This problem arises from the bias that overprioritizes the correlation
of questions while inadvertently ignoring the impact of forgetting behavior.
This paper proposes a simple-yet-effective solution, namely Forgetting-aware
Linear Bias (FoLiBi), to reflect forgetting behavior as a linear bias. Despite
its simplicity, FoLiBi is readily equipped with existing attentive KT models by
effectively decomposing question correlations with forgetting behavior. FoLiBi
plugged with several KT models yields a consistent improvement of up to 2.58%
in AUC over state-of-the-art KT models on four benchmark datasets.
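The abstract describes FoLiBi as a linear bias that injects forgetting into attention, but it does not state the formula. The sketch below is therefore only an illustration of the general idea, assuming an ALiBi-style penalty that grows linearly with the distance between the current interaction and each past interaction. The function name `linear_bias_attention`, the `slope` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def linear_bias_attention(scores, slope):
    """Causal softmax attention weights with a linear distance penalty.

    scores[i][j] is the raw query-key score between position i and an
    earlier position j (j <= i). The penalty slope * (i - j) is an assumed
    ALiBi-style form, not the formula from the paper.
    """
    weights = []
    for i, row in enumerate(scores):
        # Only positions j <= i are visible (causal mask).
        biased = [row[j] - slope * (i - j) for j in range(i + 1)]
        peak = max(biased)  # subtract the max for numerical stability
        exps = [math.exp(b - peak) for b in biased]
        total = sum(exps)
        # Future positions receive zero weight.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


# Hypothetical example: identical raw scores, so only the bias shapes the weights.
uniform = [[0.0] * 4 for _ in range(4)]
no_bias = linear_bias_attention(uniform, slope=0.0)
with_bias = linear_bias_attention(uniform, slope=0.5)
print(no_bias[3])
print(with_bias[3])
```

With `slope=0.0` the last position attends equally to all four interactions. With a positive slope, older interactions receive progressively less weight regardless of how strongly the questions correlate, which is the kind of separation between question correlation and forgetting that the abstract refers to.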
Related papers
- SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model [64.92472567841105]
Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question.
This paper introduces a Structure-aware Inductive Knowledge Tracing model with a large language model (dubbed SINKT).
SINKT predicts the student's response to the target question by interacting with the student's knowledge state and the question representation.
arXiv Detail & Related papers (2024-07-01T12:44:52Z)
- Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals [91.59906995214209]
We propose a new evaluation method, the Counterfactual Attentiveness Test (CAT).
CAT uses counterfactuals by replacing part of the input with its counterpart from a different example, expecting an attentive model to change its prediction.
We show that GPT3 becomes less attentive with an increased number of demonstrations, while its accuracy on the test data improves.
arXiv Detail & Related papers (2023-11-16T06:27:35Z)
- FACTS: First Amplify Correlations and Then Slice to Discover Bias [17.244153084361102]
Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes.
Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold.
We propose First Amplify Correlations and Then Slice to Discover Bias to inform downstream bias mitigation strategies.
arXiv Detail & Related papers (2023-09-29T17:41:26Z)
- Do We Fully Understand Students' Knowledge States? Identifying and Mitigating Answer Bias in Knowledge Tracing [12.31363929361146]
Knowledge tracing aims to monitor students' evolving knowledge states through their learning interactions with concept-related questions.
There is a common phenomenon of answer bias, i.e., a highly unbalanced distribution of correct and incorrect answers for each question.
Existing models tend to memorize the answer bias as a shortcut for achieving high prediction performance in KT.
arXiv Detail & Related papers (2023-08-15T13:56:29Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Continual Learning in the Presence of Spurious Correlation [23.999136417157597]
We show that standard continual learning algorithms can transfer biases from one task to another, both forward and backward.
We propose a plug-in method for debiasing-aware continual learning, dubbed Group-class Balanced Greedy Sampling (BGS).
arXiv Detail & Related papers (2023-03-21T14:06:12Z)
- ACP++: Action Co-occurrence Priors for Human-Object Interaction Detection [102.9428507180728]
A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples.
We observe that there exist natural correlations and anti-correlations among human-object interactions.
We present techniques to learn these priors and leverage them for more effective training, especially on rare classes.
arXiv Detail & Related papers (2021-09-09T06:02:50Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- GIKT: A Graph-based Interaction Model for Knowledge Tracing [36.07642261246016]
We propose a Graph-based Interaction model for Knowledge Tracing (GIKT) to tackle the above problems.
More specifically, GIKT utilizes a graph convolutional network (GCN) to substantially incorporate question-skill correlations.
Experiments on three datasets demonstrate that GIKT achieves the new state-of-the-art performance, with at least 1% absolute AUC improvement.
arXiv Detail & Related papers (2020-09-13T12:50:32Z)
- Context-Aware Attentive Knowledge Tracing [21.397976659857793]
We propose attentive knowledge tracing (AKT), which couples flexible attention-based neural network models with a series of novel, interpretable model components.
AKT uses a novel monotonic attention mechanism that relates a learner's future responses to assessment questions to their past responses.
We show that AKT outperforms existing KT methods (by up to 6% in AUC in some cases) on predicting future learner responses.
arXiv Detail & Related papers (2020-07-24T02:45:43Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)