Addressing Label Leakage in Knowledge Tracing Models
- URL: http://arxiv.org/abs/2403.15304v3
- Date: Mon, 07 Apr 2025 15:00:58 GMT
- Title: Addressing Label Leakage in Knowledge Tracing Models
- Authors: Yahya Badran, Christine Preisach,
- Abstract summary: We present methods to prevent label leakage in knowledge tracing (KT) models.<n>Our methods consistently outperform their original counterparts.<n> Notably, our methods are versatile and can be applied to a wide range of KT models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Tracing (KT) is concerned with predicting students' future performance on learning items in intelligent tutoring systems. Learning items are tagged with skill labels called knowledge concepts (KCs). Many KT models expand the sequence of item-student interactions into KC-student interactions by replacing learning items with their constituting KCs. This approach addresses the issue of sparse item-student interactions and minimises the number of model parameters. However, we identified a label leakage problem with this approach. The model's ability to learn correlations between KCs belonging to the same item can result in the leakage of ground truth labels, which leads to decreased performance, particularly on datasets with a high number of KCs per item. In this paper, we present methods to prevent label leakage in knowledge tracing (KT) models. Our model variants that utilize these methods consistently outperform their original counterparts. This further underscores the impact of label leakage on model performance. Additionally, these methods enhance the overall performance of KT models, with one model variant surpassing all tested baselines on different benchmarks. Notably, our methods are versatile and can be applied to a wide range of KT models.
Related papers
- Sparse Binary Representation Learning for Knowledge Tracing [0.0]
Knowledge tracing (KT) models aim to predict students' future performance based on their historical interactions.
Most existing KT models rely exclusively on human-defined knowledge concepts associated with exercises.
We propose a KT model, Sparse Binary Representation KT (SBRKT), that generates new KC labels, referred to as auxiliary KCs.
arXiv Detail & Related papers (2025-01-17T00:45:10Z) - Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection [75.02249869573994]
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.
Applying semi-supervised detectors in such settings can lead to misclassifying OOD class as ID classes.
We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector)
arXiv Detail & Related papers (2024-11-20T02:57:35Z) - Fighting Spurious Correlations in Text Classification via a Causal Learning Perspective [2.7813683000222653]
We propose the Causally Calibrated Robust ( CCR) to reduce models' reliance on spurious correlations.
CCR integrates a causal feature selection method based on counterfactual reasoning, along with an inverse propensity weighting (IPW) loss function.
We show that CCR state-of-the-art performance among methods without group labels, and in some cases, it can compete with the models that utilize group labels.
arXiv Detail & Related papers (2024-11-01T21:29:07Z) - Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing [59.480951050911436]
We present KCQRL, a framework for automated knowledge concept annotation and question representation learning.
We demonstrate the effectiveness of KCQRL across 15 KT algorithms on two large real-world Math learning datasets.
arXiv Detail & Related papers (2024-10-02T16:37:19Z) - Towards Robust Knowledge Tracing Models via k-Sparse Attention [33.02197868261949]
textscsparseKT is a simple yet effective framework to improve the robustness and generalization of the attention based DLKT approaches.
We show that our textscsparseKT is able to help attentional KT models get rid of irrelevant student interactions.
arXiv Detail & Related papers (2024-07-24T08:49:18Z) - Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC)
arXiv Detail & Related papers (2024-04-24T02:22:50Z) - Channel-Wise Contrastive Learning for Learning with Noisy Labels [60.46434734808148]
We introduce channel-wise contrastive learning (CWCL) to distinguish authentic label information from noise.
Unlike conventional instance-wise contrastive learning (IWCL), CWCL tends to yield more nuanced and resilient features aligned with the authentic labels.
Our strategy is twofold: firstly, using CWCL to extract pertinent features to identify cleanly labeled samples, and secondly, progressively fine-tuning using these samples.
arXiv Detail & Related papers (2023-08-14T06:04:50Z) - Can Large Language Models Infer Causation from Correlation? [104.96351414570239]
We test the pure causal inference skills of large language models (LLMs)
We formulate a novel task Corr2Cause, which takes a set of correlational statements and determines the causal relationship between the variables.
We show that these models achieve almost close to random performance on the task.
arXiv Detail & Related papers (2023-06-09T12:09:15Z) - Enhancing Deep Knowledge Tracing with Auxiliary Tasks [24.780533765606922]
We propose emphAT-DKT to improve the prediction performance of the original deep knowledge tracing model.
We conduct comprehensive experiments on three real-world educational datasets and compare the proposed approach to both deep sequential KT models and non-sequential models.
arXiv Detail & Related papers (2023-02-14T08:21:37Z) - simpleKT: A Simple But Tough-to-Beat Baseline for Knowledge Tracing [22.055683237994696]
We provide a strong but simple baseline method to deal with the KT task named textscsimpleKT.
Inspired by the Rasch model in psychometrics, we explicitly model question-specific variations to capture the individual differences among questions.
We use the ordinary dot-product attention function to extract the time-aware information embedded in the student learning interactions.
arXiv Detail & Related papers (2023-02-14T08:09:09Z) - Assessing the Impact of Sequence Length Learning on Classification Tasks for Transformer Encoder Models [0.030693357740321774]
Classification algorithms can be affected by the sequence length learning problem whenever observations from different classes have a different length distribution.
This problem causes models to use sequence length as a predictive feature instead of relying on important textual information.
Although most public datasets are not affected by this problem, privately owned corpora for fields such as medicine and insurance may carry this data bias.
arXiv Detail & Related papers (2022-12-16T10:46:20Z) - Transductive CLIP with Class-Conditional Contrastive Learning [68.51078382124331]
We propose Transductive CLIP, a novel framework for learning a classification network with noisy labels from scratch.
A class-conditional contrastive learning mechanism is proposed to mitigate the reliance on pseudo labels.
ensemble labels is adopted as a pseudo label updating strategy to stabilize the training of deep neural networks with noisy labels.
arXiv Detail & Related papers (2022-06-13T14:04:57Z) - Deep Learning Models for Knowledge Tracing: Review and Empirical
Evaluation [2.423547527175807]
We review and evaluate a body of deep learning knowledge tracing (DLKT) models with openly available and widely-used data sets.
The evaluated DLKT models have been reimplemented for assessing and replicability of previously reported results.
arXiv Detail & Related papers (2021-12-30T14:19:27Z) - Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z) - Learning Adaptive Embedding Considering Incremental Class [55.21855842960139]
Class-Incremental Learning (CIL) aims to train a reliable model with the streaming data, which emerges unknown classes sequentially.
Different from traditional closed set learning, CIL has two main challenges: 1) Novel class detection.
After the novel classes are detected, the model needs to be updated without re-training using entire previous data.
arXiv Detail & Related papers (2020-08-31T04:11:24Z) - The Devil is in Classification: A Simple Framework for Long-tail Object
Detection and Instance Segmentation [93.17367076148348]
We investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z) - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
retrieval-augmented generation models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z) - SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive
Connection [51.376723069962]
We present a method for accelerating and structuring self-attentions: Sparse Adaptive Connection.
In SAC, we regard the input sequence as a graph and attention operations are performed between linked nodes.
We show that SAC is competitive with state-of-the-art models while significantly reducing memory cost.
arXiv Detail & Related papers (2020-03-22T07:58:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.