RouterKT: Mixture-of-Experts for Knowledge Tracing
- URL: http://arxiv.org/abs/2504.08989v2
- Date: Tue, 22 Apr 2025 06:40:45 GMT
- Title: RouterKT: Mixture-of-Experts for Knowledge Tracing
- Authors: Han Liao, Shuaishuai Zu
- Abstract summary: Knowledge Tracing (KT) is a fundamental task in Intelligent Tutoring Systems (ITS). We propose RouterKT, a novel Mixture-of-Experts architecture designed to capture heterogeneous learning patterns. We show that RouterKT exhibits significant flexibility and improves the performance of various KT backbone models.
- Score: 1.983472984641239
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Tracing (KT) is a fundamental task in Intelligent Tutoring Systems (ITS), which aims to model the dynamic knowledge states of students based on their interaction histories. However, existing KT models often rely on a global forgetting decay mechanism for capturing learning patterns, assuming that students' performance is predominantly influenced by their most recent interactions. Such approaches fail to account for the diverse and complex learning patterns arising from individual differences and varying learning stages. To address this limitation, we propose RouterKT, a novel Mixture-of-Experts (MoE) architecture designed to capture heterogeneous learning patterns by enabling experts to specialize in different patterns without any handcrafted learning pattern bias such as forgetting decay. Specifically, RouterKT introduces a person-wise routing mechanism to effectively model individual-specific learning behaviors and employs multi-heads as experts to enhance the modeling of complex and diverse patterns. Comprehensive experiments on ten benchmark datasets demonstrate that RouterKT exhibits significant flexibility and improves the performance of various KT backbone models, with a maximum average AUC improvement of 3.29% across different backbones and datasets, outperforming other state-of-the-art models. Moreover, RouterKT demonstrates consistently superior inference efficiency compared to existing approaches based on handcrafted learning pattern bias, highlighting its usability for real-world educational applications. The source code is available at https://github.com/ringotc/RouterKT.git.
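The abstract names two architectural ingredients, a person-wise routing mechanism and attention heads acting as experts. Below is a minimal, hypothetical PyTorch sketch of how such a routed multi-head attention layer might look, written from the abstract alone; the module name, dimensions, mean-pooling of the interaction history, and the top-k gating are all assumptions, not the authors' implementation (the linked repository has the actual code).

```python
# Hypothetical sketch of person-wise routing over attention heads
# ("multi-heads as experts"), based only on the abstract above.
# Names, dimensions, and the top-k gating are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PersonWiseRoutedAttention(nn.Module):
    """Causal multi-head self-attention whose heads are mixed by a per-student router."""

    def __init__(self, d_model: int = 256, n_heads: int = 8, top_k: int = 4):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head, self.top_k = n_heads, d_model // n_heads, top_k
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Router: pooled interaction history -> one gate logit per head (expert).
        self.router = nn.Linear(d_model, n_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) encoded interaction history of each student
        b, t, d = x.shape

        def split_heads(z):
            return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = map(split_heads, self.qkv(x).chunk(3, dim=-1))

        # Causal attention: a step only attends to past interactions.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        scores = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        heads = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1) @ v

        # Person-wise routing: one gate vector per student, shared across time,
        # so different students mix the heads (experts) differently.
        gate_logits = self.router(x.mean(dim=1))                   # (b, n_heads)
        topk_val, topk_idx = gate_logits.topk(self.top_k, dim=-1)
        gates = torch.zeros_like(gate_logits).scatter(
            -1, topk_idx, F.softmax(topk_val, dim=-1))             # sparse mixture
        heads = heads * gates[:, :, None, None]                    # (b, n_heads, t, d_head)

        return self.out(heads.transpose(1, 2).reshape(b, t, d))


# Usage: one forward pass over a batch of dummy interaction embeddings.
x = torch.randn(4, 50, 256)
print(PersonWiseRoutedAttention()(x).shape)  # torch.Size([4, 50, 256])
```

In this reading, the only thing that personalizes the layer is the learned gate over heads, which is one way the abstract's claim of capturing heterogeneous learning patterns without a handcrafted forgetting-decay bias could be realized.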
Related papers
- TabKAN: Advancing Tabular Data Analysis using Kolmogorov-Arnold Network [11.664880068737084]
This paper introduces TabKAN, a novel framework that advances tabular data modeling using Kolmogorov-Arnold Networks (KANs). KANs leverage learnable activation functions on edges, enhancing both interpretability and training efficiency. Through extensive benchmarking on diverse public datasets, TabKAN demonstrates superior performance in supervised learning while significantly outperforming classical and Transformer-based models in transfer learning scenarios.
arXiv Detail & Related papers (2025-04-09T03:46:10Z) - AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences. Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation. We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT), which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z) - Robust Asymmetric Heterogeneous Federated Learning with Corrupted Clients [60.22876915395139]
This paper studies a challenging robust federated learning task with model heterogeneous and data corrupted clients. Data corruption is unavoidable due to factors such as random noise, compression artifacts, or environmental conditions in real-world deployment. We propose a novel Robust Asymmetric Heterogeneous Federated Learning framework to address these issues.
arXiv Detail & Related papers (2025-03-12T09:52:04Z) - Shortcut Learning Susceptibility in Vision Classifiers [3.004632712148892]
Shortcut learning occurs when machine learning models exploit spurious correlations in data instead of capturing meaningful features. This phenomenon is prevalent across various machine learning applications, including vision, natural language processing, and speech recognition. We systematically evaluate vision classifier architectures by introducing deliberate shortcuts into the dataset that are positionally correlated with class labels.
arXiv Detail & Related papers (2025-02-13T10:25:52Z) - A Question-centric Multi-experts Contrastive Learning Framework for Improving the Accuracy and Interpretability of Deep Sequential Knowledge Tracing Models [26.294808618068146]
Knowledge tracing plays a crucial role in predicting students' future performance.
Deep neural networks (DNNs) have shown great potential in solving the KT problem.
However, there still exist some important challenges when applying deep learning techniques to model the KT process.
arXiv Detail & Related papers (2024-03-12T05:15:42Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z) - Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z) - DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage the substitute model to learn better and faster from the target model.
We introduce a task-driven, graph-based structure information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z) - LANA: Towards Personalized Deep Knowledge Tracing Through Distinguishable Interactive Sequences [21.67751919579854]
We propose Leveled Attentive KNowledge TrAcing (LANA) to predict students' responses to future questions.
It uses a novel student-related features extractor (SRFE) to distill students' unique inherent properties from their respective interactive sequences.
With a pivot module that reconstructs the decoder for individual students and leveled learning that specializes encoders for groups of students, personalized DKT is achieved.
arXiv Detail & Related papers (2021-04-21T02:57:42Z) - Context-Aware Attentive Knowledge Tracing [21.397976659857793]
We propose attentive knowledge tracing, which couples flexible attention-based neural network models with a series of novel, interpretable model components.
AKT uses a novel monotonic attention mechanism that relates a learner's future responses to assessment questions to their past responses; a hedged sketch of this mechanism appears after this list.
We show that AKT outperforms existing KT methods (by up to 6% in AUC in some cases) on predicting future learner responses.
arXiv Detail & Related papers (2020-07-24T02:45:43Z)
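The AKT entry above is a concrete example of the handcrafted forgetting-decay bias that RouterKT argues against: attention paid to a past interaction is scaled down exponentially with its distance from the current question. The sketch below is a hypothetical single-head version of that monotonic attention; the real AKT uses a context-aware distance rather than the plain positional distance used here, and all names and dimensions are assumptions.

```python
# Hypothetical single-head sketch of AKT-style monotonic attention:
# weights on older interactions decay as exp(-theta * distance).
# The actual AKT uses a context-aware distance; plain positional
# distance is used here for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MonotonicAttentionHead(nn.Module):
    def __init__(self, d_model: int = 256, d_head: int = 64):
        super().__init__()
        self.q = nn.Linear(d_model, d_head)
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        self.theta = nn.Parameter(torch.zeros(1))  # softplus keeps the decay rate positive

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) embeddings of past question-response pairs
        b, t, _ = x.shape
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5        # (b, t, t)

        # Exponential decay in the distance between query and key positions.
        pos = torch.arange(t, device=x.device)
        dist = (pos[:, None] - pos[None, :]).clamp(min=0).float()   # (t, t)
        decay = torch.exp(-F.softplus(self.theta) * dist)

        future = pos[None, :] > pos[:, None]                        # mask future positions
        weights = F.softmax(scores.masked_fill(future, float("-inf")), dim=-1) * decay
        weights = weights / weights.sum(dim=-1, keepdim=True).clamp(min=1e-8)
        return weights @ v                                           # (b, t, d_head)


out = MonotonicAttentionHead()(torch.randn(2, 30, 256))
print(out.shape)  # torch.Size([2, 30, 64])
```

Multiplying the attention weights by the decay and renormalizing is equivalent to adding a distance penalty to the attention logits, so recency is baked into the layer itself; RouterKT's person-wise routing instead leaves the heads unbiased and learns a student-specific mixture over them.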
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.