No Length Left Behind: Enhancing Knowledge Tracing for Modeling
Sequences of Excessive or Insufficient Lengths
- URL: http://arxiv.org/abs/2308.03488v1
- Date: Mon, 7 Aug 2023 11:30:58 GMT
- Title: No Length Left Behind: Enhancing Knowledge Tracing for Modeling
Sequences of Excessive or Insufficient Lengths
- Authors: Moyu Zhang, Xinning Zhu, Chunhong Zhang, Feng Pan, Wenchen Qian, Hui
Zhao
- Abstract summary: Knowledge tracing aims to predict students' responses to practices based on their historical question-answering behaviors.
As sequences get longer, computational costs will increase exponentially.
We propose a model called Sequence-Flexible Knowledge Tracing (SFKT)
- Score: 3.2687390531088414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge tracing (KT) aims to predict students' responses to practices based
on their historical question-answering behaviors. However, most current KT
methods focus on improving overall AUC, leaving ample room for optimization in
modeling sequences of excessive or insufficient lengths. As sequences get
longer, computational costs will increase exponentially. Therefore, KT methods
usually truncate sequences to an acceptable length, which makes it difficult
for models on online service systems to capture complete historical practice
behaviors of students with too long sequences. Conversely, modeling students
with short practice sequences using most KT methods may result in overfitting
due to limited observation samples. To address the above limitations, we
propose a model called Sequence-Flexible Knowledge Tracing (SFKT).
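The fixed-window truncation that the abstract criticizes can be illustrated with a minimal sketch. The window size and the `(question, correct)` record format here are hypothetical choices for illustration, not details of SFKT, which is designed precisely to avoid this truncation:

```python
# Minimal sketch of the fixed-window truncation most KT methods apply
# before feeding a student's history to the model. MAX_LEN is a
# hypothetical hyperparameter, not a value from the paper.
MAX_LEN = 200

def truncate_history(interactions, max_len=MAX_LEN):
    """Keep only the most recent `max_len` (question, correct) pairs.

    Older interactions are discarded, so students with long sequences
    lose part of their practice history -- the limitation SFKT targets.
    """
    return interactions[-max_len:]

# 500 simulated answers; only the last 200 survive truncation, so the
# first 300 interactions are invisible to the model.
history = [(q, q % 2 == 0) for q in range(500)]
window = truncate_history(history)
```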
Related papers
- How Well Can a Long Sequence Model Model Long Sequences? Comparing Architectural Inductive Biases on Long-Context Abilities [0.6798775532273751]
Recent advances in system engineering have enabled the scaling up of models that are purported to support extended context lengths.

We show that while such claims may be sound theoretically, there remain large practical gaps that are empirically observed.
arXiv Detail & Related papers (2024-07-11T01:08:39Z) - CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling [52.404072802235234]
We introduce Chunked Instruction-aware State Eviction (CItruS), a modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states.
Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget.
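As a rough illustration of evicting cached states under a fixed memory budget based on attention scores, consider the toy sketch below. The function name, the scalar scores, and the budget are all invented for illustration; CItruS's actual chunked, instruction-aware eviction is more involved:

```python
def evict_states(states, scores, budget):
    """Keep the `budget` cached states with the highest attention scores.

    `states` is a list of cached hidden states and `scores` their
    (task-aware) attention preferences; the rest are evicted. This is a
    toy stand-in for score-based eviction, not the CItruS algorithm.
    """
    ranked = sorted(range(len(states)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:budget])  # preserve the original sequence order
    return [states[i] for i in keep]

cache = ["s0", "s1", "s2", "s3", "s4"]
attn = [0.1, 0.9, 0.3, 0.8, 0.2]
kept = evict_states(cache, attn, budget=3)  # -> ["s1", "s2", "s3"]
```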
arXiv Detail & Related papers (2024-06-17T18:34:58Z) - Long Range Propagation on Continuous-Time Dynamic Graphs [18.5534584418248]
Continuous-Time Graph Anti-Symmetric Network (CTAN) is designed for efficient propagation of information.
We show how CTAN's empirical performance on synthetic long-range benchmarks and real-world benchmarks is superior to other methods.
arXiv Detail & Related papers (2024-06-04T19:42:19Z) - Mitigating Catastrophic Forgetting in Task-Incremental Continual
Learning with Adaptive Classification Criterion [50.03041373044267]
We propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning.
Experiments show that CFL achieves state-of-the-art performance and is better at overcoming catastrophic forgetting than the classification baselines.
arXiv Detail & Related papers (2023-05-20T19:22:40Z) - HiPool: Modeling Long Documents Using Graph Neural Networks [24.91040673099863]
Long sequences in Natural Language Processing (NLP) are a challenging problem.
Recent pretraining language models achieve satisfying performances in many NLP tasks.
We propose a new challenging benchmark, totaling six datasets with up to 53k samples and an average length of 4,034 tokens.
arXiv Detail & Related papers (2023-05-05T06:58:24Z) - Effective and Efficient Training for Sequential Recommendation using
Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that models enhanced with our method can achieve performance exceeding or very close to that of state-of-the-art BERT4Rec.
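The recency-based sampling idea can be sketched as drawing training targets from a sequence with probabilities that decay for older positions. The exponential decay form and its rate below are assumptions made for illustration, not the paper's exact training objective:

```python
import random

def recency_sample(sequence, k, decay=0.9, rng=None):
    """Sample k training targets, favouring recent positions.

    Position i (0 = oldest) gets weight decay ** (n - 1 - i), so the most
    recent item is the most likely target. Illustrative only; the decay
    schedule is a hypothetical choice, not the paper's objective.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    n = len(sequence)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return rng.choices(sequence, weights=weights, k=k)

seq = list(range(20))          # positions double as item ids here
targets = recency_sample(seq, k=5)
```

With 20 positions and decay 0.9, the bulk of the sampled targets come from the recent end of the sequence, which is the intuition behind the objective.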
arXiv Detail & Related papers (2022-07-06T13:06:31Z) - FiLM: Frequency improved Legendre Memory Model for Long-term Time Series
Forecasting [22.821606402558707]
We develop a Frequency improved Legendre Memory model (FiLM) to handle the dilemma between accurately preserving historical information and reducing the impact of noisy signals in the past.
Our empirical studies show that the proposed FiLM improves the accuracy of state-of-the-art models by a significant margin.
arXiv Detail & Related papers (2022-05-18T12:37:54Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to infer from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.