Abstract: This paper presents an efficient model for predicting the correctness
of a student's answer from their past learning activities. I use both a
transformer encoder and an RNN to handle time-series input. The novel point of
the model is that it uses only the last input as the query in the transformer
encoder, instead of the whole sequence, which reduces the QK matrix
multiplication in the transformer encoder from O(L^2) to O(L) time complexity,
where L is the sequence length. This allows the model to take longer input
sequences. Using this model, I achieved 1st place in the 'Riiid! Answer
Correctness Prediction' competition hosted on Kaggle.
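
The last-query idea can be illustrated with a minimal sketch. It assumes
single-head scaled dot-product attention with the learned Q/K/V projections
omitted. The function names and toy values are hypothetical and are not taken
from the competition code.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector: L dot products."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def full_attention(queries, keys, values):
    """Standard encoder attention: every position is a query, L * L scores."""
    return [attend(q, keys, values) for q in queries]

def last_query_attention(queries, keys, values):
    """Last-query variant: only the final position is a query, L scores."""
    return attend(queries[-1], keys, values)

# Toy sequence of L = 4 steps with 2-dimensional vectors (made-up numbers).
seq = [[0.1, 0.3], [0.5, -0.2], [-0.4, 0.9], [0.7, 0.2]]
last = last_query_attention(seq, seq, seq)
full = full_attention(seq, seq, seq)
```

In this sketch, `last` equals the final row of `full`, but it is obtained from
L score computations rather than L^2, which is where the saving comes from.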