CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer
- URL: http://arxiv.org/abs/2406.10296v2
- Date: Tue, 18 Jun 2024 00:53:50 GMT
- Title: CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer
- Authors: Heeseok Jung, Jaesang Yoo, Yohaan Yoon, Yeonju Jang,
- Abstract summary: We propose cold-start mitigation in knowledge tracing by aligning a generative language model as a students' knowledge tracer (CLST).
We framed the KT task as a natural language processing task, wherein problem-solving data are expressed in natural language.
We evaluated the performance of the CLST in situations of data scarcity using various baseline models for comparison.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge tracing (KT), wherein students' problem-solving histories are used to estimate their current levels of knowledge, has attracted significant interest from researchers. However, most existing KT models were developed within an ID-based paradigm, which exhibits limitations in cold-start performance. These limitations can be mitigated by leveraging the vast quantities of external knowledge possessed by generative large language models (LLMs). In this study, we propose cold-start mitigation in knowledge tracing by aligning a generative language model as a students' knowledge tracer (CLST), a framework that uses a generative LLM as the knowledge tracer. After collecting data from math, social studies, and science subjects, we framed the KT task as a natural language processing task, wherein problem-solving data are expressed in natural language, and fine-tuned the generative LLM on the formatted KT dataset. We then evaluated the performance of CLST in data-scarce settings against various baseline models. The results indicate that CLST significantly enhanced performance with datasets of fewer than 100 students in terms of prediction, reliability, and cross-domain generalization.
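The framing step described in the abstract, where ID-based interaction logs are rewritten as natural-language text before fine-tuning, can be pictured with a short sketch. The example below is purely illustrative: the field names (question_text, is_correct), the prompt template, and the yes/no completion format are assumptions made for this sketch, not the authors' published data format.

```python
# Hypothetical sketch of the kind of formatting the abstract describes:
# turning a student's problem-solving history into a prompt/completion pair
# for fine-tuning a generative LLM. Field names and the prompt template are
# assumptions for illustration, not the paper's actual format.

from typing import Dict, List


def format_kt_example(
    history: List[Dict], target_question: str, target_correct: bool
) -> Dict[str, str]:
    """Convert an interaction history plus a labeled target question into a
    natural-language prompt/completion pair."""
    lines = []
    for step in history:
        outcome = "correctly" if step["is_correct"] else "incorrectly"
        lines.append(f'- Answered "{step["question_text"]}" {outcome}.')
    prompt = (
        "A student has the following problem-solving history:\n"
        + "\n".join(lines)
        + f'\nWill the student answer "{target_question}" correctly? Answer yes or no.'
    )
    completion = "yes" if target_correct else "no"
    return {"prompt": prompt, "completion": completion}


# Toy usage with a made-up math history
example = format_kt_example(
    history=[
        {"question_text": "Add the fractions 1/2 + 1/3", "is_correct": True},
        {"question_text": "Solve 2x + 3 = 11 for x", "is_correct": False},
    ],
    target_question="Solve 3x - 4 = 5 for x",
    target_correct=True,
)
print(example["prompt"])
print(example["completion"])
```

In the setting the abstract describes, many such pairs drawn from math, social studies, and science logs would then be used to fine-tune the generative LLM.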
Related papers
- SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model [64.92472567841105]
Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question.
The authors propose a Structure-aware Inductive Knowledge Tracing model with a large language model (dubbed SINKT).
SINKT predicts the student's response to the target question by interacting with the student's knowledge state and the question representation.
arXiv Detail & Related papers (2024-07-01T12:44:52Z) - Language Model Can Do Knowledge Tracing: Simple but Effective Method to Integrate Language Model and Knowledge Tracing Task [3.1459398432526267]
This paper proposes Language model-based Knowledge Tracing (LKT), a novel framework that integrates pre-trained language models (PLMs) with Knowledge Tracing methods.
LKT effectively incorporates textual information and significantly outperforms previous KT models on large benchmark datasets.
arXiv Detail & Related papers (2024-06-05T03:26:59Z) - CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models [0.18416014644193068]
This paper introduces the Contextual Language model for Accurate Imputation Method (CLAIM).
Unlike traditional imputation methods, CLAIM utilizes contextually relevant natural language descriptors to fill missing values.
Our evaluations across diverse datasets and missingness patterns reveal CLAIM's superior performance over existing imputation techniques.
arXiv Detail & Related papers (2024-05-28T00:08:29Z) - LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements [59.71218039095155]
The task of reading comprehension (RC) provides a primary means to assess language models' natural language understanding (NLU) capabilities.
If the context aligns with the models' internal knowledge, it is hard to discern whether the models' answers stem from context comprehension or from internal information.
To address this issue, we suggest using RC on imaginary data, based on fictitious facts and entities.
arXiv Detail & Related papers (2024-04-09T13:08:56Z) - Evolving Knowledge Distillation with Large Language Models and Active Learning [46.85430680828938]
Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks.
Previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data.
We propose EvoKD: Evolving Knowledge Distillation, which leverages the concept of active learning to interactively enhance the process of data generation using large language models.
arXiv Detail & Related papers (2024-03-11T03:55:24Z) - Discovery of the Hidden World with Large Language Models [100.38157787218044]
We introduce COAT: Causal representatiOn AssistanT.
COAT incorporates LLMs as a factor proposer that extracts the potential causal factors from unstructured data.
LLMs can also be instructed to provide additional information used to collect data values.
arXiv Detail & Related papers (2024-02-06T12:18:54Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emergent in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs in that it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - From Supervised to Generative: A Novel Paradigm for Tabular Deep Learning with Large Language Models [18.219485459836285]
Generative Tabular Learning (GTL) is a novel framework that integrates the advanced functionalities of large language models (LLMs) into tabular deep learning.
Our empirical study spans 384 public datasets, rigorously analyzing GTL's scaling behaviors.
The GTL-LLaMA-2 model demonstrates superior zero-shot and in-context learning capabilities across numerous classification and regression tasks.
arXiv Detail & Related papers (2023-10-11T09:37:38Z) - Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning [23.971206470486468]
We present a framework that converts relevant historical facts into prompts and generates ranked predictions using token probabilities.
Surprisingly, we observe that LLMs, out-of-the-box, perform on par with state-of-the-art TKG models.
We also discover that using numerical indices instead of entity/relation names does not significantly affect the performance.
arXiv Detail & Related papers (2023-05-17T23:50:28Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)