Related papers: Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems

URL: http://arxiv.org/abs/2602.17542v1
Date: Thu, 19 Feb 2026 16:58:34 GMT
Title: Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
Authors: Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan
Abstract summary: We propose an automated framework to label KC-level correctness directly from student-written code. We evaluate the resulting KC-level correctness labels in terms of learning curve fit and predictive performance.
Score: 0.4316506818580031
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fine-grained skill representations, commonly referred to as knowledge components (KCs), are fundamental to many approaches in student modeling and learning analytics. However, KC-level correctness labels are rarely available in real-world datasets, especially for open-ended programming tasks where solutions typically involve multiple KCs simultaneously. Simply propagating problem-level correctness to all associated KCs obscures partial mastery and often leads to poorly fitted learning curves. To address this challenge, we propose an automated framework that leverages large language models (LLMs) to label KC-level correctness directly from student-written code. Our method assesses whether each KC is correctly applied and further introduces a temporal context-aware Code-KC mapping mechanism to better align KCs with individual student code. We evaluate the resulting KC-level correctness labels in terms of learning curve fit and predictive performance using the power law of practice and the Additive Factors Model. Experimental results show that our framework leads to learning curves that are more consistent with cognitive theory and improves predictive performance, compared to baselines. Human evaluation further demonstrates substantial agreement between LLM and expert annotations.
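The two evaluation tools named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_power_law` and `afm_probability`, the parameter values, and the sample error rates are all hypothetical. The first function fits the power law of practice, E(t) = a * t^(-b), to per-opportunity error rates for one KC by least squares in log-log space; the second evaluates the standard Additive Factors Model form, logit(p) = theta + sum over KCs of (beta_k + gamma_k * T_k), where T_k is the number of prior practice opportunities on KC k.

```python
import math

def fit_power_law(error_rates):
    """Fit E(t) = a * t**(-b) by ordinary least squares in log-log space.

    error_rates[i] is the observed error rate for one KC at practice
    opportunity i + 1. Returns (a, b, r_squared).
    """
    xs = [math.log(t) for t in range(1, len(error_rates) + 1)]
    # Clamp to avoid log(0) when an opportunity has no observed errors.
    ys = [math.log(max(e, 1e-6)) for e in error_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return math.exp(intercept), -slope, r_squared

def afm_probability(theta, kc_params, opportunities):
    """P(correct) under the Additive Factors Model for one student step.

    theta: student proficiency.
    kc_params: {kc: (beta, gamma)} for the KCs involved in the step,
               where beta is KC easiness and gamma is the learning rate.
    opportunities: {kc: number of prior practice opportunities}.
    """
    logit = theta + sum(
        beta + gamma * opportunities[kc]
        for kc, (beta, gamma) in kc_params.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))

if __name__ == "__main__":
    # Hypothetical error rates that follow a power law exactly.
    rates = [0.8 * t ** -0.5 for t in range(1, 7)]
    a, b, r2 = fit_power_law(rates)
    print(f"a={a:.3f} b={b:.3f} R^2={r2:.3f}")

    # Hypothetical AFM parameters for a step involving two KCs.
    params = {"for_loop": (-0.5, 0.2), "if_else": (0.3, 0.1)}
    print(afm_probability(0.0, params, {"for_loop": 4, "if_else": 2}))
```

In this framing, a well-labeled KC would be expected to show a positive decay exponent `b` and a high R^2, whereas propagating problem-level correctness to every associated KC tends to flatten the curve and worsen the fit.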

Related papers

SkillGen: Learning Domain Skills for In-Context Sequential Decision Making [24.41349550520032]
We introduce SkillGen, a skill-based ICL framework for structured sequential reasoning. We show that SkillGen achieves consistent gains, improving progress rate by 5.9%-16.5% on average across models.
arXiv Detail & Related papers (2025-11-18T17:09:21Z)
Automated Knowledge Component Generation for Interpretable Knowledge Tracing in Coding Problems [2.801976382946474]
Knowledge components (KCs) mapped to problems help model student learning, tracking their mastery levels on fine-grained skills. We present an automated, LLM-based pipeline for KC generation and tagging for open-ended programming problems. We find that KCGen-KT outperforms existing KT methods and human-written KCs on future student response prediction.
arXiv Detail & Related papers (2025-02-25T20:40:51Z)
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing [59.480951050911436]
We present KCQRL, a framework for automated knowledge concept annotation and question representation learning. We demonstrate the effectiveness of KCQRL across 15 KT algorithms on two large real-world Math learning datasets.
arXiv Detail & Related papers (2024-10-02T16:37:19Z)
Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions [49.36683223327633]
Large Language Models (LLMs) encapsulate extensive world knowledge and exhibit powerful context modeling capabilities. We propose a novel framework that synergizes the strengths of LLMs with robust knowledge representation to enable effective and efficient KGC. We achieve a 47% relative improvement over previous methods based on non-fine-tuned LLMs and, to our knowledge, are the first to achieve classification performance comparable to fine-tuned LLMs.
arXiv Detail & Related papers (2024-08-13T10:15:55Z)
Overcoming Pitfalls in Graph Contrastive Learning Evaluation: Toward Comprehensive Benchmarks [60.82579717007963]
We introduce an enhanced evaluation framework designed to more accurately gauge the effectiveness, consistency, and overall capability of Graph Contrastive Learning (GCL) methods.
arXiv Detail & Related papers (2024-02-24T01:47:56Z)
Density Distribution-based Learning Framework for Addressing Online Continual Learning Challenges [4.715630709185073]
We introduce a density distribution-based learning framework for online Continual Learning. Our framework achieves superior average accuracy and time-space efficiency. Our method outperforms popular CL approaches by a significant margin.
arXiv Detail & Related papers (2023-11-22T09:21:28Z)
Complementary Labels Learning with Augmented Classes [22.460256396941528]
Complementary Labels Learning (CLL) arises in many real-world tasks such as private questions classification and online learning. We propose a novel problem setting called Complementary Labels Learning with Augmented Classes (CLLAC). By using unlabeled data, we propose an unbiased estimator of classification risk for CLLAC, which is guaranteed to be provably consistent.
arXiv Detail & Related papers (2022-11-19T13:55:27Z)
Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess learnability. CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means. We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes.
arXiv Detail & Related papers (2022-06-02T19:05:13Z)
Great Truths are Always Simple: A Rather Simple Knowledge Encoder for Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models [89.98762327725112]
Commonsense reasoning in natural language is a desired ability of artificial intelligence systems. For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models (PTMs) with a knowledge-aware graph neural network (GNN) encoder. Despite their effectiveness, these approaches are built on heavy architectures and can't clearly explain how external knowledge resources improve the reasoning capacity of PTMs.
arXiv Detail & Related papers (2022-05-04T01:27:36Z)
ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications. We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN). We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.