LMCD: Language Models are Zeroshot Cognitive Diagnosis Learners
- URL: http://arxiv.org/abs/2505.21239v1
- Date: Tue, 27 May 2025 14:19:35 GMT
- Title: LMCD: Language Models are Zeroshot Cognitive Diagnosis Learners
- Authors: Yu He, Zihan Yao, Chentao Song, Tianyu Qi, Jun Liu, Ming Li, Qing Huang,
- Abstract summary: Cognitive Diagnosis (CD) has become a critical task in AI-empowered education.<n>Recent NLP-based approaches leveraging pre-trained language models (PLMs) have shown promise.<n>We propose Language Models as Zeroshot Cognitive Diagnosis learners (LMCD)<n>Experiments on two real-world datasets demonstrate that LMCD significantly outperforms state-of-the-art methods in both exercise-cold and domain-cold settings.
- Score: 12.80627587335383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cognitive Diagnosis (CD) has become a critical task in AI-empowered education, supporting personalized learning by accurately assessing students' cognitive states. However, traditional CD models often struggle in cold-start scenarios due to the lack of student-exercise interaction data. Recent NLP-based approaches leveraging pre-trained language models (PLMs) have shown promise by utilizing textual features but fail to fully bridge the gap between semantic understanding and cognitive profiling. In this work, we propose Language Models as Zeroshot Cognitive Diagnosis Learners (LMCD), a novel framework designed to handle cold-start challenges by harnessing large language models (LLMs). LMCD operates via two primary phases: (1) Knowledge Diffusion, where LLMs generate enriched contents of exercises and knowledge concepts (KCs), establishing stronger semantic links; and (2) Semantic-Cognitive Fusion, where LLMs employ causal attention mechanisms to integrate textual information and student cognitive states, creating comprehensive profiles for both students and exercises. These representations are efficiently trained with off-the-shelf CD models. Experiments on two real-world datasets demonstrate that LMCD significantly outperforms state-of-the-art methods in both exercise-cold and domain-cold settings. The code is publicly available at https://github.com/TAL-auroraX/LMCD
Related papers
- LLM4CD: Leveraging Large Language Models for Open-World Knowledge Augmented Cognitive Diagnosis [56.50378080174923]
We propose LLM4CD, which Leverages Large Language Models for Open-World Knowledge Augmented Cognitive Diagnosis.<n>Our method utilizes the open-world knowledge of LLMs to construct cognitively expressive textual representations, which are encoded to introduce rich semantic information into the CD task.<n>This approach substitutes traditional ID embeddings with semantic representations, enabling the model to accommodate new students and exercises with open-world knowledge and address the cold-start problem.
arXiv Detail & Related papers (2025-05-14T14:48:00Z) - Knowledge is Power: Harnessing Large Language Models for Enhanced Cognitive Diagnosis [12.936153018855649]
Cognitive Diagnosis Models (CDMs) are designed to assess students' cognitive states by analyzing their performance across a series of exercises.<n>Existing CDMs often struggle with diagnosing infrequent students and exercises due to a lack of rich prior knowledge.<n>With the advancement in large language models (LLMs), their integration into cognitive diagnosis presents a promising opportunity.
arXiv Detail & Related papers (2025-02-08T13:02:45Z) - Language Representation Favored Zero-Shot Cross-Domain Cognitive Diagnosis [15.006031265076006]
This paper proposes the language representation favored zero-shot cross-domain cognitive diagnosis (LRCD)<n>LRCD first analyzes the behavior patterns of students, exercises and concepts in different domains, and then describes the profiles of students, exercises and concepts using textual descriptions.<n>To address the discrepancy between the language space and the cognitive diagnosis space, we propose language-cognitive mappers in LRCD to learn the mapping from the former to the latter.
arXiv Detail & Related papers (2025-01-18T03:35:44Z) - VladVA: Discriminative Fine-tuning of LVLMs [67.14293827774827]
Contrastively-trained Vision-Language Models (VLMs) like CLIP have become the de facto approach for discriminative vision-language representation learning.<n>We propose to combine "the best of both worlds": a new training approach for discriminative fine-tuning of LVLMs.
arXiv Detail & Related papers (2024-12-05T17:54:27Z) - A Dual-Fusion Cognitive Diagnosis Framework for Open Student Learning Environments [10.066184572184627]
This paper proposes a dual-fusion cognitive diagnosis framework (DFCD) to address the challenge of aligning two different modalities.
Experiments show that DFCD achieves superior performance by integrating different modalities and strong adaptability in open student learning environments.
arXiv Detail & Related papers (2024-10-19T10:12:02Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-ization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - Interactive Continual Learning: Fast and Slow Thinking [19.253164551254734]
This paper presents a novel Interactive Continual Learning framework, enabled by collaborative interactions among models of various sizes.
To improve memory retrieval in System1, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution.
Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods.
arXiv Detail & Related papers (2024-03-05T03:37:28Z) - FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - Prompting Language-Informed Distribution for Compositional Zero-Shot Learning [73.49852821602057]
Compositional zero-shot learning (CZSL) task aims to recognize unseen compositional visual concepts.
We propose a model by prompting the language-informed distribution, aka., PLID, for the task.
Experimental results on MIT-States, UT-Zappos, and C-GQA datasets show the superior performance of the PLID to the prior arts.
arXiv Detail & Related papers (2023-05-23T18:00:22Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.